
Lecture Notes on Data Engineering
and Communications Technologies 107

Aboul Ella Hassanien · Yaoqun Xu ·
Zhijie Zhao · Sabah Mohammed ·
Zhipeng Fan
Editors

Business Intelligence and
Information Technology

Proceedings of the International
Conference on Business Intelligence
and Information Technology, BIIT 2021
Lecture Notes on Data Engineering
and Communications Technologies

Volume 107

Series Editor
Fatos Xhafa, Technical University of Catalonia, Barcelona, Spain
The aim of the book series is to present cutting-edge engineering approaches to data
technologies and communications. It will publish the latest advances on the
engineering task of building and deploying distributed, scalable and reliable data
infrastructures and communication systems.
The series will have a prominent applied focus on data technologies and
communications, with the aim of promoting the bridging from fundamental research
on data science and networking to data engineering and communications that lead to
industry products, business knowledge and standardisation.
Indexed by SCOPUS, INSPEC, EI Compendex.
All books published in the series are submitted for consideration in Web of Science.

More information about this series at https://link.springer.com/bookseries/15362


Aboul Ella Hassanien · Yaoqun Xu ·
Zhijie Zhao · Sabah Mohammed ·
Zhipeng Fan
Editors

Business Intelligence
and Information Technology
Proceedings of the International Conference
on Business Intelligence and Information
Technology BIIT 2021

Editors

Aboul Ella Hassanien
Information Technology Department
Cairo University
Giza, Egypt

Yaoqun Xu
School of Computer and Information Engineering
Harbin University of Commerce
Harbin, Heilongjiang, China

Zhijie Zhao
School of Computer and Information Engineering
Harbin University of Commerce
Harbin, Heilongjiang, China

Sabah Mohammed
Department of Computer Science
Lakehead University
Thunder Bay, ON, Canada

Zhipeng Fan
School of Computer and Information Engineering
Harbin University of Commerce
Harbin, Heilongjiang, China

ISSN 2367-4512        ISSN 2367-4520 (electronic)
Lecture Notes on Data Engineering and Communications Technologies
ISBN 978-3-030-92631-1 ISBN 978-3-030-92632-8 (eBook)
https://doi.org/10.1007/978-3-030-92632-8
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

This volume constitutes the refereed proceedings of the 2021 International
Conference on Business Intelligence and Information Technology (BIIT 2021), held
in Harbin, China, during December 18–20, 2021. BIIT 2021 was organized by the
School of Computer and Information Engineering, Harbin University of
Commerce, and supported by the Scientific Research Group in Egypt (SRGE), Egypt.
BIIT is organized to provide an international forum that brings together those
actively involved in the areas of interest, to report on up-to-the-minute innovations
and developments, to summarize the state of the art, and to exchange ideas and advances
in all aspects of business intelligence and information technologies. The papers
cover current research in electronic commerce technology and application, business
intelligence and decision making, digital economy, accounting informatization,
intelligent information processing, image processing and multimedia technology,
signal detection and processing, communication engineering and technology,
information security, automatic control techniques, data mining, software
development and design, blockchain technology, big data technology, and artificial
intelligence technology. In addition to the regular submissions to the main
conference, 12 special sessions were organized.
The conference proceedings comprise seven main tracks:
Part (1) Digital economy and e-commerce
Part (2) Data mining and association rules
Part (3) Network security and blockchain
Part (4) Image analysis and processing
Part (5) Machine learning and deep learning and their applications
Part (6) Business intelligence and communications
Part (7) Information technology
All submissions were reviewed by at least two reviewers, with no distinction
between papers submitted across the conference tracks. We are convinced
that the quality and diversity of the topics covered will satisfy both the attendees
and the readers of this conference proceedings. We express our sincere thanks to the
plenary speakers, workshop/session chairs, and International Program Committee


members for helping us to formulate a rich technical program. We want to extend
our sincere appreciation for the outstanding work contributed over many months by
the Organizing Committee: local organization chair and publicity chair. We also
wish to express our appreciation to the SRGE members for their assistance. We
want to emphasize that the success of BIIT 2021 would not have been possible
without the support of many committed volunteers who generously contributed
their time, expertise, and resources toward making the conference an unqualified
success. Finally, thanks to the Springer team for their support in all stages of the
production of the proceedings. We hope that you will enjoy the conference
program.
Organization

General Chair

Yaoqun Xu Harbin University of Commerce, China

General Co-chairs
Sabah Mohammed Lakehead University, Canada
Ping Han Harbin University of Commerce, China
Wei Wang Harbin University of Commerce, China
Shizhen Bai Harbin University of Commerce, China
Fengge Yao Harbin University of Commerce, China
Zeguo Qiu Harbin University of Commerce, China

Technical Program Committee Chair


Zhijie Zhao Harbin University of Commerce, China

Technical Program Committee Co-chair


Hongwei Mo Harbin Engineering University, China
Weipeng Jing Northeast Forestry University, China

Publication Chair
Aboul Ella Hassanien SRGE, Egypt

Publication Co-chair
Zhipeng Fan Harbin University of Commerce, China


Organizing Committee Chairs


Xiaodong Su Harbin University of Commerce, China
Haitao Xin Harbin University of Commerce, China

Technical Program Committee Members


Charalampos Z. Patrikakis National Technical University of Athens, Greece
Byungjoo Park Hannam University, Korea
Zhiwen Yu Kyoto University, Japan
Ching-Hsien Hsu Asia University, Taiwan
Naveen K. Chilamkurti La Trobe University, Australia
Aboul Ella Hassanien College of Business Administration, Kuwait
Ajith Abraham Norwegian University of Science and
Technology, India
Alicja Wieczorkowska PJIIT, Poland
Antonio Coronato ICAR-CNR, Italy
Arbib Michael University of Southern California, USA
Ashesh Mahidadia University of New South Wales, Australia
Brian King Purdue University, USA
Chantana Chantrapornchai Silpakorn University, Thailand
Chengcui Zhang University of Alabama at Birmingham, USA
Chris van Aart Sogeti B.V, Netherlands
D. Manivannan University of Kentucky, USA
Duman Hakan T Research and Technology, Germany
Gerald Schaefer Aston University, UK
Gilcheol Park Hannam University, Korea
Giovanni Cagalaban Hannam University, Korea
Hakan Duman British Telecom, UK
Han-Chieh Chao National Ilan University, Taiwan
Hideyuki Suzuki The University of Tokyo, Japan
Ismail Khalil Ibrahim Institute of Telecooperation, Austria
James B. D. Joshi University of Pittsburgh, USA
Jason Sigfred Surigao Sur Polytechnic State College,
Philippines
Javier Garcia-Villalba Complutense University of Madrid, Spain
Jemal H. Abawajy Deakin University, Australia
Ali Saberi Distinguished Researcher in Iranian Researchers
Network, Iran
Noskov Mikhail Fedorovic Siberian Federal University, Russia
Kamal Karkonasasi Universiti Sains Malaysia, Malaysia
Harald Kitzmann Kazakh University, Kazakhstan
Andrii Bieliatynskyi National Aviation University, Ukraine
Anelia Kurteva University of Innsbruck, Austria
Uduak Augustine Umoh University of Uyo, Nigeria

Muhammad Hashim National Textile University, Pakistan


Globa Larysa National Technical University of Ukraine,
Ukraine
Alex Mathew Bethany College, USA
Ievleva Olga T. Southern Federal University, Russia
Tan Xiao Jian Tunku Abdul Rahman University College
(TARUC), Malaysia
Antonio Lucadamo University of Sannio, Italy
Anna Crisci University of Naples Federico II, Italy
Mahfujur Rahman Daffodil International University, Bangladesh
Nassira Achich University of Sfax, Tunisia
Contents

Digital Economy and E-commerce Technologies


Digital Economy and High-Quality Development
of Manufacturing Industry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Xiaoxia Xu and Ping Han
Research on the Mechanism of Promoting the Development
of High-Efficiency and Featured Tropical Agriculture in Hainan
Relying on the Digital Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Fan Jiang, Shaoqing Tian, and Chongli Huang
The Impact of Digital Economy on China’s
Low-Carbon Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Fengge Yao and Lin Li
The Impact of the Digital Economy on Inclusive Growth——
Empirical Study Based on the Spatial Econometric Model . . . . . . . . . . 35
Fengge Yao and Xiaoyu Wang
Evolutionary Game Analysis of Digital Innovation Ecosystem
Governance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Liu Kewen and Liu Junji
Measurement and Comparison of Digital Economy Development
of Cities in China . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Ping Han and Jiao Li
Occupational Risks and Guarantee Path of the Delivery Man
in Digital Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Jin Daizhi and Wang Han
Research on E-commerce Recommended Algorithm Based
on Knowledge Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Yu Zhang, Jingming Ye, Xin Yue, Sifan Wei, Yu Wang, and Wanli Ren

Research on Regional Heterogeneity in the Impact of Digital
Inclusive Finance on the Diversification of Household Financial
Asset Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Shu-bo Jiang and Xiao-Han Zhao
Research on the Countermeasures for the Development
of Agricultural and Rural Digital Economy . . . . . . . . . . . . . . . . . . . . . . 98
Bai Ying and Jiao Jinpeng
Research on the Countermeasures of Digital Intelligence
Transformation and Upgrading of Innovative
Manufacturing Enterprises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Yanlai Li, Lili Zhai, Caixia Yang, and Kewen Liu

Data Mining and Association Rules


A Robust Matting Method Combined with Sparse-Coded Model . . . . . . 119
Guilin Yao, Huixin Yang, and Shirui Wu
Segmentation of Cervical Cell Cluster by Multiscale Graph
Cut Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Tao Wang
Source Code Author Identification Method Combining Semantics
and Statistical Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Xu Sun, Yutong Sun, Leilei Kong, Yong Han, and Hui Ning
The Improvement of Attribute Reduction Algorithm Based
on Information Gain Ratio in Rough Set Theory . . . . . . . . . . . . . . . . . . 152
Wenjing Wang, Min Guo, Tongtong Han, and Shiyong Ning
Accurate Teaching Data Analysis of “Special Effects Production
for Digital Films and TV Programmes” Based on Big Data . . . . . . . . . . 160
Fan Jing
An Improved Clustering Routing Algorithm Based on Leach . . . . . . . . 169
Jintao Yu and Yu Bai
Association Rules Mining Algorithm Based on Information Gain Ratio
Attribute Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Tongtong Han, Wenjing Wang, Min Guo, and Shiyong Ning
Based on the Inception and the ResNet Module Improving VGG16
to Classify Commodity Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Yuru Zhang and Yang Shen
Improved Algorithm of Multiple Minimum Support Association Rules
Based on Can Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
Xiao Zhao and Shi Yong Ning
Mining Association Rules of Breast Cancer Based on Fuzzy
Rough Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Min Guo, Tongtong Han, Wenjing Wang, and Shiyong Ning
Multi-task Model for Named Entity Recognition . . . . . . . . . . . . . . . . . . 225
Dequan Zheng, Baishuo Yong, and Jing Yang
Research on Non-pooling Convolutional Text Classification
Technology Combined with Attention Mechanism . . . . . . . . . . . . . . . . . 234
Hui Li, Zeming Li, Wei Zhao, and Xue Tan

Network Security and Blockchain


A New Credit Data Evaluation Scheme Based on Blockchain
and Smart Contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Yu Wang, Zheqing Tang, Peng Hong, Tianshi Wei, Shiyu Wang,
and Yong Du
An Algorithm of Image Encryption Based on Bessel Self-feedback
Chaotic Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Yaoqun Xu, Meng Tang, and Jingtao Fan
Evaluation of the Effect of Blockchain Technology Application
in the International Trade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Jinping Zhang and Ying Fan
Fraud Network Identification Model for Insurance Industry . . . . . . . . . 276
Jiaqiu Wang, Yining Jin, Zaiyu Jiang, Xueyong Hu, and Peng Wang
Global Supply Chain Information Compensation Model Based
on Free Trade Port Blockchain Information Platform . . . . . . . . . . . . . . 288
Shaoqing Tian, Fan Jiang, and Chongli Huang
Multi-layer Intrusion Detection Method Using Graph
Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
Ling Ma
Research on an Image Hiding Algorithm with Symmetric Means . . . . . 309
Hongxin Wang, Hui Li, and Xue Tan

Image Analysis and Processing


A Detection Network United Local Feature Points and Components
for Fine-Grained Image Classification . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Yong Du, Bin Yu, Peng Hong, Wei Pan, Yang Wang, and Yu Wang
A Random Walks Image Segmentation Method Combined with KNN
Affine Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
Xiaodong Su, Shizhou Li, Guilin Yao, Hongyu Liang, Yurong Zhang,
and Shirui Wu
Generative Adversarial Network Image Inpainting Based
on Dual Discriminators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
Jianming Sun, Jinpeng Wu, and Xuena Han
GrabCut Image Segmentation Based on Local Sampling . . . . . . . . . . . . 356
Guilin Yao, Shirui Wu, Huixin Yang, and Shizhou Li
Multi-angle Face Recognition Based on GMRF . . . . . . . . . . . . . . . . . . . 366
Sun Huadong, Zhao Pengfei, and Zhang Yingjing
Multi-scale Object Detection Algorithm Based on Faster R-CNN . . . . . . 379
Xiaodong Su, Yurong Zhang, Chaoyu Wang, Hongyu Liang,
and Shizhou Li
Persistent Homology Apply in Digital Images . . . . . . . . . . . . . . . . . . . . 392
Sun Huadong, Zhang Yingjing, and Zhao Pengfei
Research on the Style Classification Method of Clothing
Commodity Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
Yanrong Zhang and Rong Song
Topological Feature Analysis of RGB Image Based on Persistent
Homology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
Jian Ma, Lizhi Zhang, Huadong Sun, and Zhijie Zhao

Machine Learning and Deep Learning and their Applications


A Novel Deep Image Matting Approach Based on DIM Model . . . . . . . 429
Guilin Yao and Zhiwei Ma
Application of Bloch Spherical Quantum Genetic Algorithm
in Fire-Fighting Route Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
Wei Zhao, Xuena Han, Hui Li, Zeming Li, and Xue Tan
Chaotic Neural Network with Legendre Function Self-feedback
and Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
Yu Zhang, SiFan Wei, Yu Wang, and Yaoqun Xu
Deep Reinforcement Learning for Resource Allocation in Multi-cell
Cellular Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
Ming Sun, Liangjin Hu, Yanchun Wang, Hui Zhang, and Jiaqiu Wang
Gauss Nonlinear Self-feedback Chaotic Neural Network
and Its Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
Nan Xu, Bin Zhou, and Yamin Wang
Guidance Prediction of Coupling Loop Based on Variable Universe
Fuzzy Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
Ming Zhao, Yang Liu, Hui Li, Yun Cao, Yuru Zhang, and Hao Jin
Heart Disease Recognition Algorithm Based on Improved Random
Forest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
HaiTao Xin and Hao Yu
Improved YOLOv3 Road Multi-target Detection Method . . . . . . . . . . . 501
Jingtao Fan, Yaoqun Xu, and Meng Tang
Research on Ctrip Customer Churn Prediction Model Based
on Random Forest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
Zhijie Zhao, Wanting Zhou, Zeguo Qiu, Ang Li, and Jiaying Wang

Business Intelligence and Communications


An Evolutionary Game Analysis of Product Crowdfunding
Opportunistic Behavior Considering Price Acquisition Model . . . . . . . . 527
Guang Yang, Yan Wen, and Kaiwen He
Research on the Model of Word-of-Mouth Communication in Social
Networks Based on Dynamic Simulation . . . . . . . . . . . . . . . . . . . . . . . . 537
Zhipeng Fan, Wen Hu, Wei Liu, and Ming Chen
Spatial Correlation Analysis of Green Finance Development Between
Provinces and Non-provinces Along the Belt and Road in China . . . . . . 547
ChenYang Zheng and Xiaohong Dong
Study on Decision-Making Behavior of Effective Distribution
of Fresh Agricultural Products After COVID-19 Epidemic . . . . . . . . . . 557
Yu Yang, Xin Rong, Jianjun Li, and Fang Yang
Study on the 2-Mode Network Characteristics of the Types and
Issuing Places of Chinese Provincial Green Bonds . . . . . . . . . . . . . . . . . 567
Yuanhao Xiao and Xiaohong Dong
Study on the Influence of Internet Payment on the Velocity of Money
Circulation in China . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577
Xiangbin Liu and Qiuming Liu
System Ordering Process Based on Uni-, Bi- and Multidirectionality –
Theory and First Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594
Bernhard Heiden, Bianca Tonino-Heiden, and Volodymyr Alieksieiev
The Impact of Intellectual Property Protection on China’s Import
of Computer and Information Service Trade: Empirical Research
Based on Panel Data of 34 Countries or Regions . . . . . . . . . . . . . . . . . . 605
Hui-ying Yang and Shi-kun Pang
The Influence Mechanism of Business Environment on the Allocation
of Entrepreneurship . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
Juan Li and Tian Zhang
The Transmission and Preventive Measures of Internet
Financial Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
Na Zhao and Fengge Yao
Emotion Analysis System Based on SKEP Model . . . . . . . . . . . . . . . . . 632
Zhang Yanrong, Zhang Yuxuan, and Xie Yunxi
Evaluation of Enterprise Development Efficiency Based
on AHP-DEA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
Wenli Geng and Mengyu Gao
Innovation Efficiency of Electronic and Communication Equipment
Industry in China . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653
Yi Song, Huolong Bi, and Yi Qu
Internet Financial Regulation Based on Evolutionary Game . . . . . . . . . 661
Shu-bo Jiang and Yao-yao Huang
Research on Task Scheduling Method of Mobile Delivery Cloud
Computing Based on HPSO Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 670
Jianjun Li, Junjun Liu, Yu Yang, and Fangyuan Su
Research on the Evaluation of Forestry Industry Technological
Innovation Ability Based on Entropy TOPSIS Method . . . . . . . . . . . . . 681
Shangkun Lu
Research on the Impact of Temporary Workers’ Psychological
Contract Fulfillment on Task Performance in the Sharing Economy . . . 689
Genlin Zhang, Linlin Tian, and Jie Xie
Research on the Influence of the Investment Facilitation Level
of the Host Country on China’s OFDI——From the Perspective
of Investment Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 703
Chaoqun Niu, Shuying Lei, and Chengwen Kang

Information Technology and Applications


Adaptive Observer-Based Control for a Class of Nonlinear Stochastic
Systems with Parameter Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
Xiufeng Miao and Yaoqun Xu
Open Domain Question Answering Based on
Retriever-Reader Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 723
Dequan Zheng, Jing Yang, and Baishuo Yong
Does Network Infrastructure Improve the Information Efficiency
of Regional Capital Market?—Quasi Natural Experiment Based
on “Broadband China” Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
Guang Yang, Dengping Li, and Yan Wen
Optimized Coloring Algorithm Based on Non-local Neighborhood
Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 744
Haitao Xin and Ezhen Peng
Research on Digital Twin for Building Structure Health Monitoring . . . 754
Su Jincheng
Research on Modeling and Solution of Reactive Power Optimization
in Time Period of Power System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761
Chen Deyu
Research on Task Allocation Method of Mobile Swarm Intelligence
Perception Based on Hybrid Artificial Fish Swarm Algorithm . . . . . . . . 775
Jianjun Li, Fangyuan Su, Yu Yang, and Junjun Liu
Research on Task Allocation Model of Takeaway Platform Based
on RWS-ACO Optimization Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 786
Li Jianjun, Xu Xiaodi, Yang Yu, and Yang Fang
Utilizing Vias and Machine Learning for Design and Optimization
of X-band Antenna with Isoflux Pattern for Nanosatellites . . . . . . . . . . 796
Maha A. Maged, Ahmed Youssef, Fatma Newagy,
Mohammed El-Telbany, Ramy A. Gerguis, Nayera M. Abdulghany,
George S. Youssef, and Youssef Y. Zaky
Research on Automatic Generation System of the Secondary Safety
Measures of Smart Substation Based on Archival Data . . . . . . . . . . . . . 805
Sun Liye
Decision-Making Framework of Supply Chain Service Innovation
Based on Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 817
Haibo Zhu
Determination of Headways Distribution Between Vehicles
on City Streets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 826
Andrii Bieliatynskyi, Oleksandr Stepanchuk, and Oleksandr Pylypenko
Does Digital Finance Promote the Upgrading
of Industrial Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 836
Yi Qu, Hai Xie, and Xing Liu
Electricity Consumption Prediction Based on Time Series Data
Features Integrate with Long Short-Term Memory Model . . . . . . . . . . 844
Jiaqiu Wang, Hao Mou, Hai Lin, Yining Jin, and Ruijie Wang
Intelligent Search Method in Power Grid Based on the Combination
of Elasticsearch and Knowledge Graph . . . . . . . . . . . . . . . . . . . . . . . . . 854
Jiaqiu Wang, Xinhua Yang, Yining Jin, Xueyong Hu, and Le Sun
Longitudinal Guidance Control of Landing Signal Officer Based
on Variable Universe Fuzzy Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 865
Ming Zhao, Yang Liu, Hui Li, Yun Cao, Jian Xu, and Guilin Yao

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 873


Digital Economy and E-commerce
Technologies
Digital Economy and High-Quality Development
of Manufacturing Industry

Xiaoxia Xu(B) and Ping Han

Faculty of Economics, Harbin University of Commerce, Harbin 150028, Heilongjiang, China

Abstract. We try to establish the index system of high-quality development of
the manufacturing industry from the five dimensions of innovation, coordination,
green, sharing, and opening. The development level of the digital economy is
measured from three dimensions: digital infrastructure, digital popularization,
and the application depth of the digital economy. Based on panel data of 30
provinces from 2011 to 2019, this paper uses an econometric model to explore
the influence of digital economy development on the manufacturing industry. The
results show that the digital economy has a significant effect on the development
of the manufacturing industry, but there is obvious regional heterogeneity: the
effect is significant in the eastern, central, and northeast regions, while it is not
significant in the western regions.

Keywords: Digital economy · High-quality development of manufacturing
industry · Regional heterogeneity

1 Introduction
China’s “14th five-year plan” emphasizes the need to promote the deep integration of
the Digital Economy and the real economy, enabling the transformation and promoting
of traditional manufacturing industries, engendering new industries, new modes of busi-
ness, and new engines of economic development. Manufacturing is the most important
part of the real economy. Promoting the high-quality development of the manufacturing
industry is an important strategic measure to strengthen the country through manufactur-
ing and speed up the construction of a new development pattern. This paper discusses the
internal mechanism of the influence of the digital economy on the manufacturing industry
and applies a panel data model to analyze the heterogeneity of regional influence.
The measurement system of the digital economy is developed around three dimensions:
digital infrastructure, the scale of the digital economy, and the application of mobile
digitalization (Wei Zhuangyu 2020) [1]. Xu Xianchun (2020) believes that the scope of
the digital economy can be defined and calculated using the structure coefficient of
industry value-added, the adjustment coefficient of the digital economy, and the growth
rate of industry [2]. Yu Shan and Fan Xiufeng (2021) suggested that secondary indexes
can be selected from the development of the communication industry, such as the
Internet penetration rate, the total amount of communication services, and the length
of mobile optical cable [3].


The impact of the digital economy enabling the transformation and upgrading of
the manufacturing industry is studied mainly from two aspects: how the digital
economy improves the quality and efficiency of manufacturing, and how the digital
economy promotes value creation in manufacturing. Li Chunfa (2020) argues that the
digital economy can expand the boundaries of the manufacturing division of labor, reduce
transaction costs, shift value distribution, and improve manufacturing productivity.
It can also cause changes in demand and stimulate consumption potential, forcing the
manufacturing sector to upgrade [4]. Jiao Yong (2020) holds that the impact of the digital
economy on the manufacturing industry can be divided into short-term and long-term
effects: the short-term impact is value remolding, and the long-term impact is value
creation, shifting from factor-driven to innovation-driven and from product-driven to
demand-driven forms [5]. Li Yingjie and Han Ping (2021) believe that the digital economy
promotes the high-quality development of the manufacturing industry mainly through
quality change, efficiency change, and power change [6].

2 Mechanism Analysis and Hypothesis Development

2.1 Reshaping the Ecology of the Manufacturing Industry

Digital information technology is gradually breaking the limitations of space and
time and expanding the possibilities of production technology. The whole production
possibility curve of the manufacturing industry moves outward, and production
efficiency is improved. With the development of the digital economy, the division of
labor among enterprises in the industrial chain has become more refined. Because of
limited factors, manufacturing enterprises have gradually shifted from the original
pursuit of "big and complete" to "specialized" production models. At the same time,
each node of the industrial chain will derive more new industries and new forms of
business. The digital economy changes the value distribution of the manufacturing
industry chain [7]. The "smile curve" shows the value distribution of the traditional
manufacturing industry chain: the value captured by upstream R & D enterprises and
downstream marketing service enterprises is relatively high, while the lowest part of
the "smile curve" is occupied by assembly and processing enterprises. With the
development of the digital economy, artificial intelligence, the Internet of Things, and
other technologies, the production efficiency and value-added space of the
manufacturing industry chain will be improved, and the "smile curve" will become
smoother.

2.2 Reducing Transaction Costs

The digital economy can reduce the transaction costs of the manufacturing industry.
First, it can reduce information asymmetry between enterprises, between enterprises
and consumers, and in the allocation of production factors. In the era of the digital
economy, the rapid spread of technologies such as big data and the Internet has
greatly improved the transparency and openness of the market while reducing the
problems of "moral hazard" and "adverse selection" arising from information
asymmetry between consumers and producers. The correspondence between producers
and production factors is greatly enhanced, and the factor allocation efficiency of the
entire society improves markedly. Second, the birth of the giant platform economy is
itself the embodiment of reduced transaction costs. Essentially, the platform's business
content is not the product itself but the reduction of transaction costs. Whether it is the
"sharing economy model", an online shopping platform, or a takeaway platform, the
core idea is to collect idle social resources and establish a communication platform
between producers and consumers, so as to reduce the transaction costs of both sides
and improve the matching degree between supply and demand.

2.3 Stimulating Demand Creation

The digital economy has stimulated demand creation in the manufacturing sector,
which is mainly reflected in two aspects. First, with the rise of big data analytics,
Internet platforms can accurately depict consumer preferences. As a special production
factor, consumer demand preferences flow backward along the industrial chain, so
producers can capture changes in consumption more quickly and adjust production
plans accordingly. Second, with the rapid integration of traditional industry and the
Internet, the manufacturing industry has undergone great changes, gradually
transforming from large-scale assembly-line production to mass customization.
Through digital information technology, the producer collects and analyzes data over
the product's complete life cycle [8]. Demand is communicated immediately, and the
production line realizes flexible adjustment driven by the data flow. In conclusion, the
research hypotheses of this paper are put forward:

Hypothesis 1. The development of the digital economy is conducive to the
high-quality development of the manufacturing industry.
Hypothesis 2. The influence of the digital economy on the high-quality development
of the manufacturing industry has obvious regional heterogeneity.

3 Status Analysis
3.1 Index System Design

According to the economic development principle of "innovation, coordination, green,
sharing and opening", this paper constructs the index system of high-quality
development of the manufacturing industry, as shown in Table 1; the comprehensive
score of regional manufacturing high-quality development is obtained by the entropy
weight method.
Level of development of the digital economy. In constructing the evaluation system
of the digital development level, this paper draws on the practice of Shen Yunhong
(2020) [9] and Wei Zhuangyu (2021) [1], building the evaluation system from three
aspects: the development level of digital economy infrastructure, digital
popularization, and the application depth of digital economy terminals. The specific
indicator system is shown in Table 2:

Table 1. High-quality development index system of manufacturing industry

High-quality index | Connotation | Type
Innovation | R & D / GDP | Positive index
Harmony | Share of high-end manufacturing | Positive index
Greening | Investment in industrial pollution control as a proportion of gross industrial product | Positive index
Opening up | Share of manufacturing export delivery value in manufacturing sales output value | Positive index
Shared transformation | Proportion of workers' remuneration in industrial value added | Positive index

Table 2. Index system of digital economy development level

Primary index | Secondary index
Digital infrastructure | Cable line density
The popularity of digitization | Internet broadband access port; Mobile phone penetration
Digital economy applications | Software revenue; Enterprise information level

3.2 Entropy Weight Method

Determination of the Weight of the Indicator System. Based on the five
development concepts of "innovation, coordination, green, open and sharing", a
comprehensive index system for high-quality development of the manufacturing
industry has been established; the development level of the digital economy is
likewise measured from the development level of digital economy infrastructure, the
intensity of digital popularization, and the application depth of digital economy
terminals. The concrete steps are as follows:

Step 1: To eliminate the influence of dimension on the comprehensive evaluation, the
indexes are made dimensionless:

Positive index: $x'_{ij} = \dfrac{x_{ij} - \min_i x_{ij}}{\max_i x_{ij} - \min_i x_{ij}}$   (1)

Negative index: $x'_{ij} = \dfrac{\max_i x_{ij} - x_{ij}}{\max_i x_{ij} - \min_i x_{ij}}$   (2)

Step 2: Calculate the weight $Y_{ij}$ of index $j$ in year $i$: $Y_{ij} = x'_{ij} \big/ \sum_{i=1}^{n} x'_{ij}$.

Step 3: Calculate the information entropy of index $j$: $e_j = -\dfrac{\sum_{i=1}^{n} Y_{ij} \ln(Y_{ij})}{\ln(n)}$   (3)

Step 4: Calculate the information entropy redundancy of index $j$: $d_j = 1 - e_j$   (4)

Step 5: Calculate the weights: $w_j = \dfrac{d_j}{\sum_{j=1}^{m} d_j}$   (5)
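For concreteness, the following is a minimal Python sketch of Steps 1–5 (not from the paper; the function name, the toy data, and the small epsilon guarding against log(0) are our own assumptions, and every indicator column is assumed to vary):

```python
import numpy as np

def entropy_weights(X, positive):
    """Entropy-weight method following Eqs. (1)-(5).
    X: array of shape (n, m), n observations by m indicators.
    positive: boolean array of length m (True = positive index)."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    span = X.max(axis=0) - X.min(axis=0)          # assumes each column varies
    # Eqs. (1)/(2): direction-aware min-max normalisation
    Z = np.where(positive,
                 (X - X.min(axis=0)) / span,
                 (X.max(axis=0) - X) / span)
    # Step 2: share of observation i in indicator j (epsilon avoids log(0))
    P = (Z + 1e-12) / (Z + 1e-12).sum(axis=0)
    e = -(P * np.log(P)).sum(axis=0) / np.log(n)  # Eq. (3): entropy
    d = 1.0 - e                                   # Eq. (4): redundancy
    return d / d.sum()                            # Eq. (5): weights

# toy example: 9 years x 5 indicators, all treated as positive indexes
X = np.random.default_rng(0).random((9, 5))
print(entropy_weights(X, positive=np.array([True] * 5)))  # sums to 1
```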

Using the above method to calculate the index’s weight in the evaluation system, the
concrete results are shown in Table 3.

Table 3. Index weights of manufacturing and digital economy development level

System | Index | Entropy value | Degree of redundancy | Weight
Manufacturing | Innovation | 0.955 | 0.045 | 0.203
Manufacturing | Harmony | 0.972 | 0.028 | 0.126
Manufacturing | Greening | 0.941 | 0.059 | 0.267
Manufacturing | Opening up | 0.955 | 0.045 | 0.205
Manufacturing | Shared transformation | 0.956 | 0.044 | 0.200
Digital economy | Cable line density | 0.873 | 0.127 | 0.327
Digital economy | Internet broadband access port | 0.941 | 0.059 | 0.151
Digital economy | Mobile phone penetration | 0.974 | 0.026 | 0.068
Digital economy | Software revenue | 0.833 | 0.167 | 0.431
Digital economy | Enterprise information level | 0.991 | 0.009 | 0.023

Analysis of Measurement Results. From 2011 to 2014, the high-quality development
level of China's manufacturing industry was on the rise, driven mainly by innovation,
coordination, and greening. From 2015 to 2019, the development level of China's
high-quality manufacturing industry declined slightly, mainly because of the decline in
the opening-up level, which crowded out the economic benefits of innovation and
synergy. Unlike the trend of high-quality development in the manufacturing industry,
the development of the digital economy in China has been increasing year by year. In
2011, the average score of the development level of the digital economy in China was
1.66; by 2019, the score had risen to 5.95. From the perspective of regional economic
development, the development level of China's digital economy presents obvious
regional heterogeneity. The development level of the eastern region is higher than that
of the rest of the country. The growth rate of the northeast region is relatively slow; the
growth rate of the digital economy in central China is relatively fast, and the
development level of its provinces is relatively balanced except for Shanxi. The
western region is the most backward region in the development of the digital economy
in China, but it shows good momentum and a rising trend year after year.

4 Empirical Analysis
4.1 Variable Design
1. Explained variable: the high-quality development level of manufacturing.
   Indicators are selected according to the above measurement results.
2. Core explanatory variable: the development level of the digital economy. The
   above measurements inform the selection of indicators.
3. Control variables: To reduce the lack of robustness of the econometric model
   caused by omitted variables, this paper selects the following control variables
   according to current academic research results: (1) Human capital (Edu): the paper
   takes primary, secondary, high-school, and higher education into consideration.
   (2) Foreign direct investment (FDI): this paper uses the ratio of provincial-level
   FDI to provincial-level GDP as the proxy variable. (3) Government support (Gov):
   the government's role in transforming and upgrading the manufacturing sector is
   expressed as the share of annual regional fiscal expenditure in GDP.
   (4) Urbanization (Urb): the proportion of the urban population in the region's total
   population is taken as the basis for measuring regional urbanization. (5) Financial
   development (Fin): measured by the ratio of the balance of deposits and loans to
   regional GDP at the end of the year. (6) Level of economic development (Dev):
   this paper uses GDP per capita to measure the level of economic development.

4.2 Econometric Model Setting and Descriptive Statistics of Variables

To test the hypothesis that the digital economy is conducive to the high-quality
development of the manufacturing industry, the following econometric model is
established:

$$\ln y_{it} = c + \beta_1 \ln digital_{it} + \sum_{j=1}^{n} \alpha_j \ln X_{it}^{j} + \mu_i + \nu_t + \varepsilon_{it} \quad (6)$$

In model (6), $y_{it}$ represents the high-quality development level of manufacturing in
area $i$ at time $t$, $digital_{it}$ represents the digital economy development level of area $i$
at time $t$, $c$ is a constant term, $n$ is the number of control variables, and $X_{it}^{j}$ is the
$j$-th control variable; $\mu_i$, $\nu_t$, and $\varepsilon_{it}$ denote regional heterogeneity, time
heterogeneity, and the random error, respectively.
Considering the availability and completeness of data, this paper adopts data from
2011 to 2019 for 30 provinces, excluding Tibet. Table 4 gives descriptive statistics
of the explained variable, explanatory variables, and control variables.
Table 4. Descriptive statistics of variables

Variable | N | Mean | Std | Min | Max
lny | 270 | 1.222 | 0.413 | 0.248 | 2.320
lndigital | 270 | 0.753 | 1.051 | −1.477 | 3.122
lnedu | 270 | 2.214 | 0.093 | 1.912 | 2.540
fdi | 270 | 0.059 | 0.055 | 0.008 | 0.251
gov | 270 | 0.264 | 0.115 | 0.119 | 0.758
urb | 270 | 0.576 | 0.123 | 0.343 | 0.941
fin | 270 | 3.163 | 1.140 | 1.529 | 7.901
lndev | 270 | 10.757 | 0.442 | 9.691 | 12.011
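The paper does not state its estimation software. As a hedged sketch, model (6) with two-way fixed effects can be estimated via the within transformation, here in Python with pandas and statsmodels; the file name "panel_2011_2019.csv" and the column names are hypothetical:

```python
import pandas as pd
import statsmodels.api as sm

# hypothetical long-format panel: 30 provinces x 9 years (2011-2019)
df = pd.read_csv("panel_2011_2019.csv")  # columns: province, year, lny, ...
cols = ["lny", "lndigital", "lnedu", "fdi", "gov", "urb", "fin", "lndev"]

# two-way within transformation: subtract province and year means and add
# back the grand mean, which absorbs mu_i and nu_t in model (6)
dm = (df[cols]
      - df.groupby("province")[cols].transform("mean")
      - df.groupby("year")[cols].transform("mean")
      + df[cols].mean())

# no constant is added: it is absorbed by the demeaning
fit = sm.OLS(dm["lny"], dm.drop(columns="lny")).fit(
    cov_type="cluster", cov_kwds={"groups": df["province"]})
print(fit.summary())
```

Clustering the standard errors by province is our own addition for robustness; the paper reports conventional t statistics.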

4.3 Empirical Test and Result Analysis

Unit Root Test. Panel data combine the characteristics of time series and
cross-sectional data. In regression analysis, a common time trend may produce
spurious significance, so the panel data should be subjected to a unit root test before
regression analysis.

According to the test results, each variable rejects the null hypothesis of a unit root
at the 1% significance level. All variables are stationary sequences, and there is no
spurious regression in the panel data regression.
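The paper does not specify which panel unit-root test was applied (LLC or IPS would be typical). As a rough, hedged stand-in, the sketch below runs a series-by-series ADF test, reusing the hypothetical dataframe from the previous sketch; with only nine annual observations per province its power is very limited, so this is illustrative only:

```python
from statsmodels.tsa.stattools import adfuller

# reuse the hypothetical panel dataframe df from the previous sketch
for var in ["lny", "lndigital"]:
    pvals = [adfuller(g[var].to_numpy(), maxlag=1)[1]   # [1] = p-value
             for _, g in df.groupby("province")]
    share = sum(p < 0.05 for p in pvals) / len(pvals)
    print(f"{var}: share of provinces rejecting a unit root at 5% = {share:.2f}")
```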

Empirical Results Analysis. To examine the impact of the development level of the
digital economy on high-quality manufacturing development and to obtain more
accurate estimates, ordinary least squares (OLS), the fixed effects model (FE), and
the random effects model (RE) were all used to measure the effect (Table 5).

First, the model is estimated by a pooled regression model and a fixed-effects model,
and the choice between them is tested using the F statistic. The results show that the
individual intercept terms differ, so the pooled model is unsuitable. Using the
random-effects model and the Hausman test, the null hypothesis is rejected at the 1%
significance level, which shows that the random disturbance is correlated with the
explanatory variables, so the fixed-effects model should be chosen.
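The Hausman statistic used for this choice is straightforward to compute by hand. A minimal sketch, assuming the FE and RE coefficient vectors and covariance matrices are already at hand from any estimator; the pseudo-inverse guards against a non-positive-definite difference:

```python
import numpy as np

def hausman(b_fe, V_fe, b_re, V_re):
    """Hausman test: H = q' (V_FE - V_RE)^+ q with q = b_FE - b_RE.
    Under H0 (RE is consistent and efficient), H ~ chi2(k)."""
    q = np.asarray(b_fe) - np.asarray(b_re)
    H = float(q @ np.linalg.pinv(V_fe - V_re) @ q)
    return H, q.size  # statistic and degrees of freedom
```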
The regression coefficient of the development level of the digital economy is 0.293
and passes the test at the 0.05 significance level; without the control variables, the
regression coefficient of the digital economy on manufacturing quality is 0.056. It can
be concluded that the development level of the digital economy promotes the
high-quality development of the manufacturing industry in China: the higher the level
of the digital economy, the better the development of the manufacturing industry.
Increasing investment in digital infrastructure construction, the popularization of the
digital economy, and the application of digital economy terminals will help promote
the high-quality development of China's manufacturing industry. So Hypothesis 1 is
validated.

Table 5. Baseline regression results

Variable | OLS (1) | OLS (2) | FE (1) | FE (2) | RE (1) | RE (2)
lndigital | 0.239*** (5.74) | 0.007 (0.11) | 0.056* (1.84) | 0.293** (2.28) | 0.121*** (4.56) | 0.160** (2.55)
lnedu | | −0.988** (−2.15) | | 0.881 (1.21) | | 0.189 (0.32)
fdi | | 1.667** (2.72) | | −3.378*** (−4.13) | | −0.259 (−0.37)
gov | | −0.667** (−2.51) | | 1.749** (2.30) | | −0.217 (−0.59)
urb | | 0.979** (2.22) | | −1.905** (−2.25) | | 1.088** (2.10)
fin | | 0.101* (4.04) | | 0.174*** (3.12) | | 0.112*** (2.95)
lndev | | 0.165 (0.94) | | 0.416* (1.70) | | 0.391*** (2.66)
cons | 1.042*** (18.92) | 0.821*** (0.48) | 1.179*** (44.37) | 3.806 (1.31) | 1.131*** (22.28) | 3.976** (2.26)
R-squared | 0.3695 | 0.5333 | 0.5454 | 0.4050 | 0.5454 | 0.5420
F/Wald | 32.99 [0.0000] | 45.21 [0.0000] | 3.37 [0.0675] | 7.44 [0.0000] | 20.83 [0.0000] | 58.18 [0.0000]
Hausman | 58.14 [0.0000]
N | 270 | 270 | 270 | 270 | 270 | 270
Note: *, **, *** denote significance at the 10%, 5% and 1% levels respectively; columns (1)
exclude and columns (2) include the control variables. Values in () under the FE model are
t statistics, and under RE are Z statistics.

On the one hand, the digital economy supported by 5G technology and the Internet
of Things can reduce information asymmetry, accelerate the flow of production factors,
improve resource allocation, and reduce transaction costs. On the other hand, with the
development of the digital economy, the manufacturing industry has gradually changed
from a traditional product orientation to demand customization, which is the embodiment
of value creation in the manufacturing industry. Hypothesis 1 is verified. The control
variables:
(1) Human capital (Edu). The regression coefficient of human capital is 0.881. The
accumulation of human capital supports the manufacturing industry and improves
innovation efficiency and management level. (2) Foreign direct investment (FDI). In
theory, foreign direct investment can help the manufacturing industry learn advanced
management experience. However, in the fixed-effects model, the FDI coefficient was
−3.378 and passed the hypothesis test at the 1% significance level. The reason is that
since 2009 the growth rate of actually utilized FDI in China has slowed markedly, and
the amount and proportion of FDI actually used in the secondary sector of the
economy have declined in recent years. (3) Government support (Gov). In the
fixed-effects model, the coefficient of government support was 1.749, passing the test
at the 5% significance level. This shows that the government plays a role in promoting
the upgrading of the manufacturing industry, and fiscal expenditure as a capital
element promotes the high-quality development of the manufacturing industry.
(4) Urbanization (Urb). In the fixed-effects model, the coefficient of urbanization is
−1.905, significant at the 5% level, which shows that the degree of urbanization
significantly inhibits the high-quality development of the manufacturing industry,
because urbanization raises manufacturing employment costs and can bring negative
effects such as the hollowing out of industries. (5) Financial development (Fin). In the
fixed-effects model, the coefficient of the financial development level is 0.174,
passing the test at the 5% significance level. This shows that financial services play an
important role in promoting the development of the manufacturing industry.
(6) Level of economic development (lndev). The regression coefficient of the
economic development level was 0.416, passing the test at the 10% significance level.
This indicates that economic growth is largely consistent with high-quality
development in the manufacturing sector.

4.4 Robustness Test

In order to ensure the reliability of the empirical test results, this paper tests the
robustness of the model. It divides the sample of 30 provinces, cities, and autonomous
regions into the eastern, central, western, and northeastern regions according to the
principle of dividing the four economic regions, to examine the heterogeneity between
regions. Table 6 shows that the development level of the digital economy has positive
effects in the eastern, northeastern, central, and western regions, but the positive effect
in the western region is the least obvious, indicating that the development level of the
digital economy in the western region is still relatively low. The effect on promoting
the high-quality development of the manufacturing industry is strongest in the
northeast region; this may be because the sample size of the three northeastern
provinces is relatively small, making the regression coefficient of the variable appear
abnormal. In general, China's digital economy has a positive effect on the high-quality
development of manufacturing, but there is obvious heterogeneity among regions.
Hypothesis 2 is thereby verified.
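The regional subsamples can be re-estimated with the same machinery. Below is a sketch reusing df, cols, and sm from the earlier fixed-effects example, with a hypothetical "region" column carrying the four labels; note the paper selects FE or RE per region via the Hausman test, while the within estimator stands in here for simplicity:

```python
# per-region re-estimation of model (6); 'region' is an assumed column
for region, g in df.groupby("region"):
    d = (g[cols]
         - g.groupby("province")[cols].transform("mean")
         - g.groupby("year")[cols].transform("mean")
         + g[cols].mean())
    fit = sm.OLS(d["lny"], d.drop(columns="lny")).fit()
    print(region, "lndigital coefficient:", round(fit.params["lndigital"], 3))
```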

Table 6. Regional heterogeneity of robustness tests

Variable | Eastern | Northeastern | Central | Western
lndigital | 0.215* (1.45) | 0.614*** (3.18) | 0.307** (2.43) | 0.032 (0.28)
lnedu | 0.623 (0.54) | 0.474 (0.14) | −0.787 (−1.12) | 0.454** (0.47)
fdi | −1.738** (−2.28) | −4.632* (−1.57) | −7.814*** (−2.94) | −4.529* (−1.38)
gov | 2.607** (2.21) | −0.197 (−0.13) | 0.454 (0.38) | −0.113** (−0.17)
urb | −1.972** (−2.52) | −5.153*** (−2.94) | 2.387* (1.78) | 3.491** (2.31)
fin | 0.167* (1.85) | 0.266** (2.10) | 0.235*** (3.07) | 0.067 (0.75)
lndev | 0.494* (1.97) | 0.195 (0.28) | 0.640 (1.63) | 0.708** (2.03)
cons | 5.657 (1.51) | 4.528 (0.48) | 7.712* (1.70) | 5.715* (1.77)
R-squared | 0.3293 | 0.9998 | 0.9802 | 0.1691
F/Wald | 4.81 [0.0000] | 51.91 [0.0000] | 135.72 [0.0000] | 7.94 [0.3375]
Hausman | 37.32 [0.0000] | 0.25 [0.8829] | 4.96 [0.4209] | 22.25 [0.0450]
Model type | FE | RE | RE | RE
N | 90 | 27 | 54 | 99

5 Conclusions and Recommendations

Integrating the manufacturing industry with the digital economy and improving the
manufacturing industry's digital level is an urgent problem to be solved. Based on
provincial panel data from 2011 to 2019, this paper investigates the influence of the
development level of the digital economy on the manufacturing industry by using
fixed-effects and threshold-effects models.

Findings:

1. The digital economy promotes the high-quality development of the manufacturing
   industry; increasing investment in digital infrastructure construction, the
   popularization of the digital economy, and the application of digital economy
   terminals will promote the high-quality development of the manufacturing
   industry in China.
2. The influence of the digital economy on the high-quality development level of the
   manufacturing industry shows obvious regional heterogeneity, with significant
   effects in the eastern, central, and northeast regions.

Based on the research conclusions, this paper puts forward the following
countermeasures and suggestions, expecting the digital economy to play a better role
in promoting the high-quality development of the manufacturing industry.
At the enterprise level: Manufacturing enterprises should vigorously promote the
construction of digital platforms and make full use of big data, artificial intelligence,
and other digital information technologies to push the manufacturing industry in an
intelligent, efficient direction. At the same time, innovative application scenarios that
integrate digital technologies promote the diffusion of advanced technology and the
innovation and upgrading of the industrial chain, thereby deriving more new
industries and new forms of business. Enterprises should strengthen cooperation and
sharing with scientific research institutions and universities to improve the efficiency
of technology transformation.
At the industrial organization level: We should dig deep into the value of the digital
economy, cultivate and develop new industries, upgrade traditional industries, and
push the industrial chain and value chain toward the high end. In terms of industrial
layout, we should follow the law of regional heterogeneity: the distribution of
manufacturing under the digital economy should be based on differences in regional
factor endowments and location advantages, adapt to local conditions, and prevent
the malignant competition caused by homogeneous development of the industry.
At the macro-policy level: Develop top-level design for the digital economy and
create a good business environment for high-quality manufacturing development.
Accelerate the deployment of 5G, the Internet of Things, artificial intelligence, cloud
computing, and other new-generation digital information technologies. At the same
time, construct information sharing platforms between government and enterprises,
among enterprises, and between enterprises and consumers.

References
1. Wei, Z.Y., Li, Y.T., Wu, K.D.: Can the digital economy promote the high-quality development
   of the manufacturing industry? An empirical analysis on provincial panel data. J. Wuhan
   Finan., 37–45 (2021)
2. Xu, X.C., Zhang, M.H.: A study on the scale of Chinese digital economy – from the perspective
   of international comparison. China's Ind. Econ. (05), 23–41 (2020)
3. Yu, S., Fan, X.F., Jiang, H.W.: The impact of digital economy on China's manufacturing
   quality going-out – from the perspective of export technological complexity enhancement.
   J. Guangdong Univ. Finan. Econ. 36(02), 16–27 (2021)
4. Li, C.F., Chou, C.: The mechanism of digital economy driving the transformation and
   upgrading of manufacturing industry – an analysis from the perspective of industrial chain.
   Bus. Res. (02), 73–82 (2020)
5. Jiao, Y.: Transformation of economy enabling manufacturing: from value remolding to value
   creation. Economist (06), 87–94 (2020)
6. Li, Y.J., Han, P.: Mechanism and path of high-quality development of manufacturing industry
   in digital economy. Macroecon. Manag. (05), 36–45 (2021)
7. Mohsen, A.: The impact of 5G on the evolution of intelligent automation and industry
   digitization. J. Ambient Intell. Hum. Comput. (2021)
8. Gaffley, G., Pelser, T.G.: Developing a digital transformation model to enhance the strategy
   development process for leadership in the South African manufacturing sector. South Afr. J.
   Bus. Manag. 52(1), 12 (2021)
9. Shen, Y.H., Huang, H.: The impact of digital economy on the optimization and upgrading of
   manufacturing industry structure. 40(03), 147–154 (2021)
Research on the Mechanism of Promoting
the Development of High-Efficiency
and Featured Tropical Agriculture in Hainan
Relying on the Digital Economy

Fan Jiang1(B), Shaoqing Tian1, and Chongli Huang2

1 College of Applied Science and Technology, Hainan University, Haikou 570228, China
2 Library, Hainan University, Haikou 570228, China

Abstract. From the perspective of current economic development trends, the digital
economy has become a new economic growth point. As China's only free trade port,
the Hainan Free Trade Port (FTP) provides new opportunities for Hainan to develop
characteristic, high-efficiency tropical modern agriculture. However, Hainan has a
relatively low starting point for developing High-Efficiency and Characteristic
Tropical Agriculture (HECTA). It is necessary to explore an effective mechanism for
promoting the development of HECTA in Hainan relying on the digital economy: to
establish and improve a cooperation mechanism that realizes full coverage of digital
infrastructure construction, a data sharing mechanism, a talent team training
mechanism, and a guarantee mechanism at the legal and normative level.

Keywords: Digital economy · Hainan tropical high-efficiency · Characteristic
agriculture · Mechanism

1 Introduction

The outbreak of the COVID-19 epidemic directly forced people into social distancing.
To reduce contact while maintaining normal production and life, people turned to
digital tools: online offices, online education, online medical care, and online games.
Fresh-food e-commerce, as a representative of the new formats of the digital economy,
quickly filled the gap. Reality has pushed the digital economy to become a new growth
point for economic development, giving the economy new opportunities in the crisis.
Depending on the duration of the epidemic, the growth rate of the digital economy is
estimated at about 2.7–3 times the GDP growth rate, which has a significant effect on
stabilizing economic growth [1]. The Chinese government work report pointed out:
"E-commerce, online shopping, online services, and other business formats have
played an important role in the fight against the epidemic. We must continue to
introduce supporting policies, comprehensively promote 'Internet+,' and create new
advantages in the digital economy." The Hainan FTP is China's only free trade port,
and from its development perspective the digital economy will also become an
important starting point for the high-quality development of the FTP.


2 Related Literature Research and Analysis

2.1 Digital Economy Is Gradually Becoming a Hotspot in Economics Research

The digital economy has gradually become a research hotspot in economics. It is an
economic concept covering the recognition, selection, filtering, storage, and use of big
data (digital knowledge and information) in order to guide and realize the rapid,
optimal allocation and regeneration of resources and to achieve high-quality economic
development. The essence of the digital economy is a new form of resource allocation
optimization that takes the shared network platform as its organizational form and
information technology as its means [2, 3, 8]. Wei et al. [4] believe that digital
technology has given birth to a digital economy with data as a key production factor,
in which the identity of participants is relatively vague and attention focuses more on
product use and service extension, a form different from the traditional economy.
A search of CNKI shows that the earliest research on the digital economy in the CNKI database can be traced back to 1994, when Yuan Zhengguang described the digital revolution as a new economic war and took the worldwide development trend of digital technology as the main field of view of his theoretical analysis. After 1994, however, the digital economy did not immediately become a research hotspot: CNKI indexed only 3,714 relevant research documents from 1994 to 2020. Later, with the continuous development of information technology in China, more and more academic achievements focused on the digital economy. On December 30, 1996, the US Business Week proposed that an economic trend had emerged at the end of the 20th century, namely the revolution in information technology, which exerted a certain influence on the production and sale of intangible goods and services (such as audio-visual products and insurance) as well as tangible goods. In 1999, the Heilongjiang Finance Specialty Research Group proposed that the rapid development of information technology had created a virtual world for humanity, that this virtual world had completely broken the spatial distance barriers of the material world, and that the digital economy would surely become a system. The US Department of Commerce issued the report "The Emerging Digital Economy II", which analyzed the development of e-commerce. Yang proposed a reflection on the digital economy: is the digital economy a bubble economy? Generally speaking, before the arrival of the 21st century, people's understanding of the digital economy was neither profound nor comprehensive, which had a lot to do with the newness of the information technology revolution. Some scholars believed that the advent of the digital economy was inevitable, while others remained on the sidelines or were even skeptical. Especially under the impact of the Asian financial crisis of 1998, many scholars were cautious about whether the digital economy would bring about a bubble economy. Ren pointed out that the trend of the digital economy was unstoppable, but that its status had been exaggerated.
Since 2000, under the influence of China's accession to the WTO, academic circles have gradually become more rational in their understanding of the digital economy. Lu proposed that China's entry into the WTO would give the development of China's IT industry an unprecedented sense of urgency, and that Hong Kong should be built into an "information center" dominated by information service industries to drive the Pearl River Delta and even Guangdong as a whole, creating a new digital economy special zone for China in the 21st century. Since 2000, people have paid more and more attention to issues related to the digital economy and have begun to interpret and research it from multiple perspectives, such as consumer rights and digital infrastructure. Following a layout spreading from the coast to the inland, cities have also begun to emphasize the digital economy from an official perspective, and countries around the world have begun to compete in developing it. The development practice of the digital economy has thus driven the development of its theory.

2.2 The Development and Integration of the Digital Economy and Agriculture Is the Focus of Attention in the Economic Field

With the deepening understanding of the digital economy, it has begun to be further integrated with various fields. With the emergence of the COVID-19 epidemic, the development of the digital economy has accelerated, and its organic integration with "agriculture, rural areas, and farmers" has become the focus of attention in the economic field in the past two years. Digital agriculture is an advanced stage of agricultural modernization and the only way for China to move from an agricultural country to an agricultural power. According to statistics, in 2020 the scale of China's rural e-commerce reached 28,801.57 hundred million yuan (about 2.88 trillion yuan), maintaining a very fast growth rate. The important position of e-commerce in agricultural product sales channels also highlights the integrated development of the digital economy and agriculture in China.
Chen X. M. [5] analyzed the positive effects of the emergence of the digital economy on the development of the agricultural and rural economy and put forward suggestions and countermeasures for the problems faced in integrating the two, aiming to further promote their in-depth integration. Li analyzed the income growth and income distribution effects of the rural digital economy and pointed out that the results of system integration show that the digital economy has a positive effect on farmers' income but also exacerbates the income gap, with an obvious income-increasing effect alongside a negative effect on income equity. They believe that the digital economy has a more significant impact on farmers' agricultural income than on their non-agricultural income, and that groups with higher socioeconomic status and education benefit more from the development of the digital economy. While promoting the overall growth of farmers' income, the digital economy has widened the income gap, posing new challenges to public policies committed to promoting inclusive development [5].
Cai et al. [7] analyzed the opportunities, obstacles, and strategies of rural economic development in the field of the digital economy. They pointed out that in this field, rural economic development should make full use of the significant effects of policies and the effectiveness of e-commerce in helping farmers, seize important opportunities such as digital technology empowering production and technological innovation releasing vitality, and vigorously promote the integrated development of the agricultural and digital economies. However, the governance system of the agriculture-related digital economy urgently needs to be improved, its infrastructure is relatively weak, and multiple obstacles such as the insufficient release of digital science and technology vitality severely restrict the sustainable development of the agricultural economy. To this end, priority should be given to strategies such as dynamically improving laws, regulations, and supporting policies, increasing the supply of digital information infrastructure resources, and innovating effective models for releasing the vitality of the digital economy, so as to steadily bring the superposition, amplification, and multiplication functions of the digital economy to bear on rural economic development [6]. Li et al. [6] pointed out a reasonable path for the digital economy to play a role in promoting the revitalization of rural industries, including infrastructure construction and institutional guarantees. The task force of the Information Center of the Ministry of Agriculture and Rural Affairs pointed out that the integration of the digital economy and agricultural development currently faces certain challenges, such as lagging research and development of core key technologies, an incomplete information and data resource sharing mechanism, and the inability to use modern information technology to solve practical problems in the agricultural industry.

2.3 Research Review

Current research mainly focuses on the connotation and theoretical basis of the digital economy itself, while its integration and development with specific industries are still being explored. In particular, in the construction of the FTP, research on the integration of the digital economy and HECTA needs to be further deepened.
How HECTA can truly achieve leapfrog development with the help of the digital economy, and what practical guidance this offers, remains to be further explored.

3 The Necessity of Relying on the Digital Economy to Promote


the Development of HECTA in Hainan

3.1 Hainan Has a Relatively Low Starting Point for the Development of HECTA

Hainan is my country’s tropical treasure island, with unique tropical agricultural resource
endowments. The land area is 35,400 km2 , accounting for 42.5% of the country’s tropical
land area. The annual average temperature is 23–26 °C, the annual sunshine hours are
1832–2558 h, and the average rainfall is 1600 mm. Rich in biological species, annual
growth of crops, surrounded by the sea to form a natural animal disease barrier, and
unique conditions for the development of HECTA. Hainan’s tropical characteristic and
efficient agriculture not only includes tropical crops such as natural rubber, tropical fruits
such as bananas and mangos, dragon fruit, lychees, winter melons such as peppers and
beans, sweet corn, and tropical flowers such as chrysanthemums, green radish pears and
so on. However, from a national and even global perspective, Hainan Province has a late
start and poor foundation for the development of HECTA.

3.2 The Construction of an FTP Provides New Opportunities for Hainan


to Develop HECTA

In 2018, Xi Jinping [2] specifically pointed out in his April 13 speech: "We must strengthen the construction of the National Southern Propagation Research and Breeding Base (Hainan), build a national tropical agricultural science center, and support Hainan in building a transfer base for the introduction of global animal and plant germplasm resources". Under these circumstances, Hainan Province has a unique advantage: it has gradually formed an HECTA pattern covering southern propagation breeding, winter melons and vegetables, tropical fruits, tropical flowers, animal husbandry, marine fisheries, leisure agriculture, and tropical cash crops, with products sold nationwide and even globally. The "Overall Plan for the Construction of the FTP" proposes that the construction of the free trade port must conform to Hainan's positioning: "Giving full play to Hainan's rich natural resources, unique geographical location, and backing of the ultra-large-scale domestic market and hinterland economy, we will seize the important opportunities of a new round of global technological revolution and industrial transformation, focus on developing tourism, modern service industries, and high-tech industries, and accelerate the cultivation of new advantages in cooperation and competition with Hainan characteristics."

3.3 The Digital Economy Has Inserted “Digital Wings” for Hainan’s
Development of HECTA

Developing the digital economy is a strategic deployment of the central government. The instructions of General Secretary Xi Jinping [2] and the relevant central government documents propose building a smart Hainan and vigorously developing the digital economy. In particular, the "Smart Hainan Overall Plan (2020–2025)" proposes building Hainan into an open international information and communication pilot area, a model area for sophisticated and intelligent social governance, an intelligent experience island for international tourism consumption, and an open innovation highland for the digital economy. A free trade port represents the highest level of openness in the world today, which gives Hainan many advantages that other regions cannot match in developing the digital economy. Using the advantages of the FTP to effectively promote supply-chain tracking management, information transparency, and the transfer of trust and value through digital knowledge and identification has become a research hotspot and a major topic requiring great attention and in-depth study.
The necessity analysis for the development of Hainan's HECTA industry relying on the digital economy is shown in Fig. 1.

Fig. 1. Necessity analysis chart: a low starting point for HECTA development, new opportunities from FTP construction, and the digital economy boosting HECTA.
Chen X. M. [5] believes that the digital economy will become an important starting point for the FTP to achieve high-quality development. Hainan must seize the good opportunities of the global digital economy to carry out top-level design, for example by integrating the digital economy, digital society, and digital government to build the digital foundation of the FTP, scientifically understanding platforms, and effectively standardizing and guiding platform development; it should use modular thinking to actively promote the digitization of traditional industries and use digital technology to empower, transform,

and upgrade traditional industries, and improve industrial efficiency. Promoting the digitization, networking, and intelligence of the entire agricultural chain and forming "digital wings" for the industry can improve the efficiency of agricultural production and ensure the quality of agricultural products. The "Overall Plan for the Construction of the FTP" also puts forward: "Data flows in a safe and orderly manner. Under the premise of ensuring safe and controllable data flow, expand the opening of the data field, innovate the design of security systems, realize the full convergence of data, and cultivate the development of a digital economy."
Many studies have shown that the global supply chain structure is adjusting at an accelerating pace, with localization, regionalization, and decentralization of global supply chains and industrial chains evolving rapidly. China has proposed a new development pattern and pays more attention to the coordination, integration, and reconstruction of the supply chain, while the international community attaches great importance to its diversified layout. The FTP is deepening the structural reform of the agricultural supply side, integrating and utilizing the advantages of tropical agricultural resources, vigorously adjusting the agricultural structure, and working hard on new supply to create the "trump card" of HECTA. Therefore, creating new potential for the digital economy, "great changes in the digitally driven industrial chain," and "new economic growth points in the digital age" have become hot topics in the FTP. Hainan's HECTA should focus on intelligent agricultural production, networked operation, digital management, and online services; strengthen the foundation of agricultural and rural digitalization; integrate data resources; enhance data acquisition capabilities; increase data development and application; deepen integration and sharing; and strengthen management services, so as to form a long-term mechanism for the HECTA industry layout relying on the digital economy, blaze a new path for agricultural and rural development in the new era, enlarge and strengthen the agricultural industry, and support the rural revitalization strategy.

4 An Effective Mechanism to Promote the Development of Hainan’s


HECTA with the Digital Economy
4.1 Establish and Complete a Cooperation Mechanism to Achieve Full Coverage
of Digital Infrastructure Construction
Digital infrastructure construction covers 5G networks, data centers, artificial intelligence, the industrial Internet, and other fields. Digital infrastructure, new infrastructure, and traditional infrastructure are mutually reinforcing and inseparable. To build the layout of Hainan's HECTA industry on the digital economy, it is necessary to continuously increase the coverage of Internet and information technology projects in Hainan's rural areas and push information services down to the rural grassroots. The development of digital agriculture can also drive the development of banking, insurance, futures, big data, the Internet of Things, and other industries. Therefore, it is still necessary to break the technical barriers that keep information technology from serving rural finance and to bring large Internet companies into the development of rural digital finance. On the basis of a trinity cooperation mechanism of "government, enterprise, and bank," the construction of rural information network platforms should be improved; corresponding payment, credit investigation, and network supervision systems should be built; and information platforms should be maintained and updated.

4.2 Establish and Improve the Data Sharing Mechanism to Promote the Development of Hainan's HECTA Relying on the Digital Economy
At present, in the construction of the FTP, there are problems such as insufficient sharing and openness of agricultural big data, information islands, data barriers, data fragmentation, and information asymmetry; that is, information does not flow smoothly. To prevent information islands, data barriers, data fragmentation, and information asymmetry from arising alongside agricultural digitization, and to open up a broader development space and a fairer trading platform, we must focus on moving from point to line to surface, forming an integrated development pattern that further promotes the integration of the digital economy with the agricultural and rural economy. Several major subjects and elements are involved here and must be coordinated to achieve the integration effect: in addition to rural areas, agricultural departments and enterprises, scientific research units, and industry associations must also play their due roles. Only in this way can big data be promoted in agricultural production, operation, management, service, and other links, ultimately forming results that can be replicated and extended.
In addition to grasping these major thematic elements, certain principles and norms should be observed when collecting and compiling data. According to the "Agricultural and Rural Big Data Pilot Program", through the construction of an upgraded agricultural and rural big data center, the integration of software and hardware resources, and structural reconstruction, a provincial-level agricultural and rural big data sharing platform that is vertically linked and comprehensive in coverage should be formed, and the agricultural information sharing mechanism should be improved; joint construction should be promoted through sharing, a catalog system of agricultural information resources and related standards should be formulated, the construction of specialized agricultural databases should be deepened, and information sharing and joint construction should be promoted. At the same time, improving information transparency, regulating hidden rules, introducing incentives and safeguards, and promoting information sharing and resource exchange among regions and departments through policy-based support, so as to achieve mutual benefit and win-win results, are also effective means.

4.3 Establish and Improve the Training Mechanism of Agricultural Digital


Pioneers
The development of the digital economy is inseparable from pioneers, and specifically from the exploration and guidance of digital agriculture pioneers. The forerunners here include outstanding young farmers, pioneering agricultural enterprises, and government leaders in agriculture. Digital agriculture pioneers can drive a significant increase in agricultural productivity, lead the talent team in expanding the digital agricultural economy, and become an important bridge for urban-rural exchanges. But cultivating such talents is a systematic project requiring steady progress over the long term. First, talent training must be based on education. Modern information technology must be fully used to carry out regular special training based on agricultural vocational education and distance education, especially training in new media operation skills. The training content should include content creation, new media operations, brand marketing, and so on. Only in this way can digital agriculture pioneers live up to their name, independently publicize the beauty of their hometowns over the long term, and drive the development of the tropical agricultural industry; and only in this way can the talents cultivated be both experts in high-efficiency and characteristic tropical agriculture and a new type of agricultural production and operation subject proficient in the Internet and the financial economy. Second, talent training should also focus on practical application: we must train a group of practical digital-application talents who can use data to analyze the market and achieve greater precision in production, packaging, management, and sales. Third, efforts should be made to build a "Digital Agriculture Academy," to turn relevant talent training into a model, and to form a unique process and method for training digital talents for high-efficiency and characteristic tropical agriculture.

4.4 Establish and Improve the Guarantee Mechanism for the Legality of Data and Information Collection and the Security of Information Flow
As the digital economy relies on big data, information collection is the most basic link, and the legality of information collection and the security of information flow are particularly important. Risk prevention and control must therefore be strengthened to ensure data security. On the basis of existing laws and regulations, the legal systems related to the digital economy and digital finance should be supplemented, revised, and improved; they must be forward-looking, systematic, and operable. A dynamic tracking mechanism should be established, a standardized measurement index system built, the legality of big data information collection ensured, and strict norms and standards formulated for data sources, the attribution of responsibilities and rights, the protection of intellectual property rights, and profit distribution.

Figure 2 shows an effective mechanism for building the HECTA industrial layout driven by the digital economy.

Fig. 2. An effective mechanism for the layout of the HECTA industry driven by the digital economy: cooperation mechanism, sharing mechanism, training mechanism, and security mechanism.

5 Conclusion

This article analyzes the necessity for Hainan's HECTA industries to rely on the digital economy and, to promote the development of the digital economy under the construction of the FTP, constructs an effective mechanism for the layout of Hainan's HECTA industries driven by the digital economy. The construction of a cooperation mechanism for full coverage of digital infrastructure, a data sharing mechanism, a training mechanism for agricultural digital pioneers, and a security mechanism for information flow lays a theoretical foundation for developing the FTP's digital economy in HECTA.
In the construction of the FTP, research on the impact of the development of the digital economy on the HECTA industry is still in its infancy, and related applications are still being improved. The mechanism we have constructed still has shortcomings, and further research is needed in the future.

Acknowledgments. This work was supported by the Hainan University 2020–2021 Construction
of Basic-level Party Organizations Special Project Research (Hddj07, Research on the Construc-
tion of Basic-level Party Organizations in Colleges and Universities of FTP), Hainan University
2021 Education and Teaching Reform Research Project (HDJY2159, Hainan University applied
undergraduate top-notch innovative talent training model and practical research driven by the free
trade port construction) and Philosophy and Social Science Planning Project of Hainan Province
(HNSK(YB)21–41, Research on the Centennial Course and Experience of the Political Ecological
Construction of the Communist Party of China). This work was supported by the Humanities and
Social Sciences Research Innovation Team of Hainan University, Hainan Free Trade Port Cross-
border E-commerce Service Innovation Research Team (HDSKTD202025). This work was also
supported by Hainan Provincial Natural Science Foundation of China (720RC567, Research on

Information Compensation Mechanism of Free Trade Zone to Global Supply Chain Innovation
Coordination under BlockChain Technology, and 720RC569, Tourism Value Chain Distribution
and Ecological Optimization Mechanism of Hainan International Tourism Consumption Center
Based on System Dynamics).

Authors’ Contributions. Fan Jiang was responsible for proposing the overall idea and frame-
work of the manuscript. Shaoqing Tian was responsible for the writing of the first draft of the
manuscript and translation. Chongli Huang was responsible for the revision of the manuscript and
proofreading. Fan Jiang, Shaoqing Tian, and Chongli Huang contributed equally to this paper and
are regarded as the first authors.

References
1. Tang, D.D., Liu, X.L., Ni, H.F., Yang, Y.W., Huang, Q.H., Zhang, X.J.: Global economic changes, China's potential growth rate and high-quality development in the post-epidemic period. Econ. Res. 55(08), 4–23 (2020)
2. Pei, C.H., Liu, H.G.: An economic analysis of Xi Jinping's new era of opening to the outside world. Econ. Res. 53(02), 4–19 (2018)
3. Li, C.F., Li, D.D., Zhou, C.: The mechanism of the digital economy driving the transformation and upgrading of the manufacturing industry: an analysis based on the perspective of the industrial chain. Bus. Res. (02), 73–82 (2020)
4. Wei, J., Liu, J.L., Liu, Y.: Digital economics: connotation, theoretical foundation, and important research issues. Sci. Technol. Progr. Countermeas. (7), 1–7 (2021)
5. Chen, X.M.: Research on the integrated development of digital economy and agricultural and rural economy. Econ. Manag. Abst. (12), 5–6 (2021)
6. Li, Y., Ke, J.S.: Three-level digital divide: income growth and income distribution effects of rural digital economy. Agric. Technol. Econ. (08), 119–132 (2021)
7. Cai, Z.Z., Su, X.D.: Opportunities, obstacles, and strategies of rural economy in the field of digital economy. Agric. Econ. (07), 35–37 (2021)
8. Fu, R.Y.: The digital economy will become an important starting point for the high-quality development of Hainan Free Trade Port. Hainan Daily, 04–22 (2021)
The Impact of Digital Economy on China’s
Low-Carbon Development

Fengge Yao and Lin Li(B)

School of Finance, Harbin University of Commerce, Harbin 150000, Heilongjiang, People's Republic of China
102244@hrbcu.edu.cn, credia1120@163.com

Abstract. As a new resource type, the digital economy permeates all economies
and societies, becoming an essential engine for future development. Based on
Chinese provincial panel data from 2015–2018, the article investigates the impact
of the digital economy on the development of China’s low-carbon economy using
a mediating effects model. The conclusions of the article are as follows. First, the
digital economy has an essential role in promoting the development of China’s
low-carbon economy. Second, the digital economy can encourage the development
of China’s low-carbon economy through technological innovation. Third, there is
heterogeneity in the impact of the digital economy on the development of China’s
low-carbon economy, and its impact is in the order of strong to weak: east, west,
and central. The article provides empirical evidence at the provincial level to
develop the digital economy in China’s low-carbon economy. It offers policy
recommendations for the digital economy to contribute more effectively to China’s
low-carbon economy.

Keywords: Digital economy · Low carbon economy · Regional development

1 Introduction
Responding to global warming by developing a low-carbon economy (LC) is increasingly becoming a consensus [1]. Global governance has entered the stage of full implementation of the Paris Agreement, and China has entered a critical period of the low-carbon transition.
As the future of the world economy, the digital economy (DE) is playing an increas-
ingly important role in global economic growth [2], and its higher penetration could
bring dividends to China at different levels and broaden the innovation field [3].
Currently, the process of DE development has both opportunities and challenges [4].
The 14th Five-Year Plan and the 2035 Vision Outline propose to embrace the digital era
and drive overall changes in production, life, and governance with digital transformation.
As shown in Fig. 1, the size of China’s DE in terms of value-added has expanded from
2.6 trillion yuan in 2005 to 35.9 trillion yuan in 2020, and the DE is growing rapidly
and impacting our economic life.

Fig. 1. China's DE development status: overall size of the digital economy (hundred million RMB) and the share of the DE in GDP (%), 2005–2020.
Driven by the concept of sustainable development, the relationship between the DE
and the environment is becoming increasingly close [5]. Li Y. et al. [6] showed that


carbon emissions could be effectively curbed by introducing the DE in energy use. Li X.


et al. [7] showed an inverted U-shaped non-linear relationship between CO2 emissions
and the DE, which did not significantly curb carbon emissions at the beginning of the
implementation of the DE.
Faced with the LC concept, scholars have focused mainly on whether the DE can effectively curb carbon emissions, and the question remains inconclusive. However, to implement the concept of LC, we need to change the current production model. Therefore, this paper explores the impact of the DE on China's LC in the context of the current new model, new development, and new ideas.

2 Research Hypothesis

According to Porter’s hypothesis [8], environmental regulation policies may have long-
run or short-run effects on the competitiveness of firms. In the short run, environmental
regulation can increase firms’ costs and reduce their competitiveness; in the long run,
environmental regulation can force firms to choose cheaper and more environmentally
friendly technologies, improving their competitiveness. Unlike the traditional economy,
the DE brings efficiency and polarization in the area of technology integration. The
high penetration rate of the DE can effectively break the limits of time and space, blur
the boundaries between industries, direct capital flows, and improve resource utilization
efficiency. In such an era where information and data are gradually becoming the core
capital of enterprises, companies will actively take advantage of the DE to achieve
technological innovation and actively cater to the concept of LC.

Based on this, two research hypotheses are proposed.

H1: The DE can effectively promote the LC in China.
H2: The DE can promote the development of the LC in China through technological innovation.

3 Methodology and Data


3.1 Model Building
3.1.1 Low Carbon Economic Development Measurement
Drawing on Rong H.'s research [9], the data envelopment analysis method is used to measure the low-carbon total factor productivity of 30 provinces from 2015–2018 as the measure of low-carbon economic development. Tone [10] proposed the SBM model, whose non-angle and non-radial characteristics address the inefficiency caused by slack. Suppose there are n decision units, each containing M input factors x, n1 desired outputs a, and n2 non-desired outputs b. The model is defined as shown below.

$$\rho = \min \frac{1 - \frac{1}{M}\sum_{i=1}^{M}\frac{s_i^-}{x_{ik}}}{1 + \frac{1}{n_1 + n_2}\left(\sum_{\gamma=1}^{n_1}\frac{s_\gamma^a}{a_{\gamma k}} + \sum_{l=1}^{n_2}\frac{s_l^b}{b_{lk}}\right)} \tag{1}$$

$$x_{ik} = \sum_{j=1,\, j \neq k}^{n} x_{ij}\lambda_j + s_i^-, \quad i = 1, \ldots, M \tag{2}$$

$$a_{\gamma k} = \sum_{j=1,\, j \neq k}^{n} a_{\gamma j}\lambda_j - s_\gamma^a, \quad \gamma = 1, \ldots, n_1 \tag{3}$$

$$b_{lk} = \sum_{j=1,\, j \neq k}^{n} b_{lj}\lambda_j + s_l^b, \quad l = 1, \ldots, n_2 \tag{4}$$

$$\lambda_j \geq 0, \quad s_i^- \geq 0, \quad s_\gamma^a \geq 0, \quad s_l^b \geq 0 \tag{5}$$

where $\rho$ is the low-carbon economic efficiency value to be calculated; $s_i^-$, $s_\gamma^a$, $s_l^b$ are the slack variables for inputs, desired outputs, and non-desired outputs; and $\lambda$ is the decision unit weight. The production unit is efficient when $\rho = 1$; when $\rho < 1$, there is efficiency loss in the production unit.
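To make the computation concrete, the following is a minimal sketch (ours, not the authors' code) of solving the SBM program above for one decision unit with Python and scipy, after the standard Charnes-Cooper transformation that turns the fractional program in Eq. (1) into a linear program. It assumes strictly positive data, includes unit k itself in the reference set, and gives the undesirable-output slack the plus sign of the standard Tone-style formulation; all names are ours.

```python
# Minimal SBM-with-undesirable-outputs sketch (standard Tone-style formulation).
import numpy as np
from scipy.optimize import linprog


def sbm_efficiency(X, A, B, k):
    """Efficiency rho of decision unit k (columns index the n units).

    X: (M, n) inputs; A: (n1, n) desired outputs; B: (n2, n) undesired outputs.
    Charnes-Cooper variables: z = [t, Lambda (n), S- (M), S^a (n1), S^b (n2)].
    """
    M, n = X.shape
    n1, n2 = A.shape[0], B.shape[0]
    nv = 1 + n + M + n1 + n2

    # objective: minimise t - (1/M) * sum_i S_i^- / x_ik
    c = np.zeros(nv)
    c[0] = 1.0
    c[1 + n:1 + n + M] = -1.0 / (M * X[:, k])

    A_eq, b_eq = [], []
    # normalisation: t + (1/(n1+n2)) * (sum S^a/a_k + sum S^b/b_k) = 1
    row = np.zeros(nv)
    row[0] = 1.0
    row[1 + n + M:1 + n + M + n1] = 1.0 / ((n1 + n2) * A[:, k])
    row[1 + n + M + n1:] = 1.0 / ((n1 + n2) * B[:, k])
    A_eq.append(row); b_eq.append(1.0)
    # inputs: sum_j x_ij * Lambda_j + S_i^- - t * x_ik = 0
    for i in range(M):
        row = np.zeros(nv)
        row[0], row[1:1 + n], row[1 + n + i] = -X[i, k], X[i, :], 1.0
        A_eq.append(row); b_eq.append(0.0)
    # desired outputs: sum_j a_gj * Lambda_j - S_g^a - t * a_gk = 0
    for g in range(n1):
        row = np.zeros(nv)
        row[0], row[1:1 + n], row[1 + n + M + g] = -A[g, k], A[g, :], -1.0
        A_eq.append(row); b_eq.append(0.0)
    # undesired outputs: sum_j b_lj * Lambda_j + S_l^b - t * b_lk = 0
    for l in range(n2):
        row = np.zeros(nv)
        row[0], row[1:1 + n], row[1 + n + M + n1 + l] = -B[l, k], B[l, :], 1.0
        A_eq.append(row); b_eq.append(0.0)

    res = linprog(c, A_eq=np.vstack(A_eq), b_eq=b_eq,
                  bounds=[(0, None)] * nv, method="highs")
    return res.fun  # rho in (0, 1]; rho = 1 means the unit is efficient
```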
The results calculated using the SBM model are static and do not directly reflect
the productivity changes. According to the idea of Oh [11], this paper adopts the SBM
model as the basis and uses the GML index to measure the value of low carbon economic
efficiency in China, which can be defined as follows.
$$GML^{t,t+1}\left(x^{t+1}, y^{t+1}, b^{t+1};\; x^{t}, y^{t}, b^{t}\right) = \frac{1 + D_G^T\left(x^{t}, y^{t}, b^{t}\right)}{1 + D_G^T\left(x^{t+1}, y^{t+1}, b^{t+1}\right)} \tag{6}$$
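A tiny illustrative helper (ours, not the paper's code) for Eq. (6) and for the cumulative multiplicative treatment mentioned in Sect. 3.2.1, treating the global directional distance values $D_G^T$ as already computed:

```python
import numpy as np

def gml(d_t: float, d_t1: float) -> float:
    """Eq. (6): GML over [t, t+1] from the global distance values at t and t+1."""
    return (1.0 + d_t) / (1.0 + d_t1)

def cumulative_level(gml_values):
    """Chain adjacent-period GML indices into productivity levels (base period = 1)."""
    return np.cumprod(np.asarray(gml_values))
```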

3.1.2 Baseline Regression Model

$$LC_{it} = \alpha_0 + \alpha_1 DE_{it} + \alpha_2 Control_{it} + \mu_i + \lambda_t + \varepsilon_{it} \tag{7}$$


In the baseline regression model, the explained variable $LC_{it}$ is the level of low-carbon development in China, $DE_{it}$ stands for the Digital Economy Composite Index, $Control_{it}$ represents a series of control variables, and $\mu_i$ and $\lambda_t$ are individual and time fixed effects, respectively. $\varepsilon_{it}$ is the random error term; standard errors are clustered at the provincial and municipal levels.
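For concreteness, here is a minimal sketch of estimating Eq. (7) as a two-way fixed effects panel regression, assuming Python's linearmodels package (the paper does not state its estimation software) and a hypothetical input file; the column names follow Tables 2 and 3.

```python
import pandas as pd
from linearmodels.panel import PanelOLS

# hypothetical input: one row per province-year with the variables of Tables 2-3
df = pd.read_csv("province_panel.csv").set_index(["province", "year"])

# Eq. (7): LC_it = a0 + a1*DE_it + a2*Control_it + mu_i + lambda_t + eps_it
mod = PanelOLS.from_formula(
    "LC ~ DE + GDP + Urban + Income + Trade + EntityEffects + TimeEffects",
    data=df,
)
res = mod.fit(cov_type="clustered", cluster_entity=True)  # province-clustered SEs
print(res.summary)
```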

3.1.3 Intermediary Model


Based on the benchmark model and the proposed hypothesis, the mediating effect of technological innovation is tested with the following mediating effect model.

1) Benchmark model:

$$LC_{it} = \alpha_0 + \alpha_1 DE_{it} + \alpha_2 Control_{it} + \mu_i + \lambda_t + \varepsilon_{it} \tag{8}$$

2) Technological innovation:

$$Tec_{it} = \beta_0 + \beta_1 DE_{it} + \beta_2 Control_{it} + \mu_i + \lambda_t + \varepsilon_{it} \tag{9}$$

$$LC_{it} = \gamma_0 + \gamma_1 DE_{it} + \gamma_2 Tec_{it} + \gamma_3 Control_{it} + \mu_i + \lambda_t + \varepsilon_{it} \tag{10}$$
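As an illustration only (again assuming the linearmodels setup sketched above; the paper does not publish its code), the three-step test in Eqs. (8)-(10) amounts to running the three regressions and combining their coefficients. The mediated share $\beta_1\gamma_2/\alpha_1$ is the kind of quantity reported as 23.70% in Sect. 4.2.

```python
import pandas as pd
from linearmodels.panel import PanelOLS

df = pd.read_csv("province_panel.csv").set_index(["province", "year"])  # hypothetical file

controls = "GDP + Urban + Income + Trade + EntityEffects + TimeEffects"
total  = PanelOLS.from_formula(f"LC ~ DE + {controls}",       data=df).fit()  # Eq. (8): alpha_1
first  = PanelOLS.from_formula(f"Tec ~ DE + {controls}",      data=df).fit()  # Eq. (9): beta_1
second = PanelOLS.from_formula(f"LC ~ DE + Tec + {controls}", data=df).fit()  # Eq. (10): gamma_2

indirect = first.params["DE"] * second.params["Tec"]  # beta_1 * gamma_2
share = indirect / total.params["DE"]                 # mediated share of the total effect
print(f"mediated share of the DE effect: {share:.2%}")
```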

3.2 Data Selection

3.2.1 Explained Variable


In this paper, the low-carbon productivity of 30 Chinese provinces from 2015–2018 is
used as the assessment indicator of China’s LC, and a cumulative multiplicative treatment
is done. The asset input is the perpetual inventory asset calculation of Huang G. et al.
[12], the labor input is the total number of people employed at the end of the year
(10,000), and the energy input is the total energy consumption (million tons of standard
coal). The expected output is gross national product (GDP). GDP and asset inputs are
deflated using 2014 as the base period, respectively. The carbon emission data processed
using the IPCC sectoral approach [13] was selected as the unanticipated output, and this
data was published from the official website of CEADs.

3.2.2 Explanatory Variables


The DE development index is the explanatory variable. Combined with domestic research, China's current DE is measured in three aspects: digital foundation, digital industry, and digital environment. The contents of the index are shown in Table 1.

Table 1. DE measurement indicators and weights

First-level indicator (weight) and second-level indicators (weights):
- Digital foundation (43.32): hardware facilities (19.08); network resources (24.24)
- Digital industry (39.86): e-commerce (9.46); market scale (10.35); emerging industries (20.05)
- Digital environment (16.87): innovation capability (6.32); intellectual support (4.24); digital applications (6.26)

Data source: China Statistical Yearbook

3.2.3 Control Variables


Considering the different levels of carbon emission under different economic levels, eco-
nomic development (gdp), urbanization scale (urban), per capita income (income), and
foreign trade dependence (trade) are selected as control variables. The data information
is shown in Table 2.

Table 2. Control variables

Symbol Connotation Unit


GDP Logarithm of GDP 100 million yuan
Urban Proportion of urban population 100%
Income Logarithm of per capita income of residents yuan
Trade Logarithm of foreign investment 100 million yuan
Data source: China Statistical Yearbook

4 Results and Discussion

4.1 Baseline Regression Test Results

The results of the baseline regression are shown in Table 3, with column (3) giving the final specification. The baseline regressions show that the DE makes a significant contribution to the development of the LC in China once province and time fixed effects are controlled for. This confirms hypothesis H1: the DE has a significant positive effect on low-carbon development in China.

Table 3. Baseline regression

(1) (2) (3) (4)


Variables LC LC LC LC
DE 4.1284*** 4.5560*** 3.6879*** 4.5788***
(0.7311) (0.8203) (1.1354) (1.2003)
GDP 0.3561*** 0.3552*** 2.2006**
(0.0314) (0.0316) (0.8805)
Urban 0.0040 0.0051 −0.1002***
(0.0038) (0.0041) (0.0232)
Income 0.1427* −0.0745 −0.3430**
(0.0809) (0.2182) (0.1453)
Trade 0.0080*** 0.0088*** 0.0107*
(0.0017) (0.0018) (0.0057)
Constant −4.0743*** −1.9636 −10.4746 1.1048***
(0.8253) (2.1266) (7.5843) (0.0452)
Year No No Yes Yes
Province No Yes Yes Yes
Observations 120 120 120 120
R-squared 0.8444 0.8474 0.6325 0.5101
Standard errors in parentheses
*** p < 0.01, ** p < 0.05, * p < 0.1

4.2 Technology Innovation Conduction Path

In the transmission path of technological innovation, the number of effective invention patents (10,000) held by industrial enterprises above designated size is selected as a measure of the product innovation level of China's production units. The results of testing the intermediary path of technological innovation are shown in Table 4. They confirm that hypothesis H2 is valid: the transmission path by which the DE promotes the development of the LC in China through technological innovation holds, and the intermediary effect reaches 23.70%.

Table 4. Intermediary path test

Variables LC Tec LC
DE 5.0171*** 27.5557*** 3.8280***
(0.8546) (10.62332) (0.7459)
Tec 0.0432***
(0.0064)
GDP 0.3581*** 1.4740*** 0.2945***
(0.0312) (0.3881) (0.0281)
Urban 0.0034 −0.1673*** 0.0106***
(0.0039) (0.0482) (0.0035)
Income −0.0434 −2.8867* 0.0812
(0.1371) (1.7041) (0.1177)
Trade 0.0085*** 0.1695*** 0.0012
(0.0017) (0.0215) (0.0018)
Constant −126.4274* −2596.556*** −14.37508
(74.71983) (928.8478) (65.51425)
Year Yes Yes Yes
Province Yes Yes Yes
Observations 120 120 44
R-squared 0.8499 0.6154 0.4492
Standard errors in parentheses
*** p < 0.01, ** p < 0.05, * p < 0.1

4.3 Heterogeneity Analysis of East, Central and West Regions

As China is a developing country, its level of economic development shows regional imbalance. To analyze the effect of the DE on the LC further, the country is therefore divided into eastern, central, and western regions according to the national regional division standard. The test results are shown in Table 5.

Table 5. Heterogeneity analysis

East Central Western


Variables LC LC LC
DE 3.9986*** −17.7811 2.3201
(1.4203) (16.6313) (4.5739)
GDP 5.1093*** 0.4085 0.0812
(1.7159) (3.2295) (0.8267)
Urban −0.1114*** −0.0387 0.0418
(0.0360) (0.1072) (0.0408)
Income −0.1878 0.2921 0.1221
(0.2394) (0.9658) (0.2242)
Trade 0.0274*** −0.0030 0.0168
(0.0094) (0.0390) (0.0107)
Constant −40.4776** −3.4533 −3.3505
(16.7121) (23.1992) (6.3950)
Year Yes Yes Yes
Province Yes Yes Yes
Observations 52 24 44
R-squared 0.7578 0.6941 0.4492
Standard errors in parentheses
*** p < 0.01, ** p < 0.05, * p < 0.1

In response to this phenomenon, the following explanation is attempted. The central region is at a relatively contradictory growth stage: geographically close to the coastal provinces, it can to a certain extent enjoy the resources, manpower, and technology that flow from their location, but its competitiveness is relatively weak compared with the eastern coastal areas. At the same time, the central region has a complex industrial structure with concentrated industrial production, and it cannot, in the short term, find a balance between economic benefits and digital equipment inputs that would change its industrial development model to achieve the LC. Therefore, the DE did not promote the LC in the central region at the initial stage of its implementation. The western region, compared with the central region, is even less able to take over the conveniences spilling over from the eastern region. Under the strategy of "exchanging cages for birds," it has taken over more traditional high-carbon industries transferred from home and abroad. At the same time, because the western region has a single industrial structure and lies deep inland, the DE has, with the support of national preferential policies, a positive effect on its development in the short term, but this effect is not significant owing to infrastructure limitations. In contrast, the eastern region has a reasonable industrial structure and complete infrastructure, so the DE can be implemented effectively. This leads to regional differences in the effect of the DE on the LC in China.

5 Conclusions
Based on the sample data of 30 provinces in China (excluding Tibet) from 2015–2018, this paper uses the natural logarithm of the number of new product R&D projects as a measure of product innovation to examine whether the DE can effectively promote the development of China's LC through the level of product innovation. The empirical results lead to the following conclusions. First, the development of the DE has a significant role in promoting the LC in China. Second, the DE can promote the LC in China through the intermediary path of technological innovation. Third, there is heterogeneity in the impact of the DE on the LC in China, and its impact, ordered from strong to weak, is: east, west, and central.
Based on the above findings, this paper further puts forward targeted policy
recommendations.
First, importance should be attached to the intermediary role of technological innovation in developing the DE for the LC. Modern digital information technology should be used to channel capital flows to enterprises using low-carbon technologies, further encourage enterprises to actively pursue low-carbon technological innovation, improve the efficiency of traditional industries, establish "Internet+" models in various industries, promote the integration of DE technologies with the real economy, and create a new digital industrial ecology, thereby promoting the LC.
Second, infrastructure development should be strengthened, especially DE infrastructure in the central and western regions. The eastern region should give full play to its unique regional advantages, actively promote the LC process, and then drive the central region with those advantages. The central region should actively introduce advanced technologies and new industries of the DE and guide traditional industries to change their production models so as to promote the rationalization and low-carbon development of industrial structures. At the same time, the eastern and central regions should transfer accumulated talents, experience, and technologies to the western region, establishing a solid national foundation for the DE and realizing the LC nationwide.

References
1. Muntasir, M., Zahoor, A., Shabbir, A., Haider, M., Abdul, R., Vishal, D.: Reinvigorating
the role of clean energy transition for achieving a low-carbon economy: evidence from
Bangladesh. Environ. Sci. Pollut. Res. Int. (2021)
2. Hu, L., Yao, S., Zhou, Z., Miao, N.: The importance of the construction of digital economy
major under the background of new liberal arts. World Sci. Res. J. 7(8), 104–111 (2021)
3. Sorescu, A., Schreier, M.: Innovation in the digital economy: a broader view of its scope,
antecedents, and consequences. J. Acad. Mark. Sci. 49(4), 627–631 (2021)
4. Cheng, C., Huang, H.: Big data and industrial innovation progress in Jiangxi province: incremental effect highlights enabling digital economy cultivation. J. Phys. Conf. Ser. 1852(2), 022005 (2021)
5. Li, Z., Li, N., Wen, H.: Digital economy and environmental quality: evidence from 217 cities
in China. Sustainability. 13(14), 8058 (2021)
6. Li, Y., Yang, X., Ran, Q., Wu, H., Muhammad, I., Munir, A.: Energy structure, digital economy,
and carbon emissions: evidence from China. Environ. Sci. Pollut. Res. Int. 28, 1–24 (2021)

7. Li, X., Liu, J., Ni, P.: The Impact of the digital economy on CO2 emissions: a theoretical and
empirical analysis. Sustainability 13(13), 7267 (2021)
8. Porter, M.E., van der Linde, C.: Toward a new conception of the environment-competitiveness
relationship. J. Econ. Perspect. 9(4), 97–118 (1995)
9. Rong, H.: The research on fiscal and financial policy to coordinate support development of
low carbon economy - a case study of Heilongjiang Province of China. Int. J. u- and e-Serv. Sci.
Technol. 9(12), 39–52 (2016)
10. Tone, K.: A slacks-based measure of efficiency in data envelopment analysis. Eur. J. Oper.
Res. 130(3), 498–509 (2001)
11. Oh, D.H.: A global malmquist-luenberger productivity index. J. Prod. Anal. 34(3), 183–197
(2010)
12. Huang, G., Pan, W., Hu, C., Pan, W., Dai, W.: Energy utilization efficiency of China
considering carbon emissions—based on provincial panel data. Sustainability. 13(2), 877
(2021)
13. Shan, Y., et al.: New provincial CO2 emission inventories in China based on apparent energy
consumption data and updated emission factors. Appl. Energy 184, 742–750 (2016)
The Impact of the Digital Economy on Inclusive Growth: An Empirical Study Based on the Spatial Econometric Model

Fengge Yao and Xiaoyu Wang(B)

Harbin University of Commerce, Harbin 150028, China

Abstract. This paper drew on the “China Digital Economy Index” released by
Caixin Digital Alliance Research Centre, and measured the inclusive growth index
of 30 provinces (cities and districts) in China based on the entropy weighting
method, then used Stata 16.0 to establish the spatial econometric model to explore
the impact of the digital economy on China’s inclusive growth. The results showed
that the digital economy has a positive contribution to inclusive growth in China.
There is a significant positive spatial spillover effect, indicating that the develop-
ment of the digital economy contributes to improving the level of inclusive growth
between regions in China. Therefore, in-depth exploration of the impact of the
digital economy on inclusive growth has important theoretical value and practical
significance for consolidating China’s achievements in poverty eradication and
realizing inclusive growth.

Keywords: Digital economy · Inclusive growth · Spatial econometric model

1 Introduction
In December 2019, the Central Economic Work Conference proposed that "efforts should be made to promote high-quality development and vigorously develop the digital economy"; this important assertion clearly defined the basic direction for transforming the economic development model. At present, China's economy is shifting from high-speed growth to inclusive growth, an inevitable choice for economic development in the new era. The digital economy can improve economic efficiency by innovating the economic development model and weakening the urban-rural dual structure, which helps ensure equitable access to the fruits of development for the economy and society and thus contributes to sustainable economic development, itself a basic requirement of inclusive growth.

2 Literature Review
The "G20 Digital Economy Development and Cooperation Initiative" was signed at the G20 Hangzhou Summit held in September 2016, which comprehensively elaborated

the concept and significance of the digital economy, i.e., the digital economy refers to a series of economic activities that use digital knowledge and information as key factors of production, modern information networks as important carriers, and the effective use of Information and Communication Technology (ICT) as an important driving force for efficiency improvement and economic structure optimization. Research on the digital
economy has also been discussed and explored conceptually and theoretically by scholars
at home and abroad. Negroponte (1996) proposed that the digital economy is an eco-
nomic system that uses ICT extensively to influence people’s lives through digitization,
informatization, and networking [1]. Wang et al. (2019) argued that the digital econ-
omy is a new economic form based on information and knowledge with new-generation
information technology as the means, information industry as the mainstay, and informa-
tion products and services as the main content [2]. Zhang and Liu (2019) summarised the characteristics of the digital economy: its production factor is data, its development model is industrial integration, and its development is characterized by green sustainability [3].
The ADB first introduced the concept of inclusive growth in its 2007 report "Shared Growth for Social Harmony", which argued that "inclusive growth" is growth with equal opportunities and emphasized that economic growth should focus not only on aggregate growth but also on reducing regional economic development gaps and enhancing equality of opportunity in the process of economic development. Although the academic community has not yet formed a unified concept of inclusive growth, a review of the existing literature shows that its meaning mainly includes three aspects: economic growth, equal opportunities, and income distribution. Rauniyar and Kanbur (2010) considered inclusive growth to be growth with reduced inequality and emphasized the relative equity of income distribution [4]. In measuring inclusive growth, Zhang, Wan et al. (2019) included both income growth and income distribution in constructing the inclusive growth system [5]. Wu and Zhou (2018) measured China's inclusive growth index from the perspective of green development, covering economic growth, income distribution, fairness of opportunity, and environmental optimization [6]. Yu and Wang (2012) measured China's inclusive growth in four aspects: sustainability of economic growth, fairness of development opportunities, income disparity, and basic social security [7].
Regarding the impact of the digital economy on economic growth, scholars in China have reached a fairly consistent conclusion: the digital economy has a positive impact on economic growth. Li and Yang (2021) proposed that the development of the digital economy effectively promotes the growth of total factor productivity (TFP) and that this effect has a positive spatial spillover, contributing to raising the TFP level of neighboring cities [8]. Xu (2004) constructed a dynamic production function to explore the impact of the digital economy on economic growth and showed that the digital economy is positively correlated with the GDP growth index [9]. Han (2018) established a dynamic panel model and concluded that the software and information technology service industry can promote China's economic growth [10]. Xia et al. (2018) concluded that the digital economy makes a significant contribution to China's economic growth by constructing a non-competitive employment input-occupancy output model [11]. Zhang and Shi (2019) argued that the digital economy affects the national economy mainly through substitution, penetration, innovation, and industrial association effects [12]. Tian and Chen (2021) suggested that the digital economy can effectively break the constraints on economic development caused by the limited supply of production factors, improve TFP, and unleash potential and momentum in promoting industrial transformation and optimizing the economic structure [13].
Throughout the existing literature, scholars at home and abroad have mainly focused on the digital economy's impact on economic growth. What, then, is the relationship between the digital economy and inclusive growth? Few scholars have conducted empirical analyses of this issue. In this context, this paper explores the impact of the digital economy on China's inclusive growth based on a spatial econometric model, which will help improve China's digital economy-related policies and further raise China's level of inclusive growth.

3 The Construction of the Inclusive Growth System


3.1 Inclusive Growth Evaluation System
Inclusive growth refers to thinking about economic growth from an integrated and com-
prehensive perspective. Drawing on the research of Tang and Long et al. (2020), this
paper took sustainable economic development, equitable access to opportunities, and
sharing of development outcomes as primary indicators, and selected 11 secondary indi-
cators and 18 tertiary indicators to construct the inclusive growth evaluation system [14].
As shown in Table 1:

Table 1. The inclusive growth evaluation system

Sustainable economic development:
- Level of development: GDP per capita = GDP/total population (+)
- Technological progress: R&D investment intensity = R&D expenditure/GDP (+)
- Fiscal revenue: share of fiscal tax revenue = fiscal tax revenue/GDP (+)
- Resource utilization: energy consumption per unit of output = energy consumption/GDP (−)
- Environmental pollution: exhaust emission per unit of GDP = (SO2 emission + NOx emission)/GDP (−); industrial wastewater emission per unit of GDP = wastewater emission/GDP (−)

Equitable access to opportunities:
- Access to education: intensity of investment in education = expenditure on education/total population (+); average years of schooling = (number of illiterate people × 1 + number with primary education × 6 + number with lower secondary education × 9 + number with upper secondary and secondary education × 12 + number with tertiary education and above × 16)/total population aged 6 and above (+)
- Employment opportunities: proportion of people employed in secondary and tertiary sectors = number employed in secondary and tertiary sectors/total employment (+); unemployment rate = number registered as unemployed in urban areas/total population (−)
- Health care opportunities: number of health technicians per ten thousand people = number of health technicians/total population in ten thousands (+); number of medical and health institutions per ten thousand people = number of medical and health institutions/total population in ten thousands (+)

Sharing of development outcomes:
- Income: per capita income (+); disposable income ratio between urban and rural residents = disposable income per urban resident/disposable income per rural resident (−)
- Consumption: per capita consumption (+); level of urban-rural consumption coordination = per capita consumption level of urban residents/per capita consumption level of rural residents (−)
- Social security: share of basic pension fund expenditures = basic pension fund expenditures/GDP (+); share of basic health insurance fund expenditure = basic health insurance fund expenditure/GDP (+)

(Effect direction: + positive indicator, − negative indicator)

3.2 Measuring the Level of Inclusive Growth


The entropy weighting method can reduce the influence of subjective factors in
weighting indicators compared with traditional subjective weighting methods.
Therefore, this paper used the entropy weighting method to measure and evaluate the
importance of each indicator; the specific steps are as follows:
In the first step, the panel data consisting of M provinces (cities and districts), T
years, and N indicators are made dimensionless using the range (extreme difference)
method.
If $X_{itj}$ is a positive indicator:

$$Y_{itj} = \frac{X_{itj} - \min(X_{itj})}{\max(X_{itj}) - \min(X_{itj})} \tag{1}$$

If $X_{itj}$ is a negative indicator:

$$Y_{itj} = \frac{\max(X_{itj}) - X_{itj}}{\max(X_{itj}) - \min(X_{itj})} \tag{2}$$
In Eqs. (1) and (2), i represents the province (city or district), t represents the
year, j represents the indicator, $X_{itj}$ represents the initial indicator value, and
$Y_{itj}$ represents the indicator value after dimensionless processing.

In the second step, the entropy value $E_j$ of the indicator is calculated:

$$E_j = -\frac{\sum_{t=1}^{T}\sum_{i=1}^{M} f_{itj}\,\ln f_{itj}}{\ln(TM)} \tag{3}$$

$$f_{itj} = \frac{Y_{itj}}{\sum_{t=1}^{T}\sum_{i=1}^{M} Y_{itj}} \tag{4}$$

In the third step, the coefficient of variation $G_j$ of the indicator is calculated:

$$G_j = 1 - E_j \tag{5}$$

In the fourth step, the weight $W_j$ of the indicator is calculated:

$$W_j = \frac{G_j}{\sum_{j=1}^{N} G_j} \tag{6}$$
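As a concrete illustration of these four steps, the sketch below implements the entropy weighting procedure of Eqs. (1)–(6) in Python with NumPy. It is a minimal reading of the method, not the authors' code: the panel is assumed to be stacked into a (T·M, N) array, and the function name and the boolean indicator mask are hypothetical.

```python
import numpy as np

def entropy_weights(X, positive):
    """Entropy weighting over pooled panel data.

    X        : array of shape (T*M, N); T years x M provinces stacked, N indicators
    positive : boolean array of length N; True marks a positive indicator
    Returns the indicator weights and the composite index of each observation.
    """
    X = np.asarray(X, dtype=float)
    rng = X.max(axis=0) - X.min(axis=0)
    # Range standardization, Eqs. (1)-(2)
    Y = np.where(positive, (X - X.min(axis=0)) / rng, (X.max(axis=0) - X) / rng)
    # Share of each observation in an indicator, Eq. (4); epsilon guards log(0)
    f = Y / Y.sum(axis=0)
    E = -(f * np.log(f + 1e-12)).sum(axis=0) / np.log(X.shape[0])  # Eq. (3)
    G = 1.0 - E                      # coefficient of variation, Eq. (5)
    w = G / G.sum()                  # weights, Eq. (6)
    return w, Y @ w                  # composite index = weighted sum of Y
```

A call such as `entropy_weights(panel, mask)` returns both the indicator weights and the composite inclusive growth index for each province-year observation.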

The results of the calculations are shown in Table 2:

Table 2. China's inclusive growth index from 2012 to 2018

Province | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | Seven-year average
Beijing | 0.6552 | 0.7122 | 0.7021 | 0.7279 | 0.7426 | 0.7656 | 0.8119 | 0.7311
Tianjin | 0.3928 | 0.4482 | 0.4694 | 0.4815 | 0.4936 | 0.5059 | 0.5342 | 0.4751
Hebei | 0.2080 | 0.2230 | 0.2469 | 0.2730 | 0.3092 | 0.3333 | 0.3709 | 0.2806
Shanxi | 0.2280 | 0.2506 | 0.2620 | 0.2748 | 0.2824 | 0.3248 | 0.3562 | 0.2827
Inner Mongolia | 0.2149 | 0.2411 | 0.2635 | 0.2832 | 0.3008 | 0.3283 | 0.3401 | 0.2817
Liaoning | 0.2794 | 0.3040 | 0.2990 | 0.2919 | 0.3231 | 0.3438 | 0.3674 | 0.3155
Jilin | 0.2129 | 0.2365 | 0.2500 | 0.2691 | 0.2817 | 0.2928 | 0.3211 | 0.2663
Heilongjiang | 0.1982 | 0.2126 | 0.2224 | 0.2325 | 0.2481 | 0.2751 | 0.2912 | 0.2400
Shanghai | 0.4790 | 0.5147 | 0.5240 | 0.5569 | 0.5907 | 0.6273 | 0.6719 | 0.5664
Jiangsu | 0.3415 | 0.3755 | 0.4012 | 0.4277 | 0.4462 | 0.4724 | 0.5000 | 0.4235
Zhejiang | 0.3408 | 0.3781 | 0.4089 | 0.4357 | 0.4559 | 0.4869 | 0.5210 | 0.4324
Anhui | 0.2138 | 0.2369 | 0.2688 | 0.2953 | 0.2981 | 0.3225 | 0.3766 | 0.2874
Fujian | 0.2304 | 0.2603 | 0.2870 | 0.3008 | 0.3114 | 0.3389 | 0.3640 | 0.2990
Jiangxi | 0.2142 | 0.2330 | 0.2546 | 0.2749 | 0.2811 | 0.3199 | 0.3490 | 0.2752
Shandong | 0.2369 | 0.2722 | 0.2971 | 0.3161 | 0.3287 | 0.3531 | 0.3685 | 0.3104
Henan | 0.1794 | 0.2027 | 0.2310 | 0.2507 | 0.2592 | 0.3082 | 0.3376 | 0.2527
Hubei | 0.1985 | 0.2281 | 0.2631 | 0.2948 | 0.3077 | 0.3393 | 0.3673 | 0.2856
Hunan | 0.1845 | 0.2043 | 0.2305 | 0.2538 | 0.2649 | 0.3068 | 0.3354 | 0.2543
Guangdong | 0.2719 | 0.3221 | 0.3421 | 0.3690 | 0.3911 | 0.4165 | 0.4451 | 0.3654
Guangxi | 0.1720 | 0.1739 | 0.2230 | 0.2461 | 0.2552 | 0.2965 | 0.3277 | 0.2421
Hainan | 0.2525 | 0.2753 | 0.3003 | 0.3277 | 0.3302 | 0.3500 | 0.3938 | 0.3186
Chongqing | 0.2486 | 0.2622 | 0.2946 | 0.3247 | 0.3355 | 0.3601 | 0.3933 | 0.3170
Sichuan | 0.2193 | 0.2456 | 0.2729 | 0.2984 | 0.3037 | 0.3335 | 0.3716 | 0.2921
Guizhou | 0.1723 | 0.1865 | 0.2200 | 0.2415 | 0.2492 | 0.2720 | 0.3160 | 0.2368
Yunnan | 0.1447 | 0.1682 | 0.1838 | 0.2059 | 0.2180 | 0.2729 | 0.3018 | 0.2136
Shaanxi | 0.2420 | 0.2654 | 0.2873 | 0.3088 | 0.3120 | 0.3339 | 0.3766 | 0.3037
Gansu | 0.1713 | 0.2057 | 0.2344 | 0.2693 | 0.2767 | 0.3248 | 0.3391 | 0.2602
Qinghai | 0.2233 | 0.2340 | 0.2730 | 0.2771 | 0.2855 | 0.3145 | 0.3577 | 0.2807
Ningxia | 0.1532 | 0.2052 | 0.2334 | 0.2560 | 0.2783 | 0.3046 | 0.3118 | 0.2489
Xinjiang | 0.2201 | 0.2371 | 0.2610 | 0.2772 | 0.2933 | 0.2940 | 0.3312 | 0.2734
National average | 0.2500 | 0.2772 | 0.3002 | 0.3214 | 0.3351 | 0.3639 | 0.3950 | 0.3204
Data source: calculated from raw data in the China Statistical Yearbook for previous years

4 Variable Selection, Data Sources, and Model Construction


4.1 Selection of Variables
The Core Explanatory Variable. Digital economy (dige) is based on the “China Digital
Economy Index” released by the Caixin Digital Alliance Research Centre. The index
system consists of four primary indicators (the digital economy industry index, digital
economy integration index, digital economy spillover index, and digital economy
infrastructure index), 14 secondary indicators, and 38 tertiary indicators. The “China
Digital Economy Index” measures the growth of the digital economy through network
big data mining and is representative to a certain extent; it is therefore taken as the
measure of the digital economy.

Explained Variable. Inclusive growth (ig). This paper selected three primary indicators,
11 secondary indicators, and 18 tertiary indicators to build China's inclusive growth
system, and then used the entropy weighting method to calculate the inclusive growth
index as the explained variable.

Control Variables. Drawing on scholars' research on the factors influencing inclusive
growth, the control variables selected in this paper are as follows: government
expenditure scale (gov), industrial structure (is), foreign direct investment (fdi, taken
in logarithm), degree of openness to the outside world (open), and price level (pl,
taken in logarithm).

A description of the relevant variables selected for this paper is shown in Table 3:

Table 3. Description of variables

Variables | Variable name | Representation symbol | Calculation method
Explanatory variable | Digital economy | dige | China Digital Economy Index
Explained variable | Inclusive growth | ig | Entropy weighting method
Control variables | Government expenditure scale | gov | Government fiscal expenditure/GDP
Control variables | Industrial structure | is | Sum of value-added of secondary and tertiary industries/GDP
Control variables | Degree of openness to the outside world | open | Import and export trade volume/GDP
Control variables | Foreign direct investment | lnfdi | Foreign direct investment in logarithm
Control variables | Price level | lnpl | Consumer price index in logarithm

4.2 Data Sources

In this paper, the panel data of 30 provinces (cities and districts) in China from 2012 to
2018 were selected as the research samples. Among them, the digital economy (dige) was
obtained from the “China Digital Economy Index” released by Caixin Digital Alliance
Research Centre, while the original data of other variables were obtained from the China
Statistical Yearbook, the statistical yearbooks of various provinces (cities and districts)
and the websites of the National Bureau of Statistics, etc. Since the data for Tibet, Hong
Kong, Macao, and Taiwan are discontinuous and seriously incomplete, these regions are
not included in the scope of the study.

Table 4. Results of descriptive statistics of data

Variables | Obs | Mean | Std | Max | Min
ig | 210 | 0.3204 | 0.1174 | 0.8119 | 0.1447
dige | 210 | 0.2770 | 0.1662 | 0.9204 | 0.0145
gov | 210 | 0.2494 | 0.1020 | 0.6269 | 0.1181
is | 210 | 0.4625 | 0.0934 | 0.8098 | 0.3094
open | 210 | 0.2671 | 0.2974 | 1.4403 | 0.0169
lnfdi | 210 | 0.3555 | 0.3553 | 1.7926 | 0.0477
lnpl | 210 | 4.6255 | 0.0062 | 4.6439 | 4.6108

To ensure that the panel data for each variable are stationary and to avoid spurious
regression, panel unit root tests were conducted before the regression analysis; the LLC
test and the Fisher-PP test were used in this paper.

Table 5. Panel unit root test results

Variables | LLC test | Fisher-PP test
ig | −14.9380*** | 9.9086***
dige | −4.7504*** | 37.0451***
gov | −3.7356*** | 3.5707***
is | −18.6752*** | 17.2874***
open | −8.0658*** | 5.9785***
lnfdi | −64.2398*** | 2.5923***
lnpl | −38.7018*** | 1.4934*
Note: ***, ** and * denote significance at the 1%, 5% and 10% levels respectively

As can be seen from Table 5, all variables are stationary and were therefore entered into
the model for regression analysis.

4.3 Model Construction


This paper used the SAR model and SDM model for empirical analysis, and the models
are constructed as follows:
$$Ig_{it} = \beta_0 + \beta_1 Dige_{it} + \beta_2 Control_{it} + \rho \sum_{j \neq i}^{n} w_{ij} Ig_{jt} + \mu_i + \varphi_t + \varepsilon_{it} \tag{7}$$

$$Ig_{it} = \beta_0 + \lambda \sum_{j \neq i}^{n} w_{ij} Ig_{jt} + \beta_1 Dige_{it} + \beta_2 Control_{it} + \eta \sum_{j \neq i}^{n} w_{ij} Dige_{jt} + \theta \sum_{j \neq i}^{n} w_{ij} Control_{jt} + \mu_i + \varphi_t + \varepsilon_{it} \tag{8}$$

In models (7) and (8), Ig represents inclusive growth, Dige represents the digital
economy, Control represents the control variables, β0 is the constant term, ρ and λ
are the spatial regression coefficients of the SAR and SDM models respectively,
$w_{ij}$ is the element of the spatial weight matrix, ε is the random error term, μ is
the regional fixed effect, φ is the time fixed effect, i and j index provinces, and t
indexes years.
In terms of spatial weight matrix settings, this paper chose the economic weight
matrix, where economic distance was measured using the difference between the GDP
of each province in 2020.
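The construction of such a weight matrix is easy to sketch. The snippet below is an illustrative Python reading, not the authors' implementation: the paper states only that economic distance is measured by GDP differences, so the reciprocal-of-absolute-difference form and the row normalization are assumptions.

```python
import numpy as np

def economic_weight_matrix(gdp):
    """Row-normalized economic-distance weight matrix.

    gdp : array of length n, one 2020 GDP value per province.
    w_ij is assumed to be 1/|gdp_i - gdp_j| for i != j (an assumed
    functional form), after which each row is scaled to sum to one.
    """
    gdp = np.asarray(gdp, dtype=float)
    diff = np.abs(gdp[:, None] - gdp[None, :])
    with np.errstate(divide="ignore"):
        W = 1.0 / diff
    np.fill_diagonal(W, 0.0)  # a province is not its own neighbour
    return W / W.sum(axis=1, keepdims=True)
```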

5 Analysis of the Empirical Results

5.1 Spatial Correlation Test

Drawing on Cui and Wang et al. (2018), this paper chose the Moran's I index test
to analyze the global spatial autocorrelation of inclusive growth in China [15]. The
calculation method is as follows:

$$I = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{S^2 \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \tag{9}$$

Here I is the overall Moran's I, x is the inclusive growth index, $S^2$ is the variance
of the inclusive growth index, and $w_{ij}$ is the corresponding element of the spatial
weight matrix. Moran's I takes values in the interval [−1, 1], and the larger its absolute
value, the stronger the spatial correlation. The overall Moran's I of China's inclusive
growth index from 2012 to 2018 is shown in Table 6; the values lie in the interval
0.289–0.329 and all are significantly positive at the 1% level. This indicates that the
level of inclusive growth in China shows a significant positive spatial correlation,
i.e., regions with a high level of inclusive growth positively influence the surrounding
regions, and the level of inclusive growth has a significant positive spillover effect in
space.
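Equation (9) reduces to a few lines of linear algebra. A minimal Python sketch follows (the function name is hypothetical; the paper's own computations were run in Stata):

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I of Eq. (9).

    x : length-n vector, the inclusive growth index of each province
    W : (n, n) spatial weight matrix with a zero diagonal
    """
    x = np.asarray(x, dtype=float)
    z = x - x.mean()                 # deviations from the mean
    s2 = z @ z / x.size              # variance S^2
    return (z @ W @ z) / (s2 * W.sum())
```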

Table 6. Overall Moran's I of inclusive growth in China from 2012 to 2018

Year | Moran's I | Sd(I) | Z | p-value
2012 | 0.307*** | 0.077 | 4.426 | 0.000
2013 | 0.324*** | 0.078 | 4.619 | 0.000
2014 | 0.329*** | 0.079 | 4.629 | 0.000
2015 | 0.311*** | 0.079 | 4.391 | 0.000
2016 | 0.315*** | 0.080 | 4.395 | 0.000
2017 | 0.295*** | 0.079 | 4.168 | 0.000
2018 | 0.289*** | 0.079 | 4.104 | 0.000
Note: ***, ** and * denote significance at the 1%, 5% and 10% levels respectively

To examine the spatial correlation of inclusive growth among provinces in China,
Moran's I scatter plots of inclusive growth in 2012 and 2018 were further plotted; the
scatter distribution is shown in Fig. 1. The points for most regions in China are
concentrated in the first and third quadrants, indicating that inclusive growth exhibits
high-high and low-low aggregation characteristics and a strong positive spatial
correlation.

Fig. 1. Moran’s I scatterplot for inclusive growth (2012 & 2018)

5.2 Analysis of Spatial Measurement Results


Having clarified the existence of spatial correlation in inclusive growth, an appropriate
spatial econometric model must be selected. This paper drew on Elhorst (2014) to
conduct the Hausman test on the model, and based on the test results the fixed-effects
model was adopted [16]. The specification was then tested: the LR test was used to
determine whether the SDM model degenerates into the SAR model or the SEM model,
as shown in Table 7. In the LR test, the SDM model rejects the null hypothesis at the
1% significance level, indicating that it does not degenerate into the SAR or SEM
model; therefore the SAR and SDM models were chosen to investigate the relationship
between the digital economy and inclusive growth.

Table 7. LR test results

Model | Test statistic | p-value
SAR | 66.05 | 0.0000
SEM | 158.93 | 0.0000

The regression results are shown in Table 8. In terms of the spatial regression
coefficients, the coefficients of the SAR and SDM models are positive and both are
significant at the 1% level: the coefficient ρ of the spatial autoregressive model is
0.696, and the coefficient λ of the spatial Durbin model is 0.348, indicating a
significant positive spatial dependence of inclusive growth in China. In terms of the
core explanatory variable, the coefficients of the digital economy in the two models
are 0.159 and 0.0948 respectively, both significantly positive at the 1% level,
indicating that the digital economy still has a positive effect on inclusive growth
after spatial spillover effects are taken into account. The coefficient of the spatial
lag term of the digital economy (W*dige) is 0.217, significantly positive at the 1%
level, i.e., there is a significant spatial spillover effect of the digital economy on
inclusive growth in other provinces: an increase in the regional digital economy can
improve the level of inclusive growth in the region and also generate positive
externalities for neighboring regions. In terms of the control variables, the findings
of the two models are basically the same: the coefficient of government expenditure
scale is positive and passes the significance test, indicating that it still promotes
inclusive growth after spatial spillover effects are considered, while the coefficient
of industrial structure is significantly negative. Trade openness, foreign direct
investment, and price level fail the significance test, indicating that their effects on
inclusive growth are not significant.

Table 8. Economic weighting matrix regression results

Variables | SAR | SDM
dige | 0.159*** (0.0310) | 0.0948*** (0.0355)
gov | 0.247*** (0.0959) | 0.295*** (0.108)
is | −0.111* (0.0605) | −0.163** (0.0666)
open | 0.0288 (0.0186) | 0.0197 (0.0172)
lnfdi | 0.00394 (0.0105) | −0.00655 (0.0091)
lnpl | −0.0982 (0.125) | −0.122 (0.203)
R² | 0.284 | 0.312
W*dige |  | 0.217*** (0.0536)
ρ/λ | 0.696*** (0.0707) | 0.348*** (0.124)
N | 210 | 210
Log-L | 670.0208 | 697.1205
Note: *** denotes p < 0.01, ** denotes p < 0.05, * denotes p < 0.1; t-statistics in brackets

5.3 Spatial Effect Decomposition

To further analyze the direct effects and spatial spillover effects of each explanatory
variable, the spatial effects were decomposed based on SAR and SDM models, and the
results of the spatial effects decomposition are shown in Table 9:

Table 9. Results of spatial effect decomposition

Variables | SAR direct effect | SAR spillover effect | SAR total effect | SDM direct effect | SDM spillover effect | SDM total effect
dige | 0.177*** (0.0302) | 0.346*** (0.0798) | 0.523*** (0.0827) | 0.109*** (0.0337) | 0.367*** (0.0513) | 0.476*** (0.0425)
gov | 0.275** (0.111) | 0.539* (0.324) | 0.814* (0.425) | 0.267*** (0.101) | −0.630*** (0.189) | −0.363** (0.179)
is | −0.124* (0.0694) | −0.242 (0.183) | −0.366 (0.249) | −0.153** (0.0642) | 0.0860 (0.137) | −0.0671 (0.151)
open | 0.0321 (0.0203) | 0.0629 (0.0383) | 0.0950* (0.0571) | 0.0195 (0.0158) | 0.00583 (0.0451) | 0.0253 (0.0445)
lnfdi | −0.00438 (0.0116) | 0.00859 (0.0220) | 0.0130 (0.0335) | −0.0050 (0.0087) | 0.0387 (0.0299) | 0.0337 (0.0305)
lnpl | −0.109 (0.139) | −0.214 (0.279) | −0.323 (0.415) | −0.134 (0.190) | −0.425 (0.439) | −0.5584 (0.397)
R² | 0.284 | 0.284 | 0.284 | 0.312 | 0.312 | 0.312
N | 210 | 210 | 210 | 210 | 210 | 210
Log-L | 670.0208 | 670.0208 | 670.0208 | 697.1205 | 697.1205 | 697.1205
Note: *** denotes p < 0.01, ** denotes p < 0.05, * denotes p < 0.1; t-statistics in brackets

As shown in Table 9, the direct effect, spillover effect, and total effect of the digital
economy in both the SAR and SDM models pass the significance test, indicating that
the digital economy not only contributes to the level of inclusive growth in the local
region but also, through a spatial spillover effect, positively affects the inclusive growth
of surrounding regions. In addition, the direct effect coefficient is smaller than the
spillover effect coefficient, indicating that the positive impact of digital economy
development on the local region's inclusive growth is smaller than its positive impact
on the surrounding regions' inclusive growth. As for the other control variables, gov
passes the significance test, indicating that it has both a direct effect on inclusive
growth in the local region and a spatial spillover effect on inclusive growth in the
surrounding regions; is has only a direct effect on inclusive growth in the local region
and no significant spatial spillover effect on the surrounding regions; and fdi, open,
and pl fail the significance test.
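For readers who want to reproduce such a decomposition, the standard LeSage–Pace formulas can be sketched in a few lines of Python. This is a generic illustration under assumed inputs (a row-normalized W and estimated coefficients), not the authors' Stata routine:

```python
import numpy as np

def spatial_effects(W, rho, beta_k, theta_k=0.0):
    """Direct/spillover/total effects of one explanatory variable.

    For the SDM y = rho*W*y + X*beta + W*X*theta + e, the partial-derivative
    matrix is S = (I - rho*W)^(-1) (beta_k*I + theta_k*W); theta_k = 0 gives
    the SAR case. Direct = mean diagonal of S, total = mean row sum.
    """
    n = W.shape[0]
    S = np.linalg.solve(np.eye(n) - rho * W, beta_k * np.eye(n) + theta_k * W)
    direct = np.trace(S) / n
    total = S.sum() / n
    return direct, total - direct, total  # direct, spillover, total
```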

5.4 Robustness Test


To test the robustness of the spatial econometric model, this paper replaced the
economic weight matrix with the geographic distance weight matrix.

Table 10. Test results of the geographic distance matrix

Variables | SAR | SDM
dige | 0.195*** (0.0304) | 0.110*** (0.0350)
gov | 0.269*** (0.103) | 0.342*** (0.0970)
is | −0.0769 (0.0648) | −0.0986 (0.0701)
open | 0.0388** (0.0174) | 0.0256* (0.0142)
lnfdi | −0.00126 (0.0108) | 0.000373 (0.00951)
lnpl | −0.0951 (0.127) | −0.160 (0.247)
ρ/λ | 0.603*** (0.0666) | 0.320*** (0.0927)
R² | 0.325 | 0.475
W*dige |  | 0.182*** (0.0458)
N | 210 | 210
Log-L | 664.0966 | 694.0684
Note: *** denotes p < 0.01, ** denotes p < 0.05, * denotes p < 0.1; t-statistics in brackets

Table 10 reports the regression results after changing the spatial weight matrix.
Under the geographic distance matrix, the coefficients of the SAR and SDM models
are both positive and pass the significance test, with values similar to those shown in
Table 8. In addition, the coefficient values of the other control variables are also
generally consistent with the results in Table 8, again verifying the existence of
significant positive spatial dependence of inclusive growth in China and indicating
that the main findings of this paper are robust.

6 Conclusions and Recommendations


6.1 Conclusions
Firstly, this paper adopted the “China Digital Economy Index” released by the Caixin
Digital Alliance Research Centre and measured the inclusive growth index of 30
provinces (cities and districts) in China based on the entropy weighting method.
Secondly, Stata 16.0 was used to estimate the spatial econometric models and further
analyze the impact of the digital economy on inclusive growth. The results showed
that the digital economy makes a positive contribution to inclusive growth in China
and that there is a significant positive spatial spillover effect, indicating that the
development of the digital economy helps improve the level of inclusive growth
across regions in China. Based on these conclusions, the following recommendations
are put forward:

6.2 Recommendations
Improve Laws and Regulations, Perfect the Digital Security Supervision System.
China’s digital economy started relatively late, and the corresponding laws,
regulations, and regulatory systems are not yet complete. Therefore, improving laws
and regulations and perfecting the digital security regulatory system is the primary
task for achieving inclusive economic growth in China. Firstly, regulators should
develop a reasonable and effective digital security system and improve the security system of the digital
economy market in order to ensure the steady and orderly operation of the digital econ-
omy market. Secondly, regulators should clarify the scope and regulatory requirements
for the digital economy and establish an information disclosure system related to the
digital economy.
Improve Infrastructure Construction, Strengthen the Support of the Talent Team.
The development of the digital economy and the realization of inclusive economic growth
are inseparable from the construction of the digital infrastructure and the construction of
talent teams in the digital field. Firstly, strengthen the construction of digital infrastruc-
ture in economically lagging regions and expand the scope of services provided by
the digital economy across industries. Secondly, implement policies for introducing
professionals: governments at all levels should take the initiative to absorb and
introduce professionals in the field of the digital economy and increase subsidies for
newly introduced talent.
Innovate Digital Technology, Accelerate the Digitization of Industry Development.
Technological innovation is the key to developing the digital economy. Firstly, China
should increase investment in research and development and in talent cultivation in
the fields of big data, cloud computing, the Internet, and artificial intelligence.
Secondly, large enterprises should be encouraged to establish their own digital
technology research centers so that digital technology can be deeply integrated with
various industries, accelerating the digital transformation and upgrading of industrial
development. Finally, a long-term mechanism should be established for universities,
scientific research institutes, and industrial enterprises to jointly innovate in digital
technology, promote the integrated development of industry, universities, and research
institutes, transform research results into applications in a timely manner, and further
promote China's inclusive economic growth.

References
1. Negroponte, N.: Being digital. Yonsei Bus. Rev. 33(1) (1996)

2. Wang, W., Wang, J.: Research on the development trend and promotion policies of China’s
digital economy. Econ. Aspect (01), 69–75 (2019)
3. Zhang, H., Liu, Z., Wang, S.: Analysis of China’s high-quality economic development path
under the background of digital economy. Bus. Econ. Res. (23), 183–186 (2019)
4. Rauniyar, G., Kanbur, R.: Inclusive growth and inclusive development: a review and synthesis
of Asian development bank literature. J. Asia Pac. Econ. 15(4), 455–469 (2010)
5. Zhang, X., Wan, G., Zhang, J., He, Z.: Digital economy, inclusive finance, and inclusive
growth. Econ. Res. 54(08), 71–86 (2019)
6. Wu, W., Zhou, X.: Measurement, evaluation, and influencing factors of China’s inclusive
green growth. Soc. Sci. Res. 01, 27–37 (2018)
7. Yu, M., Wang, X.: Inclusive growth of China's economy: measurement and evaluation. Econ.
Rev. 03, 30–38 (2012)
8. Li, Z., Yang, Q.: How does the digital economy affect China's high-quality economic
development? Disc. Mod. Econ. (07), 10–19 (2021)
9. Xu, S., Mao, X.: Analysis of the contribution of the information industry to economic growth.
Manag. World (08), 75–80 (2004)
10. Han, B., Li, S.: Software and Information technology service industry and china’s economic
growth. Res. Quant. Econ. Technol. Econ. 35(11), 128–141 (2018)
11. Xia, Y., Wang, H., Zhang, F., Guo, J.: Research on the impact of the digital economy on
China’s economic growth and non-agricultural employment —based on input occupancy
output model. J. Chin. Acad. Sci. 33(07), 707–716 (2018)
12. Zhang, H., Shi, L.: Digital economy: new power in the new era. J. Beijing Jiaotong Univ.
(Soc. Sci. Ed.) 18(02), 10–22 (2019)
13. Tian, G., Chen, F.: Xi Jinping’s historical logic, core meaning, and its value in the development
of the digital economy. Theory Guide (01), 4–9 (2021)
14. Tang, Y., Long, Y., Zheng, Z.: Research on the inclusive growth effect of digital inclusive
finance: an empirical analysis based on 12 provinces in Western China. Southwest Finan. (09),
60–73 (2020)
15. Cui, Q., Wang, W., Zhang, S.: Financial deepening, industrial structure upgrading and
technological innovation: an empirical analysis based on the spatial Durbin model. Ind. Technol.
Econ. 37(02), 42–50 (2018)
16. Elhorst, J.P.: Matlab software for spatial panels. Int. Reg. Sci. Rev. 37(3), 389–405 (2014)
Evolutionary Game Analysis of Digital
Innovation Ecosystem Governance

Liu Kewen(B) and Liu Junji

Harbin University of Commerce, Harbin 150028, China

Abstract. Under the background of the rapid development of the digital econ-
omy, managing the digital innovation ecosystem is an important topic. Based on
the evolutionary game method, this paper studies the evolutionary game strategy
selection of the participants in the governance of the digital innovation ecosys-
tem. It is found that: ➀ The participants in the governance of the digital innovation
ecosystem comprise three parties: the government, the platform, and other participants,
whose roles in the evolutionary game differ. The government provides macro policy
guidance, the platform's strengthening of its own governance is the key, and other
participants play an auxiliary role. By construct-
ing the “government-platform-participant” tripartite evolutionary game model,
this paper simulates and analyzes the evolutionary strategy of the three players
in the governance to achieve the balance and stability of the governance effect
and puts forward some suggestions to promote the governance of the digital inno-
vation ecosystem. ➁ Governance principle of the digital innovation ecosystem:
establish an effective incentive and punishment mechanism, follow the principle
of “promoting innovation, subject fairness and welfare maximization,” and coor-
dinate the relationship between relevant stakeholders and their benefit distribution
through legislation and other policy levels, to maximize the welfare of the whole
digital economy.

Keywords: Digital innovation ecosystem · Governance · Evolutionary game

1 Introduction

In recent years, the digital economy has become an important part of the national econ-
omy of all countries. It is facing new challenges in the process of digital innovation. The
governance problem has become a research hotspot in economics, management, soci-
ology, and so on. Through the research on the evolutionary game of the participants in
the governance of the digital innovation ecosystem, this paper discusses the goal of pro-
moting the healthy development of the digital innovation ecosystem with a sustainable
competitive advantage and realizing the balance of interests of all participants. Digital
innovation refers to the combination of information, computing, communication, and
connectivity technologies used in the innovation process, including product innovation,
production process improvement, organizational model change, and business model
innovation [1–3]. The innovation ecosystem is a complex system composed of platform


enterprises, consumers, markets and their natural, social and economic environment.
Core enterprises in the innovation ecosystems of Apple, IBM, Walmart, and others
enable other member organizations to create value by establishing service, tool,
technology, and value network platforms, while strengthening the innovation and
productivity of the ecosystem as a whole [4].
bers, the participants of the innovation ecosystem follow the unified platform standards
and rules and create value that a single enterprise cannot create through open innova-
tion [5]. The digital innovation ecosystem involves many participants. Relying solely
on ecosystem platforms or government governance is not enough to regulate transaction
behavior. It is necessary to design a governance mechanism with the participation of
all stakeholders [6]. The existing research results of innovation ecosystem governance
are rich and diverse. Some scholars study ecosystem governance from the perspective
of innovation process governance, analyzing innovation elements, network relations,
embeddedness, and collaborative innovation. Some analyze the internal relationship
of the network and the governance of value activities from the perspective of network
nodes [7, 8]. Some analyze the impact of knowledge and resource transfer governance
on innovation performance from the perspective of the output of the innovation ecosys-
tem [9, 10]. But there are three research gaps: lack of attention to the game between
governance subjects, less discussion on government supervision and incentive role, and
lack of analysis of other relevant participants.
This article analyzes the strategy choices in the evolutionary game among the
governance subjects of the digital innovation ecosystem, explores the optimal strategy
and effective governance mechanisms, and examines the roles of the government, the
platform, and other participants in governance.

2 Analysis of the Governance Subjects of the Digital Innovation Ecosystem


2.1 Characteristics of Digital Innovation Ecosystem
According to Adner's research framework [11], the innovation elements, participants,
and relationships among digital innovation ecosystem subjects have the following
characteristics. First, the innovation elements are digitized: digital resources, digital
technology, and digital infrastructure are the key elements of the innovation ecosystem.
The second is the virtualization of participants. Based on digital technology, the con-
tact breadth and flexibility among participants have been significantly improved. Many
highly heterogeneous participants have become the subjects of value creation, and the
innovation subjects rely on the platform to realize borderless virtual links [12]. Third, the
relationship between subjects is ecological. The multi-level digital architecture allows
participants to carry out activities at different architectural levels, and the relationship
between participants is more dynamic and open. Innovative cooperation has changed
from a linear supply chain model to a multi-layer network ecosystem model [13, 14].

2.2 Composition of Participants in Digital Innovation Ecosystem Governance


In the governance of the digital innovation ecosystem, the participants include five
aspects: first, the government plays a supporting and guiding role in the governance

process, mainly through institutional supply and administrative services. Second,
Internet platform enterprises. Third, platform service providers. Fourth, consumers.
Fifth, social institutions, including universities, third-party institutions, and other
non-governmental organizations, whose role in promoting governance cannot be
ignored; they usually play a bridging role in legal services, news disclosure,
educational assistance, and responsibility supervision, and serve a positive function
where government failure and market failure occur. For convenience of analysis, the
last three parts are collectively referred to as the participants. This paper
divides the participants of digital innovation ecosystem governance into three parts,
including government regulators, platforms, and participants, as shown in Fig. 1.

Fig. 1. Composition of governance subjects of the digital innovation ecosystem: government, platform, and participants (service providers, social institutions, consumers, etc.).

2.3 The Relationship Between the Participants in the Governance of Digital


Innovation Ecosystem

In the governance of the digital innovation ecosystem, information asymmetry exists
among the participants, and decision-makers are boundedly rational. The governance
effect is closely related to the strategic choices of the government, the platform, and
the participants.

The Needs of Participants Are Different. The government is the maker of policies
and regulations and has the responsibility of supervision. Governance is the ability of the
government to formulate and implement rules and provide services [15]. At the same
time, governance is not only an ability but also a specific and complex interactive
network across different actors [16].

As the resource integration and manager of the innovation ecosystem, the platform
has the responsibility to manage the operating order of the platform; however, its
supervision of the participants is shaped by its profit-seeking motivation to a certain
extent. Platform governance can control the opportunistic behavior of members of the innovation
ecosystem and ensure the effective operation of the innovation ecosystem, such as the
reputation mechanism of member star rating and the seller rating of Taobao. At the same
time, the platform may also use information asymmetry to take opportunistic behav-
ior to misappropriate the interests of its members. For example, in Apple’s innovation
ecosystem, technological innovation platforms such as iPhone and iPad dominate. There
are opportunistic behaviors that infringe on the interests of supporting enterprises such
as application software developers. Participants participate in governance for their own
needs.

There Is a Conflict of Interest Among the Three Participants. Most platforms are
for profit. If the platform has violations of laws and regulations, including monopoly
and other situations that disrupt the market order and cause losses to participants, the
platform will face legal sanctions; Relevant government departments are responsible for
formulating laws and regulations to protect the interests of all participants.
Participants cannot fully distinguish information due to information asymmetry and
must rely on public information such as platform transaction specifications.

The Governance Effect Depends on the Balance of the Three Parties’ Strategy
Choice. The government is responsible for formulating regulations to regulate all mem-
bers and coordinating the relationship between all participants. The platform needs to
use appropriate technical means to strengthen standardization and ensure compliant
operation; participants need to improve their self-discipline.

The platform owner is responsible for maintaining the long-term prosperity and
development of the entire ecosystem [17]. Platform leaders thus hold great power,
with many sources and manifestations; but the other side of power is responsibility,
which always accompanies it.

3 Construction of Evolutionary Game Model


3.1 Description of Model Assumptions and Variables
To build an evolutionary game model and obtain the best evolutionary game strategy,
the following assumptions are proposed:
Hypothesis 1: the government, the platform, and the other participants involved in the
game process are boundedly rational.
Hypothesis 2: in the digital innovation ecosystem, each of the three parties can adopt
one of two strategies: governance or non-governance. When the platform owner
implements governance, its income is P1 and the governance cost is C1; without
governance, the income is P0 and the cost is C0. The benefits when the government
implements governance are Pg1 and the cost is Cg1; the benefits when the government
does not implement governance are Pg0 and the cost is Cg0. When the platform
implements governance, the government rewards the platform with R1; if the platform
does not implement governance, the government imposes a penalty F0. When other
participants participate in governance, their benefits are Pu1, their costs are Cu1, and
the benefits to the platform are Pf1; when they do not participate, their benefits are
Pu0, their costs are Cu0, and the benefits to the platform are Pf0.
Hypothesis 3: in order to build a tripartite game model, 0 ~ 1 variables are used. x
represents the strength of platform governance, x = 1 represents the governance strategy
adopted by the platform, and x = 0 represents the non-governance strategy adopted by the
platform. y represents the strength of government governance, y = 1 represents the gov-
ernment’s governance strategy, and y = 0 represents the government’s non-governance
strategy. z represents the strength of other participants participating in governance, z =
1 represents the governance strategy adopted by participants, and z = 0 represents the
strategy in which participants do not participate. Where, 0 ≤ x, y, z ≤ 1.

3.2 Tripartite Evolutionary Game Model


According to the above assumptions, the return functions of the three-party evolutionary
game are:
Expected revenue when the platform adopts governance strategy:
Ex1 = z ∗ Pf1 + y ∗ R1 + (P1 − C1) (1)
Expected revenue when the platform does not adopt the governance strategy:
Ex2 = z ∗ Pf0 − y ∗ F0 + (P0 − C0) (2)
Average expected return when the platform adopts mixed governance strategy:
Ex = x ∗ Ex1 + (1 − x) ∗ Ex2 (3)
Expected return when the government adopts governance strategy:
Ey1 = (Pg1 − Cg1) − x ∗ R1 (4)
Expected return when the government does not adopt the governance strategy:
Ey2 = (Pg0 − Cg0) − x ∗ F0 (5)
The average expected return of the government’s hybrid governance strategy:
Ey = y ∗ Ey1 + (1 − y) ∗ Ey2 (6)
Expected benefits of participants participating in governance strategies:
Ez1 = Pu1 − Cu1 (7)
Expected benefits when participants do not participate in governance strategies:
Ez2 = Pu0 − Cu0 (8)
Average expected return when participants adopt mixed governance strategy:
Ez = z ∗ Ez1 + (1 − z) ∗ Ez2 (9)
Thus, the replicated dynamic equation of the platform is obtained:

F(x) = dx/dt = x∗(Ex1 − Ex) = x∗(1 − x)∗(Ex1 − Ex2) = x∗(1 − x)∗(z∗(Pf1 − Pf0) + y∗(R1 + F0) + P1 − C1 − P0 + C0) (10)

The replicated dynamic equation of the government:

F(y) = dy/dt = y∗(Ey1 − Ey) = y∗(1 − y)∗(Ey1 − Ey2) = y∗(1 − y)∗(Pg1 − Cg1 − x∗(R1 − F0) − Pg0 + Cg0) (11)

The replicated dynamic equation of the other participants:

F(z) = dz/dt = z∗(Ez1 − Ez) = z∗(1 − z)∗(Ez1 − Ez2) = z∗(1 − z)∗(Pu1 − Cu1 − Pu0 + Cu0) (12)
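These three coupled equations are straightforward to integrate numerically. The paper's simulations were run in MATLAB; the following is an equivalent minimal sketch in Python, using the baseline parameter values later specified in Sect. 3.4 (the initial strategy proportions and the time grid are assumptions made for illustration):

```python
import numpy as np
from scipy.integrate import odeint

# Baseline payoffs from Sect. 3.4
P1, C1, P0, C0 = 100, 50, 50, 10        # platform with / without governance
Pg1, Cg1, Pg0, Cg0 = 40, 20, 20, 10     # government with / without governance
R1, F0 = 20, 40                          # reward and penalty for the platform
Pu1, Cu1, Pu0, Cu0 = 30, 15, 15, 5      # participants with / without governance
Pf1, Pf0 = 20, 5                         # benefit to the platform from participants

def replicator(state, t):
    """Tripartite replicator dynamics of Eqs. (10)-(12)."""
    x, y, z = state
    dx = x * (1 - x) * (z * (Pf1 - Pf0) + y * (R1 + F0) + P1 - C1 - P0 + C0)
    dy = y * (1 - y) * (Pg1 - Cg1 - x * (R1 - F0) - Pg0 + Cg0)
    dz = z * (1 - z) * (Pu1 - Cu1 - Pu0 + Cu0)
    return [dx, dy, dz]

t = np.linspace(0.0, 1.0, 200)
path = odeint(replicator, [0.2, 0.2, 0.2], t)   # assumed initial strategy mix
```

Varying P1, Pg1, or Pu1 over the values used in Figs. 2, 3, and 5 reproduces the kind of trajectory comparison discussed in Sect. 3.4.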

3.3 Evolutionary Game Equilibrium Analysis


According to the relevant properties of evolutionary game, the derivatives of the above
Eqs. (10)–(12) are obtained respectively:

F′(x) = dF(x)/dx = (1 − 2x)∗(Ex1 − Ex2) (13)

F′(y) = dF(y)/dy = (1 − 2y)∗(Ey1 − Ey2) (14)

F′(z) = dF(z)/dz = (1 − 2z)∗(Ez1 − Ez2) (15)

Solving Eqs. (13)–(15) simultaneously yields eight equilibrium points at which all
three parties of the game reach equilibrium, namely (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0,
1), (1, 1, 0), (1, 0, 1), (0, 1, 1), and (1, 1, 1). In the digital innovation ecosystem, the
government, the platform, and the participants play the game and learn by imitation;
over time, the three parties optimize their strategies and finally reach the equilibrium
of the game. According to Hypothesis 3, any initial point and the points it evolves to
satisfy V = {(x, y, z) | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1}, so all eight equilibrium points
are meaningful.
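A corner equilibrium is evolutionarily stable when each of the derivatives (13)–(15) is negative there. Continuing the earlier Python sketch (and reusing its payoff constants), a small assumed helper can screen the eight corner points:

```python
import itertools

def payoff_gaps(x, y, z):
    """Ex1-Ex2, Ey1-Ey2 and Ez1-Ez2 evaluated at a strategy profile."""
    return (z * (Pf1 - Pf0) + y * (R1 + F0) + P1 - C1 - P0 + C0,
            Pg1 - Cg1 - x * (R1 - F0) - Pg0 + Cg0,
            Pu1 - Cu1 - Pu0 + Cu0)

# Keep the corners where every F'(.) of Eqs. (13)-(15) is negative
stable = [p for p in itertools.product((0, 1), repeat=3)
          if all((1 - 2 * v) * g < 0
                 for v, g in zip(p, payoff_gaps(*p)))]
```

Under the baseline payoffs of Sect. 3.4, this check retains (1, 1, 1), consistent with the convergence to full governance seen in the simulations.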

3.4 Model Simulation Analysis


The equilibrium analysis of the evolutionary game among the government, platform and
users in the digital innovation ecosystem shows that the strategy choice of the three
parties is closely related to their respective benefits. To analyze the governance strategy
selection behavior of the three parties under different benefits, this paper uses MATLAB
software for simulation analysis.
Initially, when the platform adopts the governance strategy, the revenue is P1 = 100
and the cost is C1 = 50; when the platform adopts the non-governance strategy, the
revenue is P0 = 50 and the cost is C0 = 10. When the government adopts the
governance strategy, the income is Pg1 = 40 and the cost is Cg1 = 20; when the
government adopts the non-governance strategy, the income is Pg0 = 20 and the cost
is Cg0 = 10. When the platform governs, the government rewards the platform with
R1 = 20; if the platform does not govern, it is punished with F0 = 40. When the user
participates in governance, the user's income is Pu1 = 30 and the cost is Cu1 = 15;
when the user does not participate, the income is Pu0 = 15 and the cost is Cu0 = 5.
When users participate in governance, the benefit to the platform is Pf1 = 20; when
they do not, the benefit to the platform is Pf0 = 5. The simulation results are as follows:

(1) If the platform implements the governance strategy, the initial value of revenue is
set to P1 = 100, 110, and 120, respectively, as shown in Fig. 2. As the initial income
of the platform increases, the three curves converge quickly, and all three parties
choose governance as the final strategy. With the increasing revenue of the platform
and the strengthening of platform governance, the government's governance strategy
remains stable within a small range, and the change among participants is small.
Figure 2 shows that, driven by interests, the platform tends to choose the governance
strategy so as to achieve collaborative governance with the government and the
participants and reach the balance of the tripartite game. At the same time, this makes
the development of the whole ecosystem healthier and more sustainable, promotes
the improvement of innovation efficiency, and forms a virtuous circle.

Fig. 2. Dynamic evolution diagram of platform revenue adopting governance strategy under
different initial conditions

(2) If the government implements the governance strategy, the initial value of income
is set to Pg1 = 40, 50, and 60, respectively, as shown in Fig. 3. It can be found
in Fig. 3 that when the initial income of the government is low, it is easier to
promote the platform to adopt governance strategies; the middle state of the three
curves is the more suitable one. Therefore, to increase the overall benefits of the
digital innovation ecosystem, the government needs to formulate policies and
regulations to guide the platform to participate in governance actively. At the same
time, the platform attracts participants to take part in platform governance through
its operation, achieving the balance of the tripartite game. In addition, increasing the
government's punishment of the platform cannot by itself promote governance, as
shown in Fig. 4.

Fig. 3. Dynamic evolution diagram of government revenue adopting governance strategy under different initial conditions

Fig. 4. Dynamic evolution diagram under different initial conditions of government punishment
on the platform

(3) Participants choose to participate in the governance strategy, and the initial income
value is set to Pu1 = 30, 45, and 60, respectively, as shown in Fig. 5. It can be
found in Fig. 5 that a lower income of participants is conducive to promoting the
platform's implementation of governance, reflecting that the platform and the
participants form a mutually beneficial, symbiotic community of interests, which is
an essential feature of the digital innovation ecosystem.

Fig. 5. Dynamic evolution diagram of governance strategies adopted by participants under different initial conditions

4 Conclusions and Recommendations


This paper constructs an evolutionary game model of the behavior strategies of the
governance subjects, composed of the government, the platform, and the participants,
and carries out a simulation. In the governance of the digital innovation ecosystem,
the subjects' different strategies lead to different evolution paths, and it is necessary
to seek a balance of interests amid contradictions and conflicts so that the digital
innovation ecosystem develops in a healthy and orderly way. The ideal state of the
governance effect is government macro supervision, efficient platform governance,
and participant self-discipline. The simulation analysis shows that when the
government adopts the governance strategy, the punishment imposed on the platform
needs to be appropriate. When the platform chooses the governance strategy, driven
by the profit-seeking motivation of competitive development, it will strengthen its
governance to enhance its core competitiveness and ensure the healthy development
of the digital innovation ecosystem. Participants directly or indirectly affect digital
innovation ecosystem governance and are an important external driving force of
standardized governance.
Firstly, to achieve digital innovation ecosystem governance, the government should
take practical and effective policies and measures to encourage and support
innovation, implementing governance by guiding the platform through supervision
mechanisms and incentive policies. Secondly, as the core of the digital economy, the
platform provides various services for participants; car sharing, for example, is a
technological innovation in travel services that operates in public transport related to
the national economy and the people's livelihood. While supporting innovation, the
government should also supervise the platform and participants so that they bear the
corresponding social responsibilities. The government and the platform should work
together to build a governance framework that protects participants' rights and
interests and social welfare through effective governance.
Therefore, establish an effective incentive and punishment mechanism, follow the
principle of “promoting innovation, subject fairness and welfare maximization”, and
coordinate the relationship between relevant stakeholders and their benefit distribution
through legislation and other policy levels to maximize the welfare of the whole digital
economy.

References
1. Yoo, Y., Henfridsson, O., Lyytinen, K.: Research commentary–the new organizing logic of
digital innovation: an agenda for information systems research. Inf. Syst. Res. 21(4), 724–735
(2010)
2. Yoo, Y., Boland, R.J., Jr., Lyytinen, K., Majchrzak, A.: Organizing for Innovation in the
Digitized World. Organ. Sci. 23(5), 1398–1408 (2012)
3. Nambisan, S., Lyytinen, K., Majchrzak, A., Song, M.: Digital innovation management:
reinventing innovation management research in a digital world. MIS Q. 41(1), 223–238 (2017)
4. Moore, J.F.: Predators and prey: a new ecology of competition. Harv. Bus. Rev. 71(3), 75–86
(1993)
5. Iansiti, M., Richards, G.L.: Information technology ecosystem: structure, health and perfor-
mance. Antitrust Bull. 51(1), 77–110 (2006)

6. Schmeiss, J., Hoelzle, K., Tech, R.P.: Designing governance mechanisms in platform ecosys-
tems: addressing the paradox of openness through blockchain technology. California Manag.
Rev. 62(1), 121–143 (2019)
7. Rong, K., Lin, Y., Shi, Y., Yu, J.: Linking business ecosystem lifecycle with platform strategy: a
triple view of technology, application and organization. Int. J. Technol. Manage. 62(1), 75–94
(2013)
8. Chen, S.H., Lin, W.T.: The dynamic role of universities in developing an emerging sector: a
case study of the biotechnology sector. Technol. Forecast. Soc. Chang. 123, 283–297 (2016)
9. Borgh, M.V.D., Cloodt, M., Romme, A.G.L.: Value creation by knowledge-based ecosystems:
evidence from a field study. R&D Manag. 42(2), 150–169 (2012)
10. Amitrano, C.C., Coppola, M., Tregua, M.: Knowledge sharing in innovation ecosystems: a
focus on functional food industry. Int. J. Innov. Technol. Manag. 14(5), 1–18 (2017)
11. Adner, R.: Ecosystem as structure: an actionable construct for strategy. J. Manag. 43(1), 39–58
(2017)
12. Sawhney, M., Verona, G., Prandelli, E.: Collaborating to create: the internet as a platform for
customer engagement in product innovation. J. Interact. Mark. 19(4), 4–17 (2005)
13. Ansari, S., Garud, R., Kumaraswamy, A.: The disruptor’s dilemma: TiVo and the US television
ecosystem. Strateg. Manag. J. 37(9), 1829–1853 (2016)
14. Hanelt, A., Bohnsack, R., Marz, D., et al.: A systematic review of the literature on digital
transformation: insights and implications for strategy and organizational change. J. Manag.
Stud. (2020). https://doi.org/10.1111/joms.12639
15. Fukuyama, F.: What is governance? Governance 26(3), 347–368 (2013)
16. Gorwa, R.: What is platform governance? Inf. Commun. Soc. (2019). https://doi.org/10.1080/1369118X.2019.1573914
17. Boudreau, K.J., Hagiu, A.: Platform Rules: Multi-Sided Platforms as Regulators. Working
Paper. Harvard University (2008)
Measurement and Comparison of Digital
Economy Development of Cities in China

Ping Han(B) and Jiao Li

Harbin University of Commerce, Harbin 150028, China

Abstract. As a new growth pole, the digital economy has become an important
force for competition among cities. Based on the existing research results, this
paper defines the connotation of the digital economy. Following the principles of
scientific, normative, effective, and conciseness, the index system of the digital
economy is constructed from four aspects: digital infrastructure, digital industri-
alization, industrial digitization, and the development environment of the digital
economy. Then measure and compare the digital economic development level of
15 sub-provincial cities from 2012–2019. The results show that the overall digi-
tal economy shows a steady upward trend, but there are great differences among
the 15 sub-provincial cities. The level of industrial digitization and development
environment of the digital economy is high, the level of digital infrastructure and
digital industrialization needs to be improved. According to the above conclusions,
suggestions are put forward to improve the development of the digital economy,
and these suggestions guide policy formulation in different cities.

Keywords: Sub-provincial cities · Digital economy · Measurement

1 Introduction
In 2018, China’s digital economy reached 31.3 trillion yuan, accounting for 34.8% of
GDP. The digital economy plays a core role in China’s economic growth. With the deep
integration of the new generation of information technology and urban construction,
smart and digital cities have become an important trend in future development, which
provides new opportunities for urban transformation. Digital development is reflected
in industry, social governance, and other fields. In 2018, the urban digital economy accounted
for more than 90% of the total digital economy. Shenzhen promotes the development
of 5G and industrial Internet based on the advantages of the electronic information
industry and software industry. Hangzhou leads the transformation of the service industry
by relying on e-commerce and other industrial digitization advantages and leading the
forefront of digital industrialization such as cloud computing. Each city closely
combines its industrial foundation and resource endowment with digital economy
development and builds a digital development path with its own characteristics.
Sub-provincial cities are the core cities of their regions; their level of digital economy
development lies between that of ordinary cities and that of cities such as Beijing and
Shanghai. These cities are likely to become a solid force in promoting regional digital
economy development in the future. However, there


are few studies on the digital economy development of sub-provincial cities. Therefore,
it is necessary to accurately grasp the level of digital economy development of sub-
provincial cities and understand their differences, advantages, and disadvantages.

2 Literature Review
At present, the digital economy is a hot topic in academic circles. There are many research
results, which can be summarized into three categories. The first category is the impact of
the digital economy on other research objects, such as the impact of the digital economy
on the competitiveness of export products [1]. The second category is the research on the
digital economy itself, including connotation definition [2], statistical classification [3],
development difficulties [4], and other aspects. The third category is the research on new
formats, such as the platform economy. There are few studies constructing index
systems of the digital economy and measuring the level of digital economy
development in different regions. At present, digital economy index systems are
constructed along two dimensions.
The first dimension constructs the index system based on the path of digital influence. BEA
constructs an index system from four dimensions: the degree of digitization in the eco-
nomic field, the impact of digitization in economic activities, the new digital field, and
economic indicators [5]. OECD measures the level of digital economy development
from four parts: infrastructure, social use, related innovation, and the impact on the
economy and employment [6]. IDI constructs an index system from three aspects: ICT
access, ICT use, and ICT skills. CAICT constructs an index system from three aspects:
basic conditions of digital economy development, digital industrialization, and industrial
digitization.
The second dimension constructs the index system based on the external driving forces
of the digital economy and their influence. WEF believes that the environment, readiness,
and application are driving forces for digital economy development, and they affect
the digital development of the economy and society. Therefore, WEF constructs an
index system from four aspects: environment, readiness, application, and influence.
Chuanhui Liu and Zhipeng Yang (2021) [7] construct an index system from five aspects:
digital infrastructure, digital industrialization, scientific education, human resources, and
economic development.
The literature review shows that research on the construction and measurement of
digital economy index systems has made progress. Most studies take countries and
provinces as the research objects for measurement, and few take cities as the research
objects. From the perspective of connotation, this paper defines the connotation of the
digital economy as follows: digital infrastructure is the premise of development; the
digital industry, which provides technical support, is the leading industry; and
traditional industries realize digital development, deepening in both depth and breadth
within a good development environment. The contributions of this paper are as follows: First, based on the connotation
of definition, the index system is constructed from four dimensions: digital infrastruc-
ture, digital industrialization, industrial digitization, and development environment. The
available index data are used for relevant measurement. Second, the development level
of the digital economy is analyzed from the perspective of sub-provincial cities.

3 Indicator System and Research Method


3.1 Digital Economy Indicator System

Guided by the principles of being scientific, normative, effective, and concise, and
taking data availability into account, the index system is constructed according to the
connotation of the digital economy from four dimensions: digital infrastructure,
industrial digitization, digital industrialization, and development environment, as
shown in Table 1.
The research objects of this paper are 15 sub-provincial cities: Shenyang, Dalian,
Ningbo, Xiamen, Changchun, Harbin, Jinan, Qingdao, Nanjing, Hangzhou, Wuhan,
Chengdu, Xi'an, Guangzhou, and Shenzhen.

Table 1. Digital economy indicator system

Comprehensive index | First-level indicators | Second-level indicators and measurement unit | Weight | Reference literature
Development index of digital economy | Digital infrastructure | Number of end-of-year mobile phone users (ten thousand people) | 0.1032 | [7, 8]
Development index of digital economy | Digital infrastructure | Number of Internet broadband access users (ten thousand people) | 0.1016 | [7, 8]
Development index of digital economy | Digital industrialization | Telecommunications revenue (ten thousand yuan) | 0.1738 | [7, 8]
Development index of digital economy | Digital industrialization | Information transmission and computer services, software industry (ten thousand yuan) | 0.1566 | [7, 9]
Development index of digital economy | Industrial digitization | Value added of the primary industry (ten thousand yuan) | 0.0149 | [10]
Development index of digital economy | Industrial digitization | Value added of the secondary industry (ten thousand yuan) | 0.0141 | [10]
Development index of digital economy | Industrial digitization | Value added of the tertiary industry (ten thousand yuan) | 0.0347 | [10]
Development index of digital economy | Development environment | Expenditure on science and technology (ten thousand yuan) | 0.2488 | [8]
Development index of digital economy | Development environment | Number of students in colleges and universities (ten thousand people) | 0.0961 | [8]
Development index of digital economy | Development environment | Digital inclusive financial index | 0.0562 | [11, 12]
Note: The data are mainly from the China City Statistical Yearbook, and some missing data are supplemented by the average annual growth rate method

3.2 Research Method

In this paper, the entropy weight method is used to weight the indexes. This method
makes full use of the information in the original data and renders the evaluation
results more objective. Based on the basic data of the 15 sub-provincial cities from
2012–2019, we obtain the index weights in Table 1 and the digital economy
development indexes in Table 2 and Table 3.

Standardization
Due to the obvious dimensional differences between the indicators, this paper uses the extreme value method to standardize the original data to ensure accuracy. The formulas are as follows.
Positive indexes:
$$x'_{ij} = \frac{x_{ij} - \min\{x_j\}}{\max\{x_j\} - \min\{x_j\}} \quad (1)$$

Negative indexes:
$$x'_{ij} = \frac{\max\{x_j\} - x_{ij}}{\max\{x_j\} - \min\{x_j\}} \quad (2)$$

where $x'_{ij}$ is the standardized value of the j-th index of the i-th research object, i = 1, 2, …, m; j = 1, 2, …, n.

Calculate the Proportion of the j-th Index


$$p_{ij} = \frac{x'_{ij}}{\sum_{i=1}^{m} x'_{ij}} \quad (3)$$

Calculate the Entropy Value of the j-th Index

$$e_j = -k \sum_{i=1}^{m} p_{ij} \ln p_{ij}, \qquad k = \frac{1}{\ln m} \quad (4)$$

Calculate the Difference Coefficient of the j-th Index

$$g_j = 1 - e_j \quad (5)$$

Calculate the Weight of the j-th Index

$$w_j = \frac{g_j}{\sum_{j=1}^{n} g_j} \quad (6)$$

Calculation of Comprehensive Index

$$S_i = \sum_{j=1}^{n} w_j \, x'_{ij} \quad (7)$$
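To make the procedure concrete, the following is a minimal Python sketch of formulas (1)–(7). The original study used MATLAB, so the function name, the guards for degenerate columns, and the toy data below are our own illustrative assumptions.

```python
import numpy as np

def entropy_weight_index(X, negative=()):
    """Entropy weight method over an (m objects x n indicators) matrix X.

    A sketch of formulas (1)-(7); `negative` lists column indexes of
    negative indicators (Table 1 uses positive indicators only).
    """
    X = np.asarray(X, dtype=float)
    m, n = X.shape

    # (1)/(2): extreme-value standardization, column by column
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # guard constant columns
    Z = (X - X.min(axis=0)) / span
    for j in negative:
        Z[:, j] = (X[:, j].max() - X[:, j]) / span[j]

    # (3): proportion of object i under indicator j
    col = Z.sum(axis=0)
    col[col == 0] = 1.0                        # guard all-zero columns
    P = Z / col

    # (4): entropy per indicator with k = 1/ln(m); 0*ln(0) is taken as 0
    k = 1.0 / np.log(m)
    logP = np.where(P > 0, np.log(np.where(P > 0, P, 1.0)), 0.0)
    e = -k * (P * logP).sum(axis=0)

    g = 1.0 - e                                # (5) difference coefficient
    w = g / g.sum()                            # (6) indicator weights
    S = Z @ w                                  # (7) comprehensive index
    return w, S

# Toy usage: 15 cities x 10 indicators of synthetic data
gen = np.random.default_rng(0)
weights, scores = entropy_weight_index(gen.random((15, 10)))
print(weights.round(4), scores.round(4))
```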

4 Analysis of the Development Level of the Digital Economy


4.1 Analysis of the Development Level of the Digital Economy in Sub-provincial
Cities

Based on the relevant data of 15 sub-provincial cities from 2012 to 2019, this paper uses MATLAB to implement the entropy weight method. The comprehensive development index for the 15 sub-provincial cities is shown in Table 2. The analysis finds that the overall development level of the digital economy in each city shows a steady growth trend from 2012 to 2019. This shows that the digital economy has gradually penetrated all aspects of social life and enjoys outside support; the development trend is good. Guangzhou ranked first in the quality of digital economy development from 2012 to 2019, with its comprehensive development index increasing from 0.427 in 2012 to 0.7388 in 2019 (Table 2), which fully illustrates Guangzhou's strong digital economy competitiveness. From the average ranking, the average comprehensive development indexes from 2012 to 2019 are 0.4849, 0.4773, and 0.4718 for Shenzhen, Chengdu, and Wuhan, which rank second, third, and fourth, respectively; these three cities have obvious advantages over the other cities. Shenzhen and Chengdu have developed rapidly: their comprehensive development indexes increased from 0.2723 and 0.2868 in 2012 to 0.6738 and 0.6456 in 2019, respectively. Wuhan has a good foundation for digital economy development; its comprehensive development index was 0.3773 in 2012, ranking second among the 15 sub-provincial cities, and with steady growth reached 0.5899 in 2019.

The development of the digital economy in Changchun is weak; its average is 0.2465, ranking last, and Guangzhou's average is 2.3 times that of Changchun. Changchun's comprehensive development index was 0.1487 in 2012, ranking 14th, and 0.3207 in 2019, ranking 15th, showing that its digital economy development is weak and slow. Nanjing, Hangzhou,
Xi’an, Jinan, Qingdao, Ningbo, Harbin, Shenyang, Xiamen, and Dalian are ranked fifth
to fourteenth in the average ranking. Overall, the quality of digital economy development
in eastern coastal cities is higher, and digital economy development in northeastern cities
is lower. The development gap between cities is very prominent.

Table 2. 2012–2019 Digital economy development index of sub-provincial cities.

| City | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | Mean | Rank |
|---|---|---|---|---|---|---|---|---|---|---|
| Shenyang | 0.1944 | 0.2009 | 0.2306 | 0.2774 | 0.2306 | 0.3048 | 0.3448 | 0.3614 | 0.2681 | 12 |
| Dalian | 0.1944 | 0.1991 | 0.1959 | 0.2334 | 0.2624 | 0.2559 | 0.3312 | 0.3217 | 0.2486 | 14 |
| Changchun | 0.1487 | 0.1728 | 0.2076 | 0.2404 | 0.2745 | 0.2775 | 0.321 | 0.3207 | 0.2465 | 15 |
| Harbin | 0.1936 | 0.2414 | 0.2644 | 0.3167 | 0.3321 | 0.2678 | 0.2975 | 0.3641 | 0.2847 | 11 |
| Nanjing | 0.271 | 0.3176 | 0.355 | 0.4019 | 0.4588 | 0.4825 | 0.5121 | 0.5683 | 0.4209 | 5 |
| Hangzhou | 0.2864 | 0.3102 | 0.3473 | 0.382 | 0.4546 | 0.491 | 0.518 | 0.569 | 0.4198 | 6 |
| Ningbo | 0.1804 | 0.21 | 0.2489 | 0.2756 | 0.3291 | 0.3583 | 0.381 | 0.423 | 0.3009 | 10 |
| Xiamen | 0.1456 | 0.1641 | 0.203 | 0.2296 | 0.2775 | 0.3056 | 0.3414 | 0.3743 | 0.2551 | 13 |
| Jinan | 0.2225 | 0.27 | 0.2988 | 0.3308 | 0.3682 | 0.3612 | 0.3722 | 0.4652 | 0.3361 | 8 |
| Qingdao | 0.189 | 0.2205 | 0.3202 | 0.2826 | 0.3244 | 0.3602 | 0.3884 | 0.3998 | 0.3106 | 9 |
| Wuhan | 0.3773 | 0.3744 | 0.4114 | 0.4436 | 0.5048 | 0.5312 | 0.542 | 0.5899 | 0.4718 | 4 |
| Guangzhou | 0.427 | 0.4956 | 0.5103 | 0.5195 | 0.5658 | 0.6332 | 0.6891 | 0.7388 | 0.5724 | 1 |
| Shenzhen | 0.2723 | 0.3271 | 0.3673 | 0.4422 | 0.5675 | 0.5791 | 0.6084 | 0.6456 | 0.4773 | 2 |
| Chengdu | 0.2868 | 0.3214 | 0.3673 | 0.4422 | 0.5675 | 0.5791 | 0.6084 | 0.6456 | 0.4773 | 3 |
| Xi'an | 0.2689 | 0.3278 | 0.3428 | 0.3799 | 0.4242 | 0.4635 | 0.4554 | 0.5111 | 0.3967 | 7 |

4.2 Analysis of Each Dimension


The entropy weight method is implemented in MATLAB. The development index for each dimension of the 15 sub-provincial cities is shown in Table 3. In terms of digital infrastructure, Guangzhou, Shenzhen, Chengdu, Wuhan, and Hangzhou rank in the top 5 on average from 2012 to 2019, while Harbin, Shenyang, Changchun, Dalian, and Xiamen rank in the bottom 5. Most of the top five cities are located in the eastern coastal areas or developed central areas with a superior economic base. These cities pay more attention to the construction of digital infrastructure and laid it out earlier. For example, Shenzhen proposed consolidating new infrastructure construction and improving network access coverage and IPv6 user penetration in the "Implementation Plan of Shenzhen Digital Economy Industry Innovation and Development". Guangzhou proposed improving infrastructure, promoting the construction of "new infrastructure" projects, and establishing an efficient sharing mechanism for digital infrastructure in the "Measures to Accelerate the Creation of a Leading City for Digital Economy Innovation". However, Harbin and Xiamen lack clear goals for digital infrastructure construction. Although Changchun and Shenyang have proposed promoting digital infrastructure construction, they lack funding and technical investment. Therefore, their digital infrastructure index rankings are relatively low.
In terms of digital industrialization, all 15 cities show a fluctuating upward trend from 2012 to 2019. The digital industrialization indexes of Chengdu, Shenzhen, Guangzhou, and Xiamen increased rapidly, from 0.005, 0.017, 0.013, and 0.001 in 2012 to 0.032, 0.042, 0.033, and 0.017 in 2019 (Table 3). Shenzhen, Chengdu, and Guangzhou rank in the top 3 on average from 2012 to 2019. Guangzhou is in China's first tier in subdivision fields of the digital industry, such as the new-generation communication and satellite navigation industry, the artificial intelligence industry, the software and information technology service industry, and the big data industry. Shenzhen gathers a group of high-end software vendors, such as Jindie Tianyan and Tongxin software. Chengdu is building an innovative development pilot zone to promote the industrial Internet, software, big data, and other industries. Harbin, Shenyang, Changchun, Qingdao, and Ningbo rank in the bottom 5 on average from 2012 to 2019; their indexes grow slowly, from 0.003, 0.003, 0.002, 0.001, and 0.002 in 2012 to 0.006, 0.003, 0.003, 0.003, and 0.003 in 2019.
In terms of industrial digitization, all 15 cities show a fluctuating upward trend from 2012 to 2019, and the gap between cities is not very prominent. This shows that the role of the digital economy in cultivating new momentum and improving product and service quality is becoming more and more obvious, and cities are positioning themselves to seize the commanding heights of a new round of competition. Chengdu, Wuhan, Shenzhen, Nanjing, and Hangzhou rank in the top 5 on average from 2012 to 2019, with averages of 0.101, 0.095, 0.093, 0.09, and 0.087, respectively. Harbin, Changchun, Xiamen, Dalian, and Shenyang rank in the bottom 5, with averages of 0.081, 0.08, 0.078, 0.078, and 0.073, respectively.
From the development environment perspective, all 15 cities show a steady upward trend from 2012 to 2019. This shows that the development environment stimulates the potential of the digital economy and injects vitality into its development. Guangzhou, Wuhan, Nanjing, Hangzhou, and Shenzhen rank in the top 5 on average from 2012 to 2019; their indexes grow from 0.186, 0.171, 0.138, 0.132, and 0.079 in 2012 to 0.451, 0.4, 0.369, 0.349, and 0.402 in 2019. There is a significant gap between these cities and those ranked lower. These cities attach great importance to personnel training and technological innovation: Hangzhou, Shenzhen, Wuhan, Guangzhou, and Nanjing are all among the top 10 in the science and technology innovation rankings in the "Science and Technology Innovation Competitiveness Report of Chinese Cities in 2020". These cities are strong in technological innovation, with little difference among them. Shenyang, Ningbo, Xiamen, Dalian, and Changchun rank in the bottom 5 on average from 2012 to 2019; their indexes grow from 0.068, 0.058, 0.063, 0.061, and 0.046 in 2012 to 0.231, 0.257, 0.246, 0.215, and 0.218 in 2019.

Table 3. Development index and mean value of four dimensions of 2012–2019 digital economy.

| City | Digital infrastructure (2012 / 2019 / Mean) | Digital industrialization (2012 / 2019 / Mean) | Industrial digitization (2012 / 2019 / Mean) | Development environment (2012 / 2019 / Mean) |
|---|---|---|---|---|
| Shenyang | 0.024 / 0.044 / 0.033 | 0.003 / 0.003 / 0.003 | 0.099 / 0.084 / 0.073 | 0.068 / 0.231 / 0.159 |
| Dalian | 0.015 / 0.025 / 0.017 | 0.004 / 0.008 / 0.007 | 0.111 / 0.073 / 0.078 | 0.061 / 0.215 / 0.147 |
| Changchun | 0.009 / 0.033 / 0.021 | 0.002 / 0.003 / 0.003 | 0.092 / 0.076 / 0.079 | 0.046 / 0.218 / 0.143 |
| Harbin | 0.022 / 0.043 / 0.033 | 0.003 / 0.006 / 0.005 | 0.104 / 0.083 / 0.081 | 0.064 / 0.233 / 0.165 |
| Nanjing | 0.033 / 0.087 / 0.053 | 0.005 / 0.02 / 0.016 | 0.095 / 0.091 / 0.09 | 0.138 / 0.369 / 0.262 |
| Hangzhou | 0.052 / 0.101 / 0.077 | 0.011 / 0.023 / 0.017 | 0.09 / 0.097 / 0.087 | 0.132 / 0.349 / 0.239 |
| Ningbo | 0.035 / 0.072 / 0.054 | 0.002 / 0.003 / 0.003 | 0.085 / 0.091 / 0.086 | 0.058 / 0.257 / 0.158 |
| Xiamen | 0.005 / 0.023 / 0.012 | 0.001 / 0.017 / 0.006 | 0.077 / 0.088 / 0.078 | 0.063 / 0.246 / 0.158 |
| Jinan | 0.023 / 0.061 / 0.04 | 0.002 / 0.008 / 0.008 | 0.085 / 0.121 / 0.086 | 0.112 / 0.275 / 0.203 |
| Qingdao | 0.036 / 0.062 / 0.06 | 0.001 / 0.003 / 0.003 | 0.089 / 0.075 / 0.085 | 0.063 / 0.26 / 0.163 |
| Wuhan | 0.065 / 0.095 / 0.081 | 0.004 / 0.012 / 0.007 | 0.137 / 0.083 / 0.095 | 0.171 / 0.4 / 0.288 |
| Guangzhou | 0.143 / 0.162 / 0.151 | 0.013 / 0.033 / 0.022 | 0.086 / 0.093 / 0.087 | 0.186 / 0.451 / 0.313 |
| Shenzhen | 0.091 / 0.134 / 0.125 | 0.017 / 0.042 / 0.028 | 0.086 / 0.095 / 0.093 | 0.079 / 0.402 / 0.238 |
| Chengdu | 0.06 / 0.163 / 0.114 | 0.005 / 0.032 / 0.027 | 0.099 / 0.106 / 0.101 | 0.123 / 0.345 / 0.235 |
| Xi'an | 0.052 / 0.085 / 0.072 | 0.01 / 0.013 / 0.012 | 0.09 / 0.091 / 0.086 | 0.116 / 0.321 / 0.227 |

5 Conclusions and Recommendations


Based on the existing research results, this paper constructs the index system from four dimensions, digital industrialization, industrial digitization, digital infrastructure, and development environment, according to the connotation of the digital economy. This indicator system consists of four primary indicators and ten secondary indicators. The entropy weight method is used to weight the indexes. By analyzing the digital economy development level of 15 sub-provincial cities, we can draw the following conclusions:
Overall, the digital economy development level of the 15 sub-provincial cities is on a steady upward trend, with large differences between cities. At the average level, Guangzhou has the highest level of digital economy development; its average index for 2012–2019 exceeds 0.5, and the gap between Guangzhou and the other cities is obvious. The second tier comprises five cities, Shenzhen, Chengdu, Wuhan, Nanjing, and Hangzhou, whose average indexes for 2012–2019 lie between 0.4 and 0.5; the digital economy of these five cities is gaining momentum. Four cities, Xi'an, Jinan, Qingdao, and Ningbo, are in the middle, with average indexes between 0.3 and 0.4. Five cities, Harbin, Shenyang, Xiamen, Dalian, and Changchun, are relatively weak, with average indexes between 0.2 and 0.3. There is a significant difference between the head cities and the tail cities. Across dimensions, the indexes of industrial digitization and development environment are relatively high, while those of digital industrialization and digital infrastructure are relatively low.

Based on the above research conclusions, the following implications can be obtained. The digital economy, as a new economic form, is deeply integrated with the real economy. It provides new opportunities for urban development, such as industrial digitization, digital governance, and digital applications. However, there is still room for further improvement in the development of the urban digital economy. Given the development differences between cities, each city should develop its digital economy according to its factor endowments and economic conditions. Specifically, the governments of the Northeast region should issue relevant policies to promote infrastructure construction, including 5G base stations, cloud computing centers, and the industrial Internet. In addition, the government should also build digital infrastructure to advance the digital transformation of traditional industries and activate more potential users. More importantly, the government should pay attention to scientific and technological innovation, because science and technology are the core driving force for the development of the digital economy. The digital industry foundation of eastern coastal cities is good, and these cities should focus on key technological breakthroughs. In addition, cities in the Northeast need to strengthen cooperation with developed cities to drive the innovative development of the digital economy.

References
1. Wang, Y.: Research on the impact of digital economy on the competitiveness of export products: taking the Yangtze River Delta city group as an example. Price Theory Pract. 15183(2), 149–153 (2021)
2. Li, H.J., Zhang, J.L.: Some understandings on the definition of digital economy. Enterpr. Econ. 20279(7), 13–22 (2021)
3. Liu, W., Xu, X.C., Xiong, Z.Q.: The international progress of digital economy classification and China's exploration. Finan. Trade Econ. 8299(7), 1–17 (2021)
4. Yu, Y.W., Chen, G.X.: Related issues and policy recommendations for the development of
China’s digital economy. Southwest Finan. 12472(7), 39–49 (2021)
5. Barefoot, K., Curtis, D., Jolliff, W., Nicholson, J.R., Omohundro, R.: Defining and Mea-
suring the Digital Economy. US Department of Commerce Bureau of Economic Analysis.
Washington, DC, 15 (2018)
6. Colecchia, A., et al.: Measuring the digital economy: a new perspective. In: OECD (2014)
7. Liu, C.H., Yang, Z.P.: Analysis of the characteristics of the temporal and spatial differences in
the measurement of urban agglomeration digital economy: taking six urban agglomerations
as examples. Mod. Manag. Sci. 9924(4), 92–111 (2021)
8. Wang, J., Zhu, J., Luo, Q.: China’s digital economy development level and evolution
measurement. Quant. Econ. Tech. Econ. Res. 7626(7), 26–42 (2021)
9. Liu, J., Yang, Y.J., Zhang, S.F.: Research on the measurement and driving factors of China’s
digital economy. Shanghai Econ. Res. 6648(6), 81–96 (2020)
10. Zhong, W., Zheng, M.G.: The influence and mechanism of digital economy on regional
coordinated development. J. Shenzhen Univ. (Human. Social Sci. Ed.) 4664(4), 79–87 (2021)
11. Zhao, T., Zhang, Z., Liang, S.K.: Digital economy, entrepreneurial activity and high-quality
development: empirical evidence from Chinese cities. Manag. World 8482(10), 65–76 (2020)
12. Guo, F., Wang, J.Y., Wang, F., Kong, T., Zhang, X., Cheng, Z.Y.: Measuring the development
of China’s digital financial inclusion: index compilation and spatial characteristics. Econ. (Q.)
1458(4), 1401–1418 (2020)
Occupational Risks and Guarantee Path
of the Delivery Man in Digital Economy

Jin Daizhi and Wang Han(B)

Harbin University of Commerce, Harbin 150028, Heilongjiang, China

Abstract. With the digital economy thriving, there are gaps and problems in protecting the occupational rights and interests of the delivery man that urgently need to be solved. Under the Institutional Analysis and Development (IAD) framework, several deeper causes lead to risks concerning safety, health, stability, and insurance, including the digitized involution of platform companies, evaluation replacement in social networks, the dilemma of livelihood, and problems of social security coverage. In order to remove occupational risks and protect the occupational rights and interests of the delivery man effectively, the government and firms should upgrade working conditions and workflows, improve wages and provide training, design evaluation mechanisms and industry guidelines, and strengthen basic social insurance and the exploration of occupational injury insurance.

Keywords: Digital economy · IAD framework · The delivery man · Occupational risks

1 Introduction
Digital economy research is a current hot topic, and research on the digital economy and people's welfare will be a hot topic in China in the future [1]. Under the digital economy, platforms and sharing are closely related to employment and social security, and practitioners of new employment forms have new characteristics and needs. COVID-19 accelerated the development of online consumption and the digital economy, so the number of workers in new employment forms, such as food delivery and online ride-hailing, soared, and the protection of their rights and interests has received more and more attention. In recent years, scholars from many countries have
paid attention to the guarantee of practitioners of new employment forms. Kannan K.P.
and Papola T.S. (2007) proposed to improve the quality of employment in the informal
sector by improving income, working conditions, employment, and social protection [2].
But Rosenblat A. and Stark L. (2016), who researched Uber, denied classifying flexible
ride-hailing drivers as employees for social security, believing that if platforms classified
the drivers as employees, they would suffer great pressure. Their business model would
also be greatly impacted [3]. Harris S.D. and Krueger A.B. (2015) also studied and
argued that emerging working relationships in the Internet economy prevented indepen-
dent workers from obtaining the rights of traditional employees [4]. Chang K. (2019)
pointed out that the government should confirm the basic framework of the employment


relationship, strengthen and improve the rules of labor laws in order to realize the internal
unity of workers’ basic rights and interests and sustainable development of the Internet
economy [5]. In conclusion, scholars from various countries have studied the security of practitioners of new employment forms from the perspectives of social security and labor relations. However, the delivery man, a subdivision of these practitioners, is still under-discussed, and specific research using policy tools is also lacking. As one of the main new employment forms at present, food delivery work involves occupational risks and difficulties under an immature industry system, and the protection of rights and interests is insufficient. From the perspective of the digital economy and industry policy, the institutional situation and occupational risks of takeout workers and the path of rights and interests protection can be analyzed using policy tools.

2 The System Scenario of Takeout Workers in the Digital Economy

Ostrom E., the 2009 Nobel laureate in Economics, proposed the Institutional Analysis and Development (IAD) framework, which reversed the view that institutional arrangements must be set by a single party, either government or market, and proved that common-pool resources can be governed better through reasonable institutional arrangements involving both government and market; this framework has great value in economic governance and public resource governance [6]. As shown in Fig. 1, catering service in the digital economy, as a new form, is still in the exploration stage of governance policy. The IAD framework can be used to study the independent governance of catering industry market share resources in the digital economy, analyze the institutional environment of takeout workers, and find the access point for analyzing takeout workers' occupational risk guarantee path.
Finally, external factors, the action stage, interaction modes, evaluation criteria, and results influence each other, forming the catering industry Institutional Analysis and Development framework in which food delivery workers are located. In this framework, the market profitability of catering service public resources is the material condition; the user consumer groups and the common market space of the catering service industry are the community attributes; and the application principles are confinement, identity, choice, information, aggregation, pay, and range, as put forward by Ostrom E. In the action stage, catering service enterprises, delivery workers, merchants, customers, the government, and other actors continue to interact [7]. Facing relatively stable market returns, the catering service enterprises that hold the main control power among the actors, such as Meituan and Ele.me, inevitably use digital platforms and algorithms to compete for user groups and market space, and enterprises and takeout workers remain in continuous contention. As this emerging industry is in a stage of rapid development and industry standards have not been fully established, timely government intervention and governance that can be integrated with the market are particularly important.

Fig. 1. The catering industry system Institutional Analysis and Development framework where
food delivery workers are located

3 The Occupational Risks of Takeout Workers Under the Digital


Economy

Under the action scenario and interaction, delivery workers in the digital economy face
various risks, including safety, health, stability, insurance, etc.

3.1 Safety Risks

Firstly, due to the low safety factor of electric bicycles and motorcycles, delivery riders often face high risks in the delivery process. Secondly, under the push and requirements of platform enterprises, these workers must complete food delivery tasks intensively and as quickly as possible, which exposes them to considerable cycling safety risks. During lunch and dinner hours in big cities, dangerous scenes such as running red lights, speeding, and sudden turns without warning can be seen everywhere; therefore, the probability of traffic accidents among delivery men is high [8]. Thirdly, food delivery work is characterized by long working hours and high labor intensity, which makes it difficult to stay fully alert while traveling, resulting in large safety risks. Finally, owing to the urgency of the work, conflicts may arise with customers, security guards, and other groups, so delivery workers also face the risk of malicious injury.

3.2 Health Risks

The phenomenon of overwork is increasingly prominent. Delivery workers often work late into the night or take no days off, and regular daily meals cannot be guaranteed. Prolonged irregular work may cause sub-health problems such as sleep deprivation and a heavy physical burden. Long-term excessive labor will not only lead to physical fatigue but may even cause karoshi (death from overwork). In addition, frequent abuse and bad comments may affect the mental health of food delivery workers.

3.3 Stability Risks

Takeout workers have flexible work, unstable income, and unstable labor relations. In delivery work dominated by algorithms, takeout workers are often in a passive position. Their living and working places change frequently, and their income fluctuates greatly, which makes their situation unstable and subjects their family care to great uncertainty.

3.4 Insurance Risks

At present, the vast majority of delivery workers are not covered by industrial injury insurance. Some large platform enterprises, such as Meituan, provide accident insurance for takeout riders, but this guarantee is characterized by great arbitrariness, a narrow scope, and a low level of protection. Owing to differences in benefits and systems, large platform enterprises can provide commercial insurance for food delivery workers, whereas small enterprises pay little attention to the occupational risks of food delivery workers, let alone take protective measures [9]. These phenomena reflect that practitioners of new business forms, such as delivery workers, have not been covered by the current basic social insurance guarantee, and their worries have not been relieved.

4 Reasons for the Occupational Risks of Takeout Workers Under


the Digital Economy
Under the current digital economy, the digitized involution of platform companies, the replacement of evaluation power in social networks, the dilemma of workers' own income and expenditure, and the coverage problems of social security result in the escalation of takeout workers' occupational risks and the suspension of rights and interests protection within the catering industry Institutional Analysis and Development framework.

4.1 Digitized Involution of Platform Companies Leads to the Cycle


and Upgrading of Occupational Risk

In the behavioral situation, Meituan, Ele.me, and other food delivery platforms hold asymmetric control power and information and take a high proportion of revenue. In order to gain market share and greater benefit, the platforms compress delivery times, capture the best routes, and reduce the delivery cost to consumers. Under extreme competition, the system settings meant to promote the optimal allocation of market resources cause involution, and the risk falls on individual merchants and delivery men. The delivery pace reached after struggles over speed and time has gradually become a general requirement, the short rest gained after finding a new route has been automatically compressed by the platform's route optimization, and the income per order has been decreasing in recent years. Under such digital involution, delivery workers are drawn into platform competition and can only face the occupational risks that come with digital requirements.

4.2 The Replacement of the Evaluation Right in the Social Networks Leads
to the Increased Uncertainty of Occupational Income Risk

In the evaluation system for delivery workers' performance there are four parties: platforms, merchants, delivery workers, and customers. Yet the evaluation right is assigned to platforms and customers, constituting an interactive platform pattern that differs from the traditional labor relationship, in which only the enterprise evaluates workers. Under the platform's operation, the integration of customers' expectations with the platform's requirements has raised time requirements, and merchants and delivery workers bear the time restrictions together. In the end, because delivery workers are the ones in contact with customers, the risk of overtime penalties tends to be charged to them, and uncertainties such as customer mood and possible misunderstandings reflected on the platform directly affect the work evaluation and income of the delivery man. As customer feedback directly affects income and performance, the excessive evaluation rights increase the occupational risk of delivery workers.

4.3 The Dilemma of Living Income and Expenditure Leads to a Situation in Which the Delivery Man Cannot Avoid Risks by Himself

As one of the actors in the catering industry Institutional Analysis and Development framework, delivery workers cannot stabilize their income and expenditure and cannot resist professional risks. According to Amartya Sen's discussion of capability deprivation, delivery workers have capability problems such as impaired economic conditions and blocked channels for expressing their interests [10]. As individuals in the market economy, delivery workers must balance income and expenditure, maintain family life, and save. At the same time, they face the platform enterprises' ever-tightening requirements and strict punishment mechanisms. Under high pressure from both ends, the delivery man can only speed up and take more orders in this dilemma; otherwise, he must seek another job. Living income and expenditure require delivery workers to work faster and more. Even facing the obstacles of a complex transportation environment, urban housing and construction layout, and hot or cold weather, delivery workers have no other choice. In short, the risks attached to the profession cannot be avoided or selected.

4.4 Problems of Social Security Coverage Lead to a Situation in Which Worries About Occupational Risks Cannot Be Relieved

Practitioners of new business forms are not yet covered by the social security network built by the government, which is one of the actors. With the guidance of the CPC Central Committee and the active exploration of the provinces, the security of these workers is developing, but coverage problems remain because of labor relations restrictions, household registration restrictions, the threshold of payment rates, and immature occupational injury insurance pilots. Basic social insurance, such as endowment insurance, medical insurance, and industrial injury insurance, is mostly bound to labor relations, and it is difficult to define the labor relationship between practitioners (such as delivery workers) and platforms under new business forms, so finding the fit point under the institutional threshold is hard. Payment rates are too high for low-income delivery workers, which reduces their willingness to participate in insurance. Handling problems such as remote settlement and transfer also exist; meanwhile, the workplace of the delivery man changes all the time and may be inconsistent with his household registration, and the contradiction between these two conditions hinders the implementation of the social security rights of those delivery workers who have entered the social insurance network. Although the occupational injury insurance pilots have achieved some results, the number of insured people is small, and this insurance is linked with social insurance, so further exploration is still needed.

5 The Ideas of Protecting the Professional Rights and Interests


of Takeout Workers Under the Digital Economy
Under the action scenarios and application rules of the catering industry Institutional Analysis and Development framework, and combined with the occupational risks and their causes, the guarantee path for the delivery man can be analyzed.

5.1 Upgrade the Transportation and Workflow of Delivery Workers, Starting from the Assignment of 'Action' to 'Identity'

Improving transportation and workflow can resist the safety risks of takeout workers. First of all, electric bicycles and motorcycles cause accidents easily; if the delivery man drove technologically mature electric vehicles, the number of accidents would dwindle significantly. Catering service enterprises can cooperate with automobile enterprises to introduce small electric vehicles that are safe, energy-saving, convenient, and light to gradually replace the means of transportation currently used by delivery workers. Meanwhile, visual communication equipment for delivery can be installed on the vehicles. In addition, the current direct distribution relationship between food delivery workers and customers can be further separated and upgraded: communities can cooperate with property companies to establish fixed takeout points with constant-temperature, safe storage equipment and employ dedicated workers to carry out the final leg of food delivery, reducing the work links of delivery workers, improving efficiency, and resisting risks.

5.2 Improve the Welfare of Delivery Workers and Start Employment Training to Empower 'Actors'

A special minimum wage guarantee system for takeout workers should be formulated in light of the features of their employment and income to prevent takeout workers from being trapped by the algorithms of the platform economy. On the premise of ensuring the minimum wage, firms should also ensure normal daily meals, adopt shifts and zoning, provide subsidies for hot and cold days, and guarantee normal holidays. In addition, work training covering the whole delivery process and psychological counseling should be carried out to reduce conflicts and difficulties at work, improve job satisfaction, maintain mental health, and resist health risks efficiently.

5.3 From the Perspective of 'Control Power' Balance, Formulate the Evaluation Mechanism and Industry Guidelines

To solve the problem of stability risks, the government should intervene in the catering industry to reduce corporate control power in action scenarios. In the current evaluation approach, data statistics driven by platforms and customers distort work performance. The unreasonable evaluation mechanism caused by inconsistent overtime standards, unreasonable time compression, and the subjective arbitrariness of customers affects work outcomes and damages the delivery man's professional rights and interests. The government should cooperate with the top platform enterprises to jointly formulate a sustainable work evaluation mechanism that is in line with both the rights of the delivery man and the business model of the platform. Removing force majeure from overtime causes and ensuring that negative comments do not directly deduct income should be considered. Once established by the government and leading firms, this reasonable evaluation mechanism for delivery workers could be adjusted according to the applicable situation, and smaller platform enterprises could also adopt it. In addition, based on in-depth research, the government can regularly publish ranking lists of catering service firms by delivery man work satisfaction and customer satisfaction to prevent monopoly and vicious competition among head platform enterprises. These enterprises should set distribution schemes according to normal work intensity and time instead of squeezing the delivery man's distribution time and route and challenging his physical limits [11].

5.4 Within the Scope of Public Policy of the Government, Which Is One of the 'Actors', Strengthen the Expansion of Basic Social Insurance and the Exploration of Occupational Injury Insurance

Strengthening the coverage of basic social insurance and continually exploring occupational injury insurance systems can resist the insurance risks of delivery workers. Social security is the state's basic institutional arrangement in the field of risk management, aiming to ensure the basic survival, development, and rights and interests of all members of society [12]. At present, China has built the world's largest social security system. Social security will move from 'wide coverage' to 'full coverage' in the 14th Five-Year Plan period [13], and 'improving the universal social security system and promoting sustainable development' is the overall goal of China's 14th Five-Year period. Bringing the delivery man into social insurance is an inevitable requirement of 'full coverage' [14]. At present, the adverse selection problem is prominent among new-form employment groups; for example, many such workers have not paid basic endowment insurance and medical insurance. The government should consider the nature of these groups and their actual income levels to determine a reasonable social security rate and provide a convenient payment method, so that most workers can be included in the scope of the social security system. Furthermore, regarding the occupational injury insurance currently being tested, under the principle of piloting first before promotion, all provinces should strengthen top-level design, steadily increase the number of pilot cities, and accumulate experience.

6 Conclusion
The delivery man is an important medium between catering enterprises and consumers. Reducing their professional risks and improving the protection of their rights and interests are the keys to keeping the catering service industry vibrant and improving consumer satisfaction. Continuing to implement the guarantee mechanism and gradually reducing the occupational risks of delivery workers can effectively improve work satisfaction, bring more benefits to catering enterprises, and let the majority of consumers enjoy convenient and fast delivery services, thereby improving people's well-being and promoting the sustainable and healthy development of the digital economy.

References
1. Pu, D.X., Huo, H.F.: Hot spots, trends and prospects of digital economy research. Stat. Decis.-
Mak. 15, 9–13 (2021)
2. Kannan, K.P., Papola, T.S.: Workers in the informal sector: initiatives by India’s national
commission for enterprises in the unorganized sector (NCEUS). Int. Labour Rev. 146(3–4),
321–329 (2007)
3. Rosenblat, A., Stark, L.: Algorithmic labor: a case study of Uber's drivers. Social Sci. Electron. Publ. 10(27), 3758–3784 (2016)
4. Harris, S.D., Krueger, A.B.: A proposal for modernizing labor laws for 21st century work: the independent worker. Discuss. Paper (12), 10 (2015)
5. Chang, K., Zheng, X.J.: An employment relationship or a partnership? an analysis on the
nature of labor relations in internet economy. J. Renmin Univ. China 33(02), 78–88 (2019)
6. Wang, Q.: Review of the ostrom institutional analysis and development framework. Econ.
Dyn. 04, 137–142 (2010)
7. Ostrom, E., Gardner, R., Walker, G.: Rules, Games, and Common Pool Resources. University
of Michigan Press, Ann Arbor (1994)
8. Hu, F.Z., Bai, Z.Y., Yao, Y.R.: The dilemma and countermeasures of excessive labor and
occupational injury guarantee: sell riders for example. China Econ. Trade J. (China) 02,
142–144 (2021)
9. Zhou, Y.B.: Thoughts on the occupational injury protection problem of new business form
practitioners. China Human Res. Social Secur. 10, 9–11 (2020)
10. Deng, D.S., Wu, F., Xiong, Y.: Study on the occupational injury guarantee of the new
generation of migrant workers in China. Jiangxi Social Sci. 34(11), 193–197 (2014)
11. Guan, B., Wang, Z.: Protection of The rights and interests of new employment youth: dilemma,
adjustment and problem. Chinese Youth Stud. 04, 22–28 (2021)
12. He, W.J.: Promote a fairer, more sustainable, and more efficient social security. People’s
Forum 31, 37–39 (2019)
13. He, W.J.: From ‘wide cover’ to ‘full cover.’ China Social Secur. 05, 42–43 (2013)
14. Han, Y.: Adhere to the attribute of social insurance, and expand the pilot project of occupational
injury security. Human Res. Social Secur. China (04), 58 (2021)
Research on E-commerce Recommended
Algorithm Based on Knowledge Graph

Yu Zhang1,3,4(B) , Jingming Ye1 , Xin Yue2 , Sifan Wei1 , Yu Wang1 , and Wanli Ren1
1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang University of Science and Technology, Harbin 150022, China
3 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,
Harbin 150028, China
4 Heilongjiang Cultural Big Data Theory Application Research Center, Harbin 150028, China

Abstract. Traditional e-commerce recommendation algorithms find it difficult to extract effective information from massive data to meet users' needs. This paper proposes a new algorithm that integrates user reviews with a commodity knowledge graph and constructs the corresponding entities. It extracts feature words and semantic relations to form a set, obtains product feature weights, and calculates the recommended value of commodities using a random walk algorithm. The experimental results show that, compared with traditional recommendation algorithms, the new algorithm achieves a certain improvement in precision, recall rate, and F-value.

Keywords: E-commerce · Knowledge graph · Recommendation algorithm

1 Introduction
E-commerce [1] is a model for completing transactions through web platforms. The amount of commodity information has increased with its wide application, leading to information overload. Commodity recommendation technology was born of this and continues to develop. It aims to extract and analyze information about the goods users have searched for and purchased in the past, and then recommend goods of interest to them through algorithms. Because traditional algorithms find it difficult to meet users' needs, this paper proposes an improved recommendation algorithm [2] based on the knowledge graph [3], improving the algorithm's accuracy and enhancing the user experience.

2 Research and Analysis of Recommendation Algorithm in E-commerce
The recommendation algorithm has always been a hot topic in e-commerce research. Taking "e-commerce recommendation algorithm" as the keyword, the author retrieved a total of 1036 related papers in CNKI for the period from July 3, 2016 to July 3, 2021. According to the literature analysis, research on e-commerce recommendation


algorithms was on the rise from 2016 to 2018 and peaked in 2017. However, during the outbreak of COVID-19, namely from the end of 2019 to the end of 2020, the number of published papers decreased significantly, indicating that scholars in this field paid less attention because of the epidemic. Since 2021, the number of published papers has risen again, indicating that scholars have re-engaged in research in this field. Therefore, it can be preliminarily estimated that the epidemic had a negative impact on research on e-commerce recommendation algorithms. The number of papers on "recommendation algorithm" in the e-commerce field is 254, accounting for 19.39% of the total, indicating a hot research topic in recent years.
Through network knowledge graph analysis of the keywords of e-commerce recommendation algorithm research, keywords such as "recommendation system", "collaborative filtering", "personalized recommendation", "recommendation algorithm", "e-commerce", "matrix decomposition", and "similarity" occur most frequently and represent the hot research topics. The keywords with high centrality [4] are "recommendation algorithm", "recommendation system", "personalized recommendation", and "collaborative filtering", with centralities of 0.45, 0.34, 0.32, and 0.27, respectively. In general, a keyword with centrality greater than 0.2 serves as a supporting point in the whole research field and is a key research object.
Based on keyword clustering analysis of the e-commerce data, the keywords are clustered into 9 categories: personalized recommendation, recommendation algorithm, matrix decomposition, collaborative filtering, Spark, deep learning, trust, clustering, and similarity. The Q value of the cluster module is 0.3746, and the average contour S value of the clusters is 0.6835. It is generally believed that the cluster structure is significant when the Q value is greater than 0.3, and that the clustering results are convincing when the S value is greater than 0.6 [4].
According to the analysis of the 1036 selected papers, after the outbreak of the epidemic, Chinese scholars' enthusiasm for studying e-commerce recommendation algorithms decreased significantly, and the number of published papers continued to decline, even reaching its lowest point in recent years. However, the recommendation algorithm is still a hot research topic in the e-commerce field. In order to mitigate the impact of the epidemic on the e-commerce field, the author improved the existing recommendation algorithm by combining the knowledge graph, helping to solve the problem of information overload and supporting better development of the e-commerce field.
Traditional recommendation algorithms mainly include the collaborative filtering algorithm [5], the content-based recommendation algorithm [6], and the hybrid recommendation algorithm [7]. With the development of the Internet era and the constant day-by-day increase of data and information, users' demands keep changing, so it becomes harder to find the required information in massive data, and a complete and efficient recommendation system can play a great role. However, many problems remain: the content-based recommendation algorithm suffers from difficult feature extraction, inability to recommend new items, and the cold start problem; the collaborative filtering algorithm suffers from data sparsity and the cold start problem; and the hybrid recommendation algorithm suffers from increased complexity and operating cost.

The disadvantages of these traditional recommendation algorithms give users a poor experience. The above recommendation systems usually focus only on the connection between users and commodities but ignore the similarity between commodities. Therefore, this paper proposes a new recommendation algorithm that integrates the commodity entities in the knowledge graph with user comments to study the similarity between user preferences and commodities. By extracting commodity feature words and the sentiment words users attach to commodities, and by introducing weights between emotional factors and commodity similarity, it improves the accuracy of recommendation and the granularity of commodity information, providing users with more accurate commodity recommendations and improving their satisfaction.

3 Research on New Recommendation Algorithm

3.1 Integrate Commodity Entities and User Comments in the Knowledge Graph

The feature words of a commodity include performance, cost performance, appearance, user experience, and other features directly related to the commodity, and they describe the commodity's attributes. The sentiment words carry the polarity and strength of users' emotions about the commodity. By extracting a large number of feature words and sentiment words, the general opinion of users about the commodity can be obtained.

Extract Commodity Features and Sentiment Words. The traditional semantic relation model LDA [8] has the disadvantage that it is difficult to extract commodity feature words with low word frequency and to meet fine-grained word allocation requirements. Therefore, the SRC-LDA [9] model is used to extract the feature words and sentiment words.

Firstly, the author fuses the user comments with the commodity feature set and sentiment word set in the knowledge graph and pre-processes the data. The commodity entity set is W = {w1, w2, …, wn}, the total feature set is F = {f1, f2, …, fn}, and the total sentiment word set is L = {l1, l2, …, ln}, where wi is the i-th commodity entity, fi is the feature word set of the i-th commodity, and li is the sentiment word set of the i-th commodity.

Building Semantic Relationships. In this paper, we constrain the feature words and sentiment words of commodities to extract more accurate feature words and sentiment words that meet users' demands. We introduce the must-link semantic constraint [10] so that words satisfying the must-link relation are allocated to the same set, finding more fine-grained feature words and sentiment words by exploring the correlation between different words through the semantic relation constraint.

Obtain the Must-Link Between Feature Words. Users' comments contain words with the same semantic meaning, such as "cheap" and "cost-effective". These words have a strong must-link relationship and can replace each other; that is, words of this type should be allocated to the same set as far as possible. The semantic relationship between feature words can be obtained by formula (1), where S(fi, fj) equal to 1 represents a must-link relationship, i.e., high synonymy, between feature words fi and fj. The same relationship between feature words can also be obtained by referring to the Corpus Annotation Expanded Edition.

$$S(f_i, f_j) = \begin{cases} 1, & \text{if } f_i \text{ and } f_j \text{ have a must-link relation} \\ 0, & \text{otherwise} \end{cases} \quad (1)$$

Must-Link Between Feature Words and Sentiment Words. The judgment rule is as follows: if a single sentence satisfies the basic subject-predicate relationship, it can be roughly concluded that the sentence's subject is the feature word and the corresponding adjective is the sentiment word. Based on the semantic relationship strength formula OES-PMI [9], the semantic relationship between feature words and sentiment words can be obtained by formula (2), where ξ1 is the frequency threshold of feature words, ξ2 is the frequency threshold of sentiment words, q(fi) is the word frequency of a candidate feature word, q(li) is the word frequency of a candidate sentiment word, and qc(fi, li) is the co-occurrence frequency of the candidate feature word and sentiment word.
 
 lg qc (fI , li ) 
OES − PMI (fI , li ) =  , q(f ) < ξ1 , q(l ) < ξ2 (2)
lg q(fi ) lg(q(li ) − qc (fI , li )  i i

An appropriate threshold is then selected, and the candidate feature word-sentiment word pairs whose relation value exceeds this threshold are kept to form the feature word-sentiment word set, denoted q = {q1, q2, …}, where qi indicates a strong semantic relationship between a sentiment word and a feature word. In summary, a feature word set based on sentiment words can be obtained, denoted F' = {(f1, q1), (f2, q2), …}.
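As a rough illustration of formula (2), the following Python sketch computes the OES-PMI strength for one candidate pair; the data layout (plain dictionaries) and the guard conditions for degenerate logarithms are our own assumptions, not part of the original SRC-LDA pipeline.

```python
import math

def oes_pmi(f, l, freq, co_freq, xi1, xi2):
    """OES-PMI strength between candidate feature word f and sentiment
    word l, per formula (2). `freq` maps words to frequencies and
    `co_freq` maps (feature, sentiment) pairs to co-occurrence counts;
    xi1 and xi2 are the frequency thresholds of formula (2).
    """
    if freq[f] >= xi1 or freq[l] >= xi2:
        return 0.0  # frequency constraint of formula (2) not met
    qc = co_freq.get((f, l), 0)
    # Guard cases where a log10 term would be zero or undefined
    if qc == 0 or freq[f] <= 1 or freq[l] - qc <= 1:
        return 0.0
    return abs(math.log10(qc)
               / (math.log10(freq[f]) * math.log10(freq[l] - qc)))

# Toy usage: keep pairs whose strength exceeds a chosen threshold
freq = {"price": 40, "cheap": 35}
co_freq = {("price", "cheap"): 20}
print(round(oes_pmi("price", "cheap", freq, co_freq, xi1=100, xi2=100), 3))
```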

3.2 Feature Weight Allocation of Commodity Recommendation


Traditional recommendation algorithms have difficulty dealing with data sparsity and the low diversity of recommendation results, so the random walk method [11] from complex networks has been proposed. The recommendation method PRWDR [12] can effectively solve such problems: it finds commodities with high similarity by propagating similarity, overcoming data sparsity. Therefore, the commodity similarity derived from the knowledge graph and user comments is used as the edge weight of the random walk to determine the weights of commodity features.

Pearson Coefficient. Let r(u, i) and r(v, i) be the evaluation values of feature i of commodities u and v, respectively, and let r̄(u) and r̄(v) be the average evaluation values of commodities u and v. Then P(u, v), the similarity of commodities u and v, is calculated according to the Pearson correlation coefficient, as shown in formula (3). The larger P(u, v) is, the greater the similarity between commodities u and v.
$$P(u, v) = \frac{\sum_{i=1}^{n} \left(r(u,i) - \bar{r}(u)\right)\left(r(v,i) - \bar{r}(v)\right)}{\sqrt{\sum_{i=1}^{n} \left(r(u,i) - \bar{r}(u)\right)^2}\ \sqrt{\sum_{i=1}^{n} \left(r(v,i) - \bar{r}(v)\right)^2}} \quad (3)$$

Build Commodity Adjacency Matrix. The commodity similarity matrix S is built from the calculated Pearson similarities, where m is the total number of commodities. When P(ui, vj) > 0, sij = P(ui, vj); otherwise sij = 0 (i.e., commodities whose similarity is not positive are not considered); and sii = 0 (1 ≤ i ≤ m), meaning that the similarity of a commodity with itself is not considered.
$$S = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1m} \\ s_{21} & s_{22} & \cdots & s_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ s_{m1} & s_{m2} & \cdots & s_{mm} \end{bmatrix}$$

After row-normalizing the commodity similarity matrix S, we obtain the transition probability matrix T = (tij)m×m, where tij represents the probability of moving from commodity node vj to commodity node ui, as shown in formula (4).

$$t_{ij} = \frac{s_{ij}}{\sum_{k=1}^{m} s_{ik}} \quad (4)$$

The similarity of the goods is taken as the edge weight of the random walk model [13]. Let r be a column vector over the goods, where each element represents the probability of the corresponding good being visited. Formula (5) gives the update rule of the random walk strategy.

$$r^{n} = c \cdot T \cdot r^{n-1} + (1 - c) \cdot r^{0} \quad (5)$$

where c is the probability that the random walk moves from the current commodity to a neighboring commodity, 1 − c is the probability of returning to the initial node at the next step, r^n is the probability distribution of reaching each commodity at step n, and r^0 is the initial probability distribution.
The essence of the commodity random walk probability is the commodity weight, which directly relates to the feature weight, so the random walk probability can represent the feature weight. The vector of commodity i is Qi = {I1, I2, …, Ii, …}^T, where Ii = 1 if feature i is in fi and Ii = 0 otherwise. Assuming there are n goods, the total goods matrix is I = {Q1, Q2, …, Qn}^T, and the feature weight can be expressed as B = (bi)n = I^{-1} × r^n. The commodity feature set with the random walk weight added is then F_W = {(f1, q1, b1), (f2, q2, b2), …}, where bi is the feature weight of commodity i.
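The following Python sketch mirrors formulas (3)–(5) and the feature-weight recovery described above. The uniform start vector, the parameter values c = 0.85 and 20 iterations, and the least-squares substitute for the matrix inverse are our own assumptions; the 0/1 feature matrix is generally not square, so a literal inverse does not exist.

```python
import numpy as np

def feature_weights(R, F, c=0.85, n_steps=20):
    """Random walk feature weighting, sketching formulas (3)-(5).

    R: (m commodities x k features) evaluation matrix used for Pearson
    similarity; F: (m x k) 0/1 matrix marking each commodity's features.
    """
    m = R.shape[0]
    # (3) Pearson similarity; negatives and self-similarity set to 0
    S = np.corrcoef(R)
    S = np.where(S > 0, S, 0.0)
    np.fill_diagonal(S, 0.0)

    # (4) row-normalize S into the transition matrix T
    row = S.sum(axis=1, keepdims=True)
    row[row == 0] = 1.0                 # guard isolated commodities
    T = S / row

    # (5) random walk with restart from a uniform start vector r0
    r0 = np.full(m, 1.0 / m)
    r = r0.copy()
    for _ in range(n_steps):
        r = c * T @ r + (1 - c) * r0

    # Recover feature weights from commodity weights; the paper writes
    # B = I^{-1} r, approximated here by a least-squares solve
    b, *_ = np.linalg.lstsq(F, r, rcond=None)
    return r, b

# Toy usage: 6 commodities, 4 features
gen = np.random.default_rng(1)
R = gen.random((6, 4))
F = (gen.random((6, 4)) > 0.5).astype(float)
walk_probs, b = feature_weights(R, F)
print(walk_probs.round(3), b.round(3))
```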

3.3 Experimental Algorithm Design and Evaluation Index

For commodity m, after the sentiment word-feature word set and the random walk weight set of the commodity are obtained, the recommendation value of the commodity for relevant users can be calculated by formula (6).

$$V = \sum_{(f, q, b) \in F_W} q \times b \quad (6)$$

After the recommendation value of each commodity for the user is obtained, the commodities are sorted by recommendation value from large to small, and the top N commodities are provided to the user as required.
The new recommendation algorithm proposed in this paper can be described as follows (a sketch of the final scoring and ranking steps is given after the list):

Input: commodity entities with the corresponding user comments and knowledge graph.
Output: commodities to be recommended.

1) Take a certain number of commodity sets, denoted A.
2) Integrate the user comments in the commodity set with the corresponding commodity entities in the knowledge graph.
3) Extract the feature words and the must-links between feature words and sentiment words from the integrated data to form a feature set with sentiment words.
4) Build the commodity similarity matrix S.
5) Calculate the probability of access to each commodity, yielding the commodity feature weights.
6) Update the commodity feature set to F_W = {(f1, q1, b1), (f2, q2, b2), …}.
7) Obtain the recommendation value V of the goods according to the above calculation.
8) Return a list of recommended items to the user.
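A minimal sketch of the final scoring and ranking steps follows, assuming a dictionary layout for the commodity feature set F_W; the names and data layout are illustrative, not from the paper.

```python
def recommend_top_n(feature_sets, n=10):
    """Score each commodity by formula (6), V = sum of q*b over its
    (feature word, sentiment strength q, walk weight b) triples, and
    return the top-N commodity ids sorted by score.
    """
    scores = {item: sum(q * b for _f, q, b in triples)
              for item, triples in feature_sets.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Toy usage
fw = {"shoe_A": [("comfort", 0.8, 0.3), ("price", 0.6, 0.2)],
      "shoe_B": [("comfort", 0.5, 0.3)]}
print(recommend_top_n(fw, n=2))  # ['shoe_A', 'shoe_B']
```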

4 Experimental Results and Analysis

The experimental data set was collected from user data of the Taobao mall (www.taobao.com). By searching for the keyword "shoes", the author collected 10,000 commodities and 100,000 comments. The author carried out knowledge extraction of the Internet data in the knowledge graph under a big data environment, and then performed data pre-processing.

Evaluation Index. This paper uses precision P, recall rate R, and F-value [14] to compare the recommendation algorithms; the indexes are calculated by formula (7), where R(u) denotes the N goods recommended to user u, T(u) is the set of commodities the user is interested in within the test set, and the F-value is the index to consider comprehensively when P and R conflict. P, R, and F are important indexes for evaluating recommendation systems. In this experiment, P is the proportion of recommended goods that are relevant, R is the proportion of relevant goods that are recommended, and F is the harmonic mean of the two. The values of P and R lie between 0 and 1, and the closer a value is to 1, the higher the accuracy.

$$P = \frac{|R(u) \cap T(u)|}{|R(u)|}, \quad R = \frac{|R(u) \cap T(u)|}{|T(u)|}, \quad F = \frac{2PR}{P + R} \quad (7)$$
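For reference, a small Python sketch of formula (7) for a single user; the variable names are our own, and averaging over users is omitted.

```python
def precision_recall_f(recommended, relevant):
    """P, R and F per formula (7): `recommended` plays the role of R(u)
    (the top-N list) and `relevant` that of T(u) (test-set items the
    user is interested in).
    """
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    p = hits / len(rec) if rec else 0.0
    r = hits / len(rel) if rel else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Toy usage: 2 of 3 recommendations hit a 4-item ground-truth set
print(precision_recall_f(["a", "b", "c"], ["a", "b", "d", "e"]))
# -> approximately (0.667, 0.5, 0.571)
```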

Verification of Algorithm Validity. After determining the parameter values and the numbers of commodities, comments, and entities in the knowledge graph, we compared the precision, recall rate, and F-value of the recommendation algorithm integrating the knowledge graph with those of traditional recommendation algorithms (the collaborative filtering algorithm, CF [15]; the content-based recommendation algorithm, CB [16]; and the hybrid recommendation algorithm, MX [17]) under varying numbers of commodities and comments, to verify the effectiveness of the proposed algorithm.

Explore the Impact of the Number of Commodities on the Algorithm Performance (10 Comments Were Randomly Selected for Each Commodity). As shown in Fig. 1, Fig. 2 and Fig. 3, as the quantity of commodities increases, the R of the proposed algorithm keeps increasing and its F value tends to be stable. The algorithm's P, R, and F values are all higher than those of the other three algorithms.

Fig. 1. Precision of the algorithm under different commodity quantities

Fig. 2. Recall of the algorithm under different commodity quantities



Fig. 3. F-value of the algorithm under different commodity quantities

Explore the Impact of the Number of Comments on the Algorithm Performance (Set the Number of Commodities to 5000). As can be seen from Fig. 4, Fig. 5 and Fig. 6, as the number of comments increases, the P, R and F values of the proposed algorithm rise steadily. Its precision is clearly better than that of the other three algorithms once the number of comments exceeds 40,000, and its R and F values are both higher than those of the other three algorithms.

Fig. 4. Precision of the algorithm under different numbers of comments

Fig. 5. Recall of the algorithm under different numbers of comments



Fig. 6. F-value of the algorithm under different numbers of comments

5 Conclusion
A recommendation algorithm integrating knowledge graphs and user comments is proposed in this paper. The similarity matrix is built by combining the knowledge graph with a random walk model, and commodity features are taken as the research object to make the recommendation more accurate. The experimental results show that, compared with traditional recommendation algorithms, the commodity recommendation algorithm proposed in this paper achieves a better recommendation effect and alleviates data scarcity and poor recommendation diversity to a certain extent.

References
1. Hu, Y.H., et al.: Recognizing the same commodity entities in big data. J. Comput. Res. Dev.
52(08), 1794–1805 (2015)
2. Gu, Q.Y., Wu, B., Hu, Q.Q., Sun, Z.Y.: Social network user interest points recommendation algorithm based on multidimensional feature fusion. J. Commun. (published online first, 2021). http://kns.cnki.net/kcms/detail/11.2102.tn.20210401.0909.002.html
3. Wang, J.F., Wen, X.L., Yang, X., Zhang, Q.L.: A bias-based graph attention neural network recommender algorithm. Control and Decision (published online first, 2021). https://doi.org/10.13195/j.kzyjc.2020.1626. Accessed 31 July 2021
4. Niu, J.F., Li, C.L., Chen, Z.G.: Visualization analysis of knowledge mapping in research on
product supply chain. Supply Chain Manag. 2(07), 54–61 (2021)
5. Wang, H.M., Nie, G.H.: Research on collaborative filtering algorithm based on fusing user
and item’s correlative information. J. Wuhan Univ. Technol. (07), 160–163 (2007)
6. Huang, Z.H., Zhang, J.W., Zhang, B., Yu, J., Xiang, Y., Huang, D.S.: Survey of semantics-
based recommendation algorithms. Acta Electron. Sin. 44(09), 2262–2275 (2016)
7. Zhang, Y.H., Zhu, X.F., Xu, C.Y., Dong, S.D.: Hybrid recommendation approach based on
deep sentiment analysis of user reviews and multi-view collaborative fusion. Chin. J. Comput.
42(06), 1316–1333 (2019)
8. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3(3),
993–1022 (2003)
9. Peng, Y., Wan, C.X., Jiang, T.J., Liu, D.X., Liu, X.P., Liao, G.Q.: Extracting product aspects
and user opinions based on semantic constrained LDA model. J. Softw. 28(03), 676–693
(2017)

10. Wagstaff, K., Cardie, C., Rogers, S., Schroedl, S.: Constrained K-means clustering with
background knowledge. In: 18th International Conference on Machine Learning, pp. 577–584
(2001)
11. Bagci, H., Karagoz, P.: Context-aware location recommendation by using a random walk-
based approach. Knowl. Inf. Syst. 47(2), 241–260 (2016)
12. Liu, W.Y.: User network service recommendation method and experimental results analysis
based on PRWDR algorithm. Tech. Autom. Appl. 39(09), 48–51 (2020)
13. Yu, C.H., Liu, X.J., Li, B.: Mobile service recommendation based on context similarity and
social network. Acta Electron. Sin. 45(6), 1530–1536 (2017)
14. Zhu, Y.X., Lv, L.Y.: Evaluation metrics for recommender systems. J. Univ. Electron. Sci.
Technol. China. 41(2), 163–175 (2012)
15. Jiang, M.Y., Zhang, Z.F., Jiang, J.G.: A collaborative filtering recommendation algorithm
based on information theory and bi-clustering. Neural Comput. Appl. 31(12), 8279–8287
(2019)
16. Li, H., Cai, F., Liao, Z.F.: Content-based filtering recommendation algorithm using HMM. In:
Proceedings of 2012 International Conference on Computational and Information Sciences,
pp. 256–263. IEEE Press, Washington D.C. (2012)
17. Zhao, J.K., Liu, Z., Chen, H.M.: Hybrid recommendation algorithms based on convmf deep
learning model. In: Proceedings of 2019 International Conference on Wireless Communi-
cation, Network and Multimedia Engineering, pp. 15–26. IEEE Press, Washington D. C.
(2019)
Research on Regional Heterogeneity
in the Impact of Digital Inclusive Finance
on the Diversification of Household Financial
Asset Allocation

Shu-bo Jiang(B) and Xiao-Han Zhao

Harbin University of Commerce, Harbin 150028, China

Abstract. Diversification is one of the basic features of classical portfolio selec-


tion theory. Reasonable asset allocation is an important embodiment of rational
family investment and the basis for family wealth accumulation. The emergence
of inclusive digital finance has energized the financial market and lowered the bar-
riers to financial access. Based on data from the China Household Finance Survey
(CHFS) and Peking University Digital Finance Research Center, this paper ana-
lyzes regional heterogeneity in the impact of digital financial inclusion on house-
hold financial asset allocation diversity. The results show that inclusive digital
finance has significant heterogeneity among regions, and it is strongest in the east-
ern coastal region and weakest in the central region. In addition, the development
of digital financial inclusion has a different impact on the diversity of financial
asset allocation for urban and rural households. Digital inclusive finance has a
greater effect on urban households.

Keywords: Digital inclusive finance · Financial asset allocation · Regional


heterogeneity · Diversification

1 Introduction

Inclusive finance is a concept officially introduced by the United Nations in 2005. From
the coverage perspective, it mainly focuses on small and micro enterprises, farmers, and disadvantaged low-income groups. From the perspective of service characteristics, the cost of inclusive digital finance is lower than that of traditional finance. It uses artificial intelligence and other technologies for data-based risk control, improves the efficiency of capital utilization through "accurate portraits" of users, and strengthens financial risk prevention, thereby reducing transaction costs and providing financial services to more people. Through digital means, digital inclusive finance benefits people in backward areas, rural areas, and other areas that cannot access traditional financial services.
Internet finance enriches investment channels, increases investment diversification, and
brings higher and more stable returns. Family asset investments are no longer limited
to traditional bank deposits but also have various options such as securities, funds, and
financial products. Due to different levels of economic development and financial market


innovation, residents in different regions are exposed to different types of financial prod-
ucts and businesses. Financial exclusion prevents disadvantaged residents from accessing financial services, which seriously affects the rationalization of household asset allocation.
With the implementation of the rural revitalization strategy and the achievement of
poverty eradication by 2020, the wealth accumulation of residents has been increasing.
To achieve asset preservation and appreciation for families with “surplus wealth,” it is
necessary to make reasonable investments and financial management. This paper first
investigates the impact of digital financial inclusion development on household finan-
cial asset allocation diversification through empirical tests. In addition, this paper will
investigate the heterogeneity of this impact across regions. It illustrates the importance
of promoting inclusive digital finance on the financial investment environment at the
macro level. At the micro-level, it reveals that improving residents’ household financial
literacy can help optimize household financial asset allocation decisions.

2 Literature Review

The concept of digital financial inclusion was first introduced in China in 2016. Relevant
studies mainly focus on the impact of Internet finance on rural financial exclusion. As
the research progresses, some scholars turn to the impact of inclusive digital finance
on economic development and residents’ asset allocation. Many scholars believe that
the development and popularization of inclusive digital finance positively affects the
economy, employment, and social welfare. Tang, Yu et al. (2020) found that the broad
coverage level of inclusive digital finance could enhance investment opportunities and
strengthen financial services’ equity, thus realizing the economic source of long-lasting
growth [1]. Wang, Yannan et al. (2020) found that inclusive digital finance could enhance
social security by affecting income and employment. On this basis, some scholars study
the impact of inclusive digital finance on different industries and industrial structures [2].
Ding Rijia et al. (2020) found that inclusive digital finance has a significant contribution
to the development of the service industry, and the impact is manifested in different
degrees among different regions [3]. Tian Juanjuan and Ma Xiaolin (2020) found that
inclusive digital finance could lead to a more rational and efficient agricultural industrial structure and enhance the economic benefits of agriculture [4]. Regarding the impact
of inclusive digital finance on residents’ asset allocation, scholars have found that the
development of digital finance could not only inject vitality into China's financial market
[5, 6] but also effectively promote the optimization of household asset allocation (Deng
Shanshan, 2019) [7].
Through a brief literature review, we found that after the concept of inclusive digital
finance was introduced in China, most studies focused on the impact on economic
development, income growth, and residents’ consumption [8]. Most of them have used
provincial data to conduct empirical tests. However, few articles studied the impact of
inclusive digital finance on household financial asset allocation at the household level.

3 Model Setting and Variable Introduction


3.1 Data Introduction

This paper uses data from the China Household Finance Survey (CHFS) of Southwestern University of Finance and Economics and the Digital Inclusive Finance Index compiled by the Digital Finance Research Center of Peking University. The China Household Finance Survey and Research Center has conducted the survey every two years since 2009, and its data are nationally representative. The Digital Finance Research Center of Peking University, using microdata on inclusive digital finance from Ant Financial, compiled the Digital Inclusive Finance Index for 2011–2018, covering three levels: province, city, and county. In this paper, households participating in the survey in 2011, 2013, 2015, and 2017 are selected as the sample [9]. The eastern, central, and western regions are divided according to the National Development and Reform Commission's policy. The eastern provinces include Beijing, Tianjin, Hebei, Liaoning, Shanghai, Jiangsu, Zhejiang, Fujian, Shandong, Guangdong and Hainan. The western region includes Sichuan, Guizhou, Yunnan, Shaanxi, Gansu, Ningxia, and Qinghai. The central region includes Shanxi, Inner Mongolia, Jilin, Heilongjiang, Anhui, Jiangxi, Henan, Hubei, and Hunan [10].
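For the regional analyses below, each sampled household's province can be tagged with its region, for example with a helper like this sketch (the province lists follow the division above; the function itself is hypothetical):

    EAST = {"Beijing", "Tianjin", "Hebei", "Liaoning", "Shanghai", "Jiangsu",
            "Zhejiang", "Fujian", "Shandong", "Guangdong", "Hainan"}
    WEST = {"Sichuan", "Guizhou", "Yunnan", "Shaanxi", "Gansu", "Ningxia", "Qinghai"}
    CENTRAL = {"Shanxi", "Inner Mongolia", "Jilin", "Heilongjiang", "Anhui",
               "Jiangxi", "Henan", "Hubei", "Hunan"}

    def region_of(province: str) -> str:
        # Map a province to the paper's eastern / central / western split.
        if province in EAST:
            return "eastern"
        if province in WEST:
            return "western"
        if province in CENTRAL:
            return "central"
        return "other"

    print(region_of("Zhejiang"))  # eastern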

3.2 Model Setting and Variable Description

Many factors affect the allocation of residential households' financial assets, so factors with a high degree of influence are added as control variables to ensure the model's accuracy. A panel data model can better capture heterogeneity across time and space. To reduce the impact of heteroscedasticity on the empirical results, we take the logarithm of all variables. The final model is built as follows:

\ln investdiv_{i,t} = \delta_0 + \delta_1 \ln digital_{i,t} + \delta_2 \ln depth_{i,t} + \delta_3 \ln cover_{i,t} + \delta_4 \ln payment_{i,t} + \delta_5 \ln credit_{i,t} + \delta_6 \ln insur_{i,t} + \gamma X_{i,t} + \alpha_i + \alpha_t + \varepsilon_{i,t}    (1)

Among them, i and t denote province i and year t, respectively; ε_{i,t} is the random disturbance term, α_i the individual fixed effect, and α_t the year fixed effect. Investdiv is the explained variable, denoting the number of household financial asset types allocated, including demand deposits, time deposits, stocks, bonds, funds, financial derivatives, bank financial products, other financial products, gold, foreign investment products, and cash. The core explanatory variables are the total digital financial inclusion index (Digital), depth of use (Depth), breadth of coverage (Cover), payment index (Payment), credit index (Credit), and insurance index (Insur). Control variables X include the age of the household head (Age), education level (Edu), marital status (Marry: married 1, unmarried 0), gender (Gender: male 0, female 1), number of household members (Num), and presence of housing assets (House: yes 1, no 0).
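A minimal sketch of how model (1) could be estimated as a two-way fixed-effects panel regression is given below; it assumes the data have been assembled into a pandas DataFrame indexed by (province, year). The file name and column names are hypothetical, and the authors' exact estimator may differ.

    import pandas as pd
    from linearmodels.panel import PanelOLS

    # Hypothetical input file; the panel must be indexed by (entity, time).
    df = pd.read_csv("chfs_panel.csv").set_index(["province", "year"])

    # Model (1): two-way fixed effects via EntityEffects + TimeEffects.
    formula = ("lninvestdiv ~ 1 + lndigital + lndepth + lncover + lnpayment"
               " + lncredit + lninsur + lnedu + lnage + lnmarry + lngender"
               " + lnnum + lnhouse + EntityEffects + TimeEffects")
    res = PanelOLS.from_formula(formula, data=df).fit(
        cov_type="clustered", cluster_entity=True)
    print(res.summary)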

4 Empirical Tests
4.1 The Impact of Total Digital Inclusion Indicators on Household Financial
Asset Allocation Diversification

As can be seen from Table 1, the digital inclusive finance index and each control variable have significant effects on household financial asset allocation. (1) The popularity
of inclusive digital finance expands investment channels for residents, promoting their
understanding of financial information, and increasing household financial asset allo-
cation diversification. (2) The education level of household heads positively affects the
diversification of household financial asset allocation. The higher the education level
of the household head, the more financial information he gets and the more rationally he considers the risks of financial products. He can use digital inclusive finance channels
more rationally to purchase investment products and allocate household assets when
allocating assets. (3) The age of the household head has a weak effect on household financial asset allocation, which may be the net result of two offsetting effects. On the one hand, as the household head gets older, he acquires more financial knowledge, makes more sensible household investment decisions, and allocates assets more rationally. On the other hand, some older household heads do not keep up with the times because of low education levels, conservative thinking, and a lack of knowledge about Internet finance, and will prefer the single asset allocation method of traditional finance. (4) The increase
in household size has a negative effect. This may be due to the increase in the number
of teenagers and older adults in the household. Most of them are risk-averse and more
conservative in financial investment. It eventually makes the financial asset allocation
tend to be homogeneous. (5) Owning home equity plays a positive role. It may be that
households with housing assets do not need to set aside funds for house purchase or
rent. When basic living conditions are guaranteed, they are willing to put aside more
funds to make financial investments to enrich their household financial asset allocation.
Because of future uncertainty, households without housing need more precautionary savings and therefore have fewer funds for financial investment. (6) The gender of
the household head has a positive effect. Compared with male household heads, female
household heads tend to acquire multiple investment products when making financial
asset decisions. (7) Married heads tend to have more diversified financial assets than
unmarried heads.

Table 1. Test results of the impact of total digital financial inclusion indicators on the diversifi-
cation of household financial asset allocation

             lndigital  lnedu     lnage      lnmarry   lngender  lnnum       lnhouse
lninvestdiv  0.145***   0.025***  −0.006***  0.008***  0.054***  −0.0016***  0.001***
             (0.015)    (0.002)   (0.000)    (0.002)   (0.023)   (0.001)     (0.000)

4.2 The Impact of Digital Inclusive Finance Sub-indicators on Household


Financial Asset Allocation Diversification

The test results are shown in Table 2. Different dimensions of digital financial inclusion
development (breadth of coverage and depth of use) and different sub-indicators of busi-
ness types (payment, credit, and insurance) significantly influence household financial
asset allocation. The breadth of coverage refers to the number of users who register investment accounts and bind their physical bank cards through the websites. The greater the breadth of coverage, the more users are exposed to inclusive digital finance,
and the residents have more opportunities to try different kinds of investment products.
It will lead to a more rational allocation of residents’ financial assets and more diver-
sified investments. Depth of use has a greater impact on the number of financial assets
allocated than the breadth of coverage because the depth of use reflects actual usage
and transactions better. For residents, it is more effective to contact Internet financial
services than to accept financial publicity and promotion. Payments and credit indices
have a greater impact on asset allocation diversity than insurance. This may be because
payment and credit services are basic businesses among traditional financial institutions
with lower thresholds and higher coverage. As a more advanced and complex financial
service, insurance has higher business thresholds. These businesses allow residents to
have more liquid assets and increase the way money flows. They reduce the income
pressure caused by the consumption so that residents have more funds available for
household financial asset allocation.

Table 2. Test results on the impact of digital inclusive finance sub-indicators on the diversification
of household financial asset allocation

Index        Coefficient (SE)
lndigital    0.501*** (0.042)
lndepth      0.285*** (0.026)
lncover      0.142*** (0.016)
lnpayment    0.279*** (0.022)
lncredit     0.178*** (0.017)
lninsur      0.043*** (0.008)
F-statistic  2394.87 / 1397.2 / 878.9 / 1567.9 / 1067.3 / 856.4 for the six regressions, respectively
Dependent variable: lninvestdiv. Each index is entered in a separate regression (columns (1)–(6) in the original).

4.3 The Heterogeneous Impact of Digital Inclusive Finance on Household


Financial Asset Allocation in Different Regions
As shown in Table 3, the regression results for the eastern, central and western regions
show that the total and sub-indicators of digital inclusion positively impact household
financial asset allocation in all three regions. The results show that digital financial
inclusion has the greatest impact on the dispersion of household asset allocation in the
eastern region, followed by the western regions, and has the smallest impact on the central
region. In the eastern region, inclusive digital finance develops faster by relying on the
developed economy and active financial markets. It can provide residents with diversified
products and services with lower financial access barriers. The development of regional
finance has led to a gradual increase in the number of bank branches and financial
institutions around the community. Households gain more access to financial knowledge and a wider range of products and services, which increases their financial literacy and changes their risk appetite, helping households invest rationally and diversify their investment portfolios.
In contrast, the western region is relatively backward in terms of macroeconomic
and digital infrastructure development. The exclusion of financial products and services
is more serious, limiting the development of inclusive digital finance. However, because
of the sparse population, the advantages of inclusive digital finance can enable financial
resources to be fully shared and utilized. In addition, some supporting policies and
preferential credit policies for the western region have diversified residents’ investment
and financial management. Although the total financial investment in the western region
is still the lowest, the allocation of financial assets has been greatly enhanced with the
help of inclusive digital finance.

4.4 The Differential Impact of Digital Inclusive Finance on the Diversity


of Financial Asset Allocation of Urban and Rural Households
From the results in Table 4, the total digital inclusion index has a greater impact on the
diversity of financial asset allocation for urban households than rural households. This
may be because urban residents have greater access to inclusive financial services and
advocacy. As a result, it is easier for them to use new channels and inclusive digital
finance to acquire investment products. Easier access to relevant information makes
urban residents more willing to diversify their financial asset allocation. However, the
underdeveloped network in rural areas makes rural residents receive less publicity and
services of inclusive digital finance, and their channels for investment are also limited.
Because of insufficient financial knowledge, rural residents are reluctant to try more
kinds of investment products. Therefore, inclusive digital finance has a greater impact
on urban residents’ asset allocation.

4.5 Robustness Tests


To further demonstrate the reliability of the conclusions, this paper performs robustness tests by replacing the measured variable. Referring to the constructed financial asset diversity index, this paper sets the variable Investshare, defined below.

Table 3. The results of a test of regional heterogeneity in the impact of digital financial inclusion
on the diversification of household financial asset allocation

Eastern Coastal Region (R-squared 0.995; F-statistic 1489.78, Prob 0.000; Durbin-Watson stat 1.0536)
  Variable     Coefficient  Prob
  C            0.925        0.000
  lndigital    0.204***     0.012
  lndepth      0.187***     0.035
  lncover      0.387***     0.010
  lnpayment    0.235***     0.014
  lncredit     0.451***     0.002
  lninsur      0.063***     0.021
Central Inland Region (R-squared 0.990; F-statistic 1036.80, Prob 0.000; Durbin-Watson stat 1.2043)
  Variable     Coefficient  Prob
  C            1.457        0.000
  lndigital    0.044***     0.021
  lndepth      0.116***     0.010
  lncover      0.074***     0.002
  lnpayment    0.292***     0.036
  lncredit     0.203***     0.003
  lninsur      0.089***     0.024
Remote Western Region (R-squared 0.989; F-statistic 1676.972, Prob 0.000; Durbin-Watson stat 1.1536)
  Variable     Coefficient  Prob
  C            0.543        0.047
  lndigital    0.133***     0.023
  lndepth      1.082***     0.000
  lncover      0.035***     0.037
  lnpayment    0.126***     0.010
  lncredit     0.198***     0.012
  lninsur      0.097***     0.003

Table 4. The differential impact of digital inclusive finance on the diversity of financial
asset allocation of urban and rural households

Index               lndigital          _cons
Urban  lninvestdiv  0.431*** (0.040)   −1.221*** (0.003)
Rural  lninvestdiv  0.350*** (0.036)   −0.085*** (0.016)

The variable is based on the shares of each investment, in cash, demand deposits, time deposits, bonds, stocks, gold, funds, derivatives, foreign currency assets, bank financial products, and other financial products, in total household financial asset investment. For this robustness test, this paper replaces the indicator measuring household financial asset allocation dispersion; the model setup, control variables, and regression methods are identical to those in Sect. 3.2. The indicator is calculated as follows, where s_i denotes the share of asset i among the n asset types held:
Investshare = 1 - \sum_{i=1}^{n} s_i^2, \quad \text{if } n > 0    (2)
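A small sketch of the index computation (hypothetical holdings; the function name is ours):

    def invest_share(amounts):
        """Formula (2): 1 minus the sum of squared asset shares s_i, where
        `amounts` lists a household's holdings per asset class and n is the
        number of asset classes actually held. Returns None when n = 0."""
        held = [a for a in amounts if a > 0]
        total = sum(held)
        if not held:
            return None
        return 1 - sum((a / total) ** 2 for a in held)

    print(invest_share([5000, 3000, 2000]))  # ≈ 0.62 for shares 0.5 / 0.3 / 0.2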

The test results are shown in Table 5 and Table 6.

Table 5. Robustness regression results

               lndigital  lnedu      lnage     lnmarry   lngender   lnnum     lnhouse
lninvestshare  −0.002***  −0.035***  0.001***  0.005***  −0.003***  0.006***  −0.003***
               (0.003)    (0.001)    (0.000)   (0.007)   (0.002)    (0.001)   (0.002)

Table 6. Robustness regression results

Index        Coefficient (SE)
lndigital    0.501*** (0.042)
lndepth      0.285*** (0.026)
lncover      0.142*** (0.016)
lnpayment    0.279*** (0.022)
lncredit     −0.006*** (0.003)
lninsur      −0.001*** (0.000)
Dependent variable: lninvestshare. Each index is entered in a separate regression (columns (1)–(6) in the original).

Given the same asset types, a larger difference between asset shares indicates that the household holds a high proportion of certain financial assets; a smaller difference indicates that the household diversifies across multiple assets when allocating financial assets. The test results shown in Table 5 and Table 6 support the conclusions. Investshare decreases as the diversification of household financial asset allocation increases, so the negative coefficient indicates that an increase in digital inclusion lowers Investshare; that is, the total digital inclusion index has a positive effect on the diversification of household financial asset allocation.

5 Conclusion
This paper uses a dataset formed by data from the China Household Finance Survey
(CHFS) and the Digital Finance Research Center of Peking University to build a fixed-
effects model. This paper studies the impact of digital inclusion on regional differences
in household financial asset allocation diversity through empirical testing. The study
concludes:

– (1) Inclusive digital finance development positively promotes the diversification of household financial assets. Moreover, the total indicator of digital inclusive financial development and its different sub-indicators all have a significant positive impact on financial asset allocation behavior. Among the sub-indicators, the depth-of-usage index has the largest coefficient and the insurance index the smallest.
– (2) The results of the heterogeneity analysis by region show that digital inclusive finance has the greatest effect on household financial asset allocation in the eastern region and the smallest effect in the central region.

The main reason for this result is the different levels of regional economic devel-
opment and financial market sophistication. Based on the above research results, this
paper proposes the following suggestions: First, relying on the current financial institu-
tions that already exist, we should accelerate the introduction of inclusive digital finance
and upgrade various dimensions of inclusive digital finance. Second, the government
and companies should step up their efforts to promote digital financial inclusion and
enhance residents’ financial literacy and Internet thinking. Third, it is necessary to adapt
to local conditions when developing inclusive digital finance. For the western region,
the development of inclusive digital finance should focus on matching infrastructure
construction and the symmetry of financial products and services. For the eastern region
with a higher degree of financial development, the regulator should strengthen financial
supervision and create a fair financing environment.

References
1. Tang, Y., Long, Y., Zheng, Z.: Study on the inclusive economic growth effect of digital inclusive finance: an empirical analysis based on 12 provinces in Western China. Southwest Finan. 41(09), 60–73 (2020)
2. Wang, Y., Tan, Z., Zheng, L.: Study on the impact of digital inclusive finance on social security. J. Quant. Tech. Econ. 37(07), 92–112 (2020)
3. Ding, R., Liu, R., Zhang, Q.: Research on the impact and mechanism of digital inclusive
finance on service industry development an empirical analysis based on inter-provincial panel
data. J. Finan. Econ. 40(7), 4–10 (2019)
4. Tian, J., Ma, X.: Analysis of the effect of digital inclusive finance to promote agricultural
transformation and upgrading–empirical evidence based on inter-provincial panel data. Credit
Refer. 38(07), 87–92 (2020)
5. Wang, S., Yang, B.: The marriage of rural finance and internet finance: impact, innovation,
challenges and trends. Rural Finan. Res. 38(08), 19–24 (2017)
6. Hang, Y., Huang, Z.: Digital financial development in China: present and future. China
Econom. Q. 17(04), 1489–1502 (2018)

7. Deng, S.: A Study on the impact of digital inclusive finance on SME financing. Think Tank
Era. 3(37), 62+82 (2019)
8. Corrado, G., Corrado, L.: Inclusive finance for inclusive growth and development. Curr. Opin. Environ. Sustain. 24, 19–23 (2017)
9. Zhou, Y., He, G.: The impact of digital inclusive finance development on financial
asset allocation of farm households. Mod. Econom. Sci. 42(03), 92–105 (2020)
10. Jiang, S., Zhao, X.: The influence of shadow banking on China’s credit transmission
mechanism. Bus. Econom. 40(08), 166–169 (2021)
Research on the Countermeasures
for the Development of Agricultural and Rural
Digital Economy

Bai Ying1(B) and Jiao Jinpeng2


1 School of Economics, Harbin University of Commerce, Harbin 150028, Heilongjiang, China
2 MBA, MPA Center, Northeast Forestry University,

Harbin University of Commerce, Harbin 150028, Heilongjiang, China

Abstract. The digital economy has become an important means to promote the
high-quality development of agriculture and realize the modernization of agricul-
ture. It plays a significant role in digital technology empowerment, inclusive sharing, and innovation promotion. Based on research on the development status of the agricultural and rural digital economy, this paper finds that the development of the agricultural and rural digital economy in China faces problems such as imperfect communication infrastructure construction, the low utilization rate of farmers' information, and inadequate and unbalanced development of the digital agricultural economy, which seriously restrict the sustainable development of the agricultural and rural digital economy. Therefore, it is proposed to strengthen the construc-
tion of digital infrastructure and the development and promotion of agricultural
information platforms, and accelerate the development of digitalization of the agri-
cultural industry, so that the new round of scientific and technological revolution
can better benefit agriculture and rural farmers, and realize the modernization of
agriculture and rural development.

Keywords: Digital economy · Rural economy · High-quality development

1 Introduction

Since the beginning of the new century, the Party’s No. 1 Document has focused on the
issues of agriculture, rural areas and farmers for 18 consecutive years. The moderniza-
tion of agriculture can effectively help the country to modernize. The themes of agricultural and rural development have differed across periods: from the initial improvement of comprehensive production capacity and the construction of agricultural infrastructure, to the overall planning of urban and rural development, to the construction of modern agriculture incorporating more technical elements, to the start of the comprehensive construction of a well-off society, in which agricultural and rural development is listed as a development priority. In the new era, a variety of factors have been incorporated to accelerate the modernization of agriculture and rural areas.


According to the report of the “White Paper on China’s Digital Economy Devel-
opment (2020)” released by the China Academy of Information and Communications
Technology, the value-added of the digital economy reached 35.8 trillion yuan by the end of 2019, and the proportion of the digital economy in GDP rose to 36.2%. The development of the digital economy has become an important driving force in many fields: the integration of the rural economy with modern information technology uses the advantages of knowledge and information to generate significant spillover effects in agriculture, plays an inclusive sharing role, energizes the domestic agricultural industry [1], effectively stimulates economic growth, and realizes industrial transformation and upgrading. The digital economy achieves
the high-quality development of the rural economy by innovating the agricultural devel-
opment model, breaking the dual barriers between urban and rural areas, maintaining
the stability of agricultural production, and promoting the sustainable development of
the agricultural economy [2]. In 2019, China issued the “Digital Village Development
Strategy Outline” to coordinate the development of digital villages, build smart cities,
and promote the construction of digital villages. In the No. 1 document of the Central
Committee of 2021, it is pointed out that at the beginning of China’s “14th Five-Year
Plan", the construction of a socialist modern country is to begin, and the construction of digital villages is an important breakthrough point for rural revitalization.
Agricultural and rural digital economy refers to advanced digital information tech-
nology to promote agricultural economic development in rural areas based on upgrading
digital infrastructure [3]. Regarding research on the agricultural and rural digital economy, the existing literature mainly discusses the integration mechanism of the digital economy and rural industry [4], farmers' participation in the agricultural and rural digital economy [5], the future development trend of the agricultural and rural digital economy, etc. [6]. However, few scholars have analyzed the current situation and existing problems of developing China's agricultural and rural digital economy. After China
entered the new era, it strengthened the construction of information infrastructure to
realize networked operation and production. It now has the conditions to implement the
construction of digital villages [7]. However, there are still a series of problems that need
to be solved urgently to develop China’s agricultural and rural digital economy. There-
fore, this article combs the current status and problems of agricultural and rural digital
economy development and puts forward reasonable suggestions to provide a reference
for further research on the development of the agricultural and rural digital economy.

2 Development Status of Agricultural and Rural Digital Economy


2.1 Development of Digital Infrastructure
Infrastructure construction is the foundation for developing agricultural and rural dig-
ital economies, and it determines whether the digital economy can develop smoothly.
It mainly relies on computer networks and telecommunications equipment to provide
guarantees for the development of the digital economy [8]. As shown in Fig. 1, the Internet penetration rate in rural China has increased year by year, exceeding half by 2020. By the end of 2020, one third of the telecommunications universal service pilot tasks had been deployed in poor villages, exceeding the "Thirteenth Five-Year Plan" target and achieving broadband network coverage of more than 90% of poor villages.


Fig. 1. Internet penetration rate in rural areas (Data source: China Internet Network Information
Center)

2.2 Construction of Government Public Agricultural Information Platform


Government departments provide authoritative and comprehensive agriculture-related information through platform construction, supplying farmers with the information needed in agricultural production and sales, including variety selection, planting technology, sales prices and channels, product quality inspection, and climate change, so as to make agricultural information effective, transparent, and technology-enabled. The Chinese government has established many public agricultural information platforms, such as that of the Ministry of Agriculture and Rural Affairs of the People's Republic of China, containing agricultural information such as agricultural policies, agricultural product prices, and agricultural technology popularization. The agricultural sector has also launched apps to complement its online websites, such as the "New Type of Direct Reporting System for Agricultural Business Main Body Information" developed and constructed by the Ministry of Agriculture, from which services such as policy, finance, and information can be quickly obtained.

2.3 Agricultural and Rural Digital Economy Industry Development


The development of the digital economy has promoted the transformation of the agricultural industry and changed the entire process of traditional agricultural products from production to sales. New business models such as custom agriculture, creative agriculture, and cloud farms are booming, and "Internet +" agricultural social services are developing rapidly. Taking the development of rural e-commerce as an example, rural e-commerce reduces information asymmetry in the circulation of agricultural products and achieves cost reduction, efficiency gains, and income growth. As shown in Fig. 2, relying on the digital economy, China's rural e-commerce has risen rapidly in just a few years, with sales growing from 0.35 trillion yuan in 2015 to 1.79 trillion yuan in 2020, opening up new sales channels for agricultural products and enabling the optimization and upgrading of the agricultural industry structure.


Fig. 2. Rural online retail sales (Data source: Ministry of Commerce)

3 Problems in the Development of Agricultural and Rural Digital


Economy
3.1 Incomplete Communication Infrastructure
According to the 47th "Statistical Report on Internet Development in China", as of the end of 2020 the rural network coverage rate was 55.9%, far lower than the urban rate of 79.8%; rural network infrastructure construction needs to be further strengthened. At the same time, with the development of network information technology, the rural population's demand for the Internet is increasing, yet rural networks have fewer base stations, poorer signals, easily damaged equipment, and fewer communication maintenance personnel, so supply cannot meet farmers' needs. Rural communication network infrastructure requires substantial funds, and in remote, sparsely populated areas the low return on investment makes it difficult to attract operators to invest.

3.2 Low Utilization of Farmers’ Information


Firstly, farmers are restricted by their level of education. Data from the end of 2019 show that only 11.2% of rural household heads have a high school education, and the rural high school completion ratio of 80.84% is lower than the national ratio of 89.32%. The low level of education hinders farmers from learning and using information. Secondly, reliance on traditional production and sales experience blocks information and prevents it from being transmitted in time. Farmers have had little time to learn to use the Internet and know little about agricultural information websites; they obtain information on agricultural production mainly through traditional channels such as television media, acquaintances, and nearby businesses, and the transmission and acquisition of information depend on interpersonal communication and exchange. There is a lack of awareness of agricultural information websites and a lack of guidance from professional talents. Thirdly, the coverage of intelligent communication equipment is low, restricted by population size, population distribution, and the level of economic development.

3.3 The Unbalanced and Inadequate Development of the Agricultural Digital


Economy Industry
There are big differences in the development level of agricultural digital economy industries across provinces. As a typical new business model of the digital agricultural economy, rural e-commerce representatively reflects the development level of the agricultural digital economy industry. As can be seen from Fig. 3, the number of Taobao villages differs greatly across provinces. The eastern region developed earlier and relatively maturely, and its number of Taobao villages is much higher than that of the central and western regions, while large agricultural provinces such as Heilongjiang and Sichuan have far fewer Taobao villages than other provinces. The development of rural e-commerce needs further promotion.

Fig. 3. Number of Taobao villages in each province (Data source: Ali Research Institute)

Digital talent training is imperfect. The use of information technology poses a certain threshold and difficulty for farmers, and the serious aging of the rural population deepens the difficulty of training. Training usually takes the form of large-class lectures and lacks pertinence for farmers of different education levels, age groups, and production capabilities and scales, so the actual application effect is poor.
The construction of rural logistics needs to be further strengthened. Due to geo-
graphical conditions, the logistics infrastructure in some areas is very scarce, and the
high logistics costs have seriously hindered the development of rural e-commerce. With
the rapid development of e-commerce, third-party logistics has also developed. How-
ever, in some remote areas, especially the western region, the low return on investment in
logistics construction hinders the extension of third-party logistics transportation routes
to rural areas due to the vastness and sparse population.

4 Countermeasures to Develop Agricultural and Rural Digital


Economy

4.1 Strengthen the Construction of Digital Infrastructure

To develop the agricultural and rural digital economy, it is necessary to lay a solid foundation of digital infrastructure. The government should play a leading role and encourage operators to carry out digital infrastructure construction through cooperation and subsidies to make up for its shortcomings. Priority should be given to basic communication network construction in remote and backward areas to ensure that farmers can access the basic digital facilities needed for daily life. Then, according to the development potential and positioning of different regions, implement the policy of piloting first and then promoting. In-depth study of the application of big data, cloud computing, and the Internet of Things in
study of the application of big data, cloud computing, and the Internet of Things in
agricultural products’ production and sales process to ensure efficient use of resources
and realize smart agriculture. According to the basic conditions of the residents, such
as the number of residents, age, and labor capacity, guide farmers to change and merge
villages, increase population concentration, and avoid new “digital divides.” Maximize
the sharing range of digital infrastructure, ensure the timeliness and economy of digi-
tal infrastructure, avoid waste of resources and funds caused by repeated construction,
and establish and improve construction standards to avoid easy damage due to quality
reasons.

4.2 R&D and Promotion of Agricultural Information Platform


Realize the research and development of the agricultural big data platform with the
government as the mainstay and the enterprise as the supplement. Build an agricul-
tural big data platform through the timely collection and monitoring of data to achieve
pre-production analysis and in-production operations, optimize the selection of seeds,
cultivate seeds, and reduce production costs. Encourage enterprises to launch service
agricultural information platforms, providing agricultural-related services such as con-
sulting services for farmers, agricultural production services, fertilizer, seeds, and pes-
ticide sales services. It can be promoted to the rural grassroots in cooperation with the
government or news media to promote competition in agricultural-related industries and
promote the economic development of various industries.

Improve farmers' mastery of digital infrastructure, increase their awareness of the digital economy, and raise their acceptance and discernment of digital products to promote better use of agricultural information platforms. In addition, the government should actively build a community service platform for farmers that speeds up the exchange of information among farmers and breaks their dependence on acquaintances; this can effectively accelerate the exchange and dissemination of agricultural information, exert spillover effects, and promote farmers' acquisition and utilization of agricultural knowledge and technology.

4.3 Accelerate the Development of Digitalization of the Agricultural Industry

Accelerate the training of digital talents, build a complete and targeted digital talent training system, and provide agricultural digital production and sales services. Carry out classified, batched training for different age groups and education levels, provide professional guidance on Internet operation technology, ensure that what is learned can be applied, and effectively drive other farmers to break with traditional sales methods and participate in new agricultural business models. For areas where new agricultural businesses develop slowly, speed up the introduction of digital talents to fill the gap and accelerate the realization of new rural business development models.
Accelerate the digital construction of rural logistics. Use big data and other technologies to scientifically analyze the geographic distribution of rural residents, establish the optimal distribution network, monitor the entire logistics process, and improve circulation efficiency. Agricultural products are easily damaged and perishable during transportation, so the construction of rural cold chain logistics should be strengthened to preserve the value of agricultural products through cold chain transportation. Strengthen cooperation with third-party logistics and reach long-term contracts with them in order to reduce transportation and search costs.
Develop a multi-format agricultural and rural digital economy. The government pro-
vides key support for areas with abundant agricultural products resources and lagging
rural e-commerce development, adopting a “scaled” policy guidance method, coordi-
nating financial, fiscal, and tax incentives, and lowering the entry threshold of the rural
e-commerce industry. Accelerate the construction of the agricultural industry chain,
accelerate the development of the processing chain, service chain, and functional chain
in the agricultural industry chain, cross-industry integration and innovation of the agri-
cultural industry chain integration model, and extend the industry chain to the front and
back ends. For example, the front-end variety cultivation, the back-end use short video,
live broadcast, and other forms for promotion and sales. Build agricultural digital indus-
trial parks, improve agricultural production, picking, sorting, and packaging standards,
give play to the agglomeration effect of digital agricultural development, and realize the
spillover of knowledge and technology.

Acknowledgment. This work was supported by the grants of the Heilongjiang Province Phi-
losophy and Social Science Research Planning Project “Study on Symbiosis of Entrepreneur-
ship Ecosystem” (17GLE300) and Harbin University of Commerce “Young Innovative Talents”
Support Program Project (2020CX18).

References
1. Cai, Z.Z., Su, X.D.: The opportunities, obstacles and strategies of rural economy in the field
of digital economy. Agric. Econ. 07, 35–37 (2021)
2. Qi, W.H., Zhang, Y.J.: Promote the high-quality development of rural economy with digital
economy. Theoret. Explor. 03, 93–99 (2021)
3. Mu, J., Ma, L.P.: Measurement and regional differences of China’s agricultural and rural digital
economy development index. J. South China Agric. Univ. (Soc. Sci. Edn). 20(04), 90–98 (2021)
4. Kou, S.: Study on the effective integration of China's rural digital economy and
agricultural economy under the guidance of technology. Agric. Econ. 06, 12–14 (2021)
5. Su, L.L., Peng, Y.L.: A study on the evaluation of farmers’ practice participation and driving
factors from the perspective of digital village construction. J. Huazhong Agric. Univ. (Soc. Sci.
Edn.). 5, 1–12 (2021)
6. Liang, B., et al.: Discussion on the development trend of large-scale digital agriculture in rural areas: taking Xinjiang Production and Construction Corps as an example. Agric. Econ. 12, 36–38 (2020)
7. Yang, B.Z., Wu, M.Q.: Advantages, Difficulties and ways of rural economic development under
the background of digital rural strategy. Agric. Econ. 2021(07), 38–40 (2021)
8. Zou, H.: The dilemma of the development of rural digital economy and its solution. Agric.
Econ. 02, 46–47 (2021)
Research on the Countermeasures of Digital
Intelligence Transformation and Upgrading
of Innovative Manufacturing Enterprises

Yanlai Li1(B) , Lili Zhai1 , Caixia Yang1 , and Kewen Liu2


1 Harbin University of Technology, Harbin, Heilongjiang, China
2 Harbin University of Commerce, Harbin, Heilongjiang, China

Abstract. In China, the rapid development of the digital economy has brought
transformation opportunities to innovative manufacturing enterprises. New-
generation information technologies such as big data, 5G, and artificial intelligence
provide powerful tools for transforming digital intelligence and upgrading inno-
vative manufacturing enterprises. In this paper, the concept and characteristics of
innovative manufacturing enterprises are introduced, and the economic transfor-
mation process of innovative manufacturing enterprises is divided into five main
stages. Secondly, from the data-driven perspective, starting from the development
needs of innovative manufacturing enterprises, a digital intelligence platform and
data mining framework are built for them. Finally, the transformation counter-
measures of innovative manufacturing enterprises are given. This paper helps innovative manufacturing enterprises scientifically complete the transformation and upgrading from digitization to digital intelligence and provides a new perspective and new ideas for the deep integration of the digital economy and the real economy.

Keywords: Data-driven · Innovative manufacturing enterprises · Digital


intelligence transformation

1 Introduction

New-generation information technologies such as big data, 5G, and artificial intelligence
are opening a new era of the digital economy. Building a digital China is a new national
informatization development strategy in the new era. The rapid development of the digital
economy injects new power into China’s economy, brings revolutionary influence to
the innovative manufacturing industry, and provides unprecedented opportunities for its
development and upgrading from digitization to digital intelligence. The 14th Five-Year Plan for National Economic and Social Development of the People's Republic of China and the Outline of Long-Term Objectives for 2035 [1] point out that the optimization and upgrading of the manufacturing industry should be promoted, the manufacturing industry should move toward the high end and intelligence, and intelligent manufacturing should be deeply implemented.


Data is the fifth major factor of production after land, labor, capital, and technology. According to IDC (2018), more than 70% of China's top 1000 enterprises take digital transformation as their strategic core, and IDC expects that 65% of China's GDP will be related to data by 2022 [2]. The digital transformation and upgrading of innovative
manufacturing enterprises in China have achieved phased results. However, there are
still many problems in high-quality economic development, such as insufficient digital
service capacity of industrial supporting facilities, lack of data integration and sharing
platform, lack of high value-added data mining solutions, etc. Making full use of big
data technology and data mining methods to build a value creation system with data as
the core driving factor and realize the leap from digitization to digital intelligence is the
inevitable trend of developing innovative manufacturing enterprises in today’s digital
economy.
At present, research on digital intelligence transformation abroad is relatively mature. The research content covers all aspects of intelligent manufacturing, showing a multi-perspective, dynamic, and interdisciplinary trend. Research methods have gradually shifted from qualitative approaches, such as early concept elaboration and theoretical discussion, to experimental and quantitative approaches, such as computational simulation, data investigation, and case study. Foreign research hotspots on digital intelligence transformation mainly focus on intelligent design [3], intelligent production [4], intelligent manufacturing service [5], and intelligent management [6].
Compared with foreign countries, the research on digital intelligence transformation
in China started later. Domestic-related research is mainly reflected in the following
three aspects.
First, research on economic digitization and digital intelligence transformation.
Wang W. [7] proposed that enterprise digital intelligence is based on connection and
data, and intelligence is the final value of digital intelligence. Wang J. [8] shared the
understanding of digital transformation from four dimensions: main motivation, basic
characteristics, path analysis, and intelligent technology. Chen J. [9] pointed out that
China must rely on big data to realize innovation-driven, structural adjustment and
industrial integration in order to promote economic transformation and upgrading under
the new normal.
Second, the economic transformation and upgrading of manufacturing enterprises.
Zhang L. [10] pointed out that data resources have become a key factor in the man-
ufacturing industry’s digital transformation. Jiang J. [11] thought that manufacturing
enterprises should track the latest science and technology trends and use big data to
realize intelligent decision-making. Li L. et al. [12] thought that intelligent manufactur-
ing, as the main direction of manufacturing in China from large to strong, is a strategic
choice for China to comply with a new round of scientific and technological revolution
and industrial reform, reshape new advantages in manufacturing development and build
an innovative country.
Third, the application of big data technology and data mining methods in the manu-
facturing industry. Pan et al. [13] established an analysis model of influencing factors
of shipbuilding cost to achieve the purpose of cost control. Wei et al. [14] proposed a
product process adaptive design method to solve insufficient data utilization in manufac-
turing enterprises. Guo et al. [15] constructed a sustainable quality control mechanism
for the heavy vehicle production process using the improved turtle chart and VDA
evaluation model. Zhou et al. [16] proposed a method based on incomplete orthogo-
nal specified element method and SVM information to diagnose concurrent faults of
refrigeration equipment. Ma et al. [17] analyzed the research methods of multi-criteria
decision model and classification model in supplier portrait.
As mentioned above, academia has produced rich research achievements in economic
digitization and digital intelligence transformation, manufacturing transformation and
upgrading, and the application of big data technology and data mining methods in man-
ufacturing. However, most data-driven digital intelligence transformation applications
remain at the macro level, and there are few reports applying specific big data and
data mining models and methods to the digital intelligence transformation and upgrading
of the innovative manufacturing industry. Therefore, from a data-driven perspective and
starting with the development characteristics and needs of the innovative manufacturing
industry, this paper builds a digital intelligence platform based on big data processing
technology and data mining methods, analyzes the corresponding digital intelligence
upgrading countermeasures, and helps the innovative manufacturing industry realize the
upgrading from digitization to digital intelligence. This has theoretical and practical
significance for exploring the digital intelligence upgrading countermeasures of
data-driven innovative manufacturing enterprises and promoting the deep integration
between the digital and real economies.

2 Characteristics, Transformation Process and Development Needs of Innovative Manufacturing Enterprises

2.1 Concept and Characteristics of Innovative Manufacturing Enterprises

An innovative enterprise is an innovative organization: an enterprise that has strong
innovation vitality in system, management, knowledge, technology, and culture, holds
the advantages of key technologies and intellectual property rights in its industry, and
can respond sensitively to changes in the market environment [18]. Combined with the
situation and characteristics of today's economic development, innovative manufacturing
enterprises can be defined as manufacturing enterprises that own independent
intellectual property rights, well-known brands, and representative products, possess
relatively strong international comprehensive competitiveness, rely on technological
innovation to obtain a competitive market advantage, and show good growth and
sustainable development.
An innovative manufacturing enterprise often has the following characteristics:

(1) It has its own R & D brands and representative products;
(2) It has a strong, dedicated R & D team and a certain degree of innovation;
(3) It has a large business scale and good enterprise growth;
(4) It has a good market position and plays a leading role in the industry.

2.2 The Process of Economic Transformation and Upgrading of Innovative Manufacturing Enterprises

In general, the economic transformation and upgrading of innovative manufacturing
enterprises has gone through five main stages, whose surf model is shown in Fig. 1.

Fig. 1. Surf model of economic transformation model of innovative manufacturing enterprises

(1) Automation stage (started around 1990): the idea of informatization is put forward,
which is mainly characterized by office automation and the installation of various
management software;
(2) Online stage (started around 2005): Internet technology has been developed, mainly
characterized by the development of e-commerce, social networks, and online
payment;
(3) Cloud stage (started around 2015): cloud computing technology tends to mature,
mainly characterized by moving infrastructure and software facilities to the cloud. At
this stage, the concepts of the data middle platform and the AIoT middle platform are
put forward, the transformation from business data to data-driven business begins, and
mobile payment develops rapidly;
(4) Two-wheel drive stage (started around 2019): the development stage of the digital
economy, which is mainly characterized by the digital pull of the consumer end,
the Digital Collaborative linkage of the supply end, and the two-wheel-drive of
business and technology;
(5) Digital intelligence stage (started around 2020): it is characterized by taking con-
sumers as the core, giving full play to big data technology and artificial intelligence
technology, and making the whole link and life cycle intelligent from supply end
to consumption end.

2.3 Development Needs of Innovative Manufacturing Enterprises

Continuous innovation is the power source driving the development of enterprises, and
the primary demand for the development of innovative manufacturing enterprises is
continuous innovation.
Taking Midea Group, a well-known domestic enterprise, as an example, Midea Group
was established in 1968. After 52 years of development, it has become a mature global
technology group with digitization and intelligence.
In terms of marketing, channel stuffing often occurs in the household appliance indus-
try, resulting in a large backlog of inventory in the channel, especially in the seasonal
category of air conditioning. Midea Group has changed the traditional layered
distribution mode through digital technology and innovated the "T + 3" mode: starting
from the sales order (T), production proceeds through the three cycles of material
procurement, manufacturing, and delivery, so the whole process determines production
by sales. Bringing product planning, marketing management, procurement preparation,
manufacturing, and logistics into a unified mainline of operation promotes the
coordinated operation of the internal value chain, reduces inventory, shortens the
order delivery cycle, and responds quickly to market changes and differences. After
the "T + 3" conversion, Midea organizes production only after receiving orders, thus
eliminating the inventory backlog of the channel.
In terms of R & D, in order to improve the user experience, Midea has developed a
high-speed network-provisioning technology, which simplifies the original seven-step
connection to three steps and takes the lead in reducing the time from provisioning a
device to a successful connection to less than 5 s. At the same time, Midea has greatly
lowered the threshold for users to use smart appliances through the "smart touch"
intelligent function, which realizes automatic network provisioning and one-key control.
In terms of management, Midea’s daily production plan involves nearly 2 million
kinds of materials. The amount of each material and who produces it must be arranged
in advance. These arrangements used to take at least 10 h. In 2020, Midea optimized the
material planning algorithm, and the material arrangement took only one hour, saving
90% of the time.
It can be seen that for innovative manufacturing enterprises, digital intelligence is
the inevitable path of innovation. The basis of digital intelligence is data, and the
means is data processing technology. The goal is to improve enterprise efficiency,
identify profit points, and enhance industry competitiveness.

3 Digital Intelligence Platform Architecture of Innovative Manufacturing Enterprises

3.1 Construction of Digital Intelligence Platform

Based on the data-driven concept and big data technology, this paper has built a digital
intelligence platform for innovative manufacturing enterprises. The general architecture
is shown in Fig. 2. The whole platform architecture consists of physical resource layer,
data source layer, big data processing layer, task management layer, data analysis layer
and decision analysis layer.

Fig. 2. Digital intelligence platform for innovative manufacturing enterprises

(1) Physical resource layer mainly includes the underlying physical devices, such as
some distributed computer clusters.
(2) Data source layer: innovative manufacturing enterprises have accumulated a large
amount of data in production, sales, and management. These data may be stored in
traditional databases, files, data warehouses, or distributed file systems such as HDFS.
(3) Big data processing layer: two major problems must be solved for massive data.
First, big data storage can be handled through HDFS, HBase, cloud databases, and
NoSQL databases. Second, the distributed processing of big data can be completed
through MapReduce, Spark, etc.
(4) Task management layer: the task management layer is the core of the platform.
It connects the analysis functions with the background cluster to ensure that the
platform supports easy algorithm scaling, task flow scheduling, and rational
allocation of computing and storage resources.
(5) Data analysis layer: provides specific data analysis and processing tasks, including
exploratory data analysis, real-time data query, and data mining tasks.
(6) Decision analysis layer: interpret, evaluate and make decisions according to the
results of the data analysis layer.

This hierarchical architecture fully considers the distributed storage and processing
of massive data, the integration of different data mining algorithms, the configuration
and scheduling of multiple data processing tasks, and the support of data analysis results
for intelligent decision-making.

3.2 Specific Implementation Scheme of Data Mining in Innovative Manufacturing Enterprises

In the platform shown in Fig. 2, data mining is the core and key to the digital intelligence
of the innovative manufacturing industry; its application framework is shown in Fig. 3.

Fig. 3. Application framework of data mining in innovative manufacturing industry



The specific model and method implementation scheme is as follows (a sketch of item (1) is given after the list):

(1) Decision trees, neural networks, and support vector machines are used for cost
analysis and management.
(2) Multiple regression and model trees are used for product design.
(3) The methods of correlation analysis, neural network, and random forest are used
for production process management.
(4) Frequent pattern mining and multi-criteria decision trees are used for logistics and
supply chain management.
(5) Rough set, fuzzy logic, and other methods are used for intelligent decision support
management.
(6) Cluster analysis, correlation analysis, and other methods are used for customer
relationship management and sentiment analysis.
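To make the scheme concrete, the following is a minimal Python sketch of item (1), a decision-tree model for cost analysis. The feature names, the synthetic data, and the choice of scikit-learn are illustrative assumptions, not part of the platform described above.

```python
# A minimal sketch of item (1): decision-tree cost analysis.
# All features and data below are hypothetical stand-ins for enterprise data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical cost drivers: material price, labor hours, machine hours, defect rate.
X = rng.random((500, 4))
# Hypothetical production cost with noise.
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + 0.8 * X[:, 2] + rng.normal(0, 0.05, 500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
print("Feature importances:", model.feature_importances_)
```

The feature importances reported by such a model indicate which cost drivers dominate, which is the kind of insight the decision analysis layer would consume.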

4 Management Countermeasures for Digital Intelligence Transformation and Upgrading of Innovative Manufacturing Enterprises
China is a large manufacturing country. "Made in China" is going global and becoming
the most important link in the global industrial chain. At the same time, however, China
is not yet a manufacturing powerhouse, and a gap remains compared with developed
countries. Facing a fiercely competitive international environment, we still need to
further improve the scientific and technological content of "China Intelligent Manufacturing"
through digital upgrading. The transformation and upgrading of digital intelligence is
the objective need to comply with the new round of global scientific and technological
revolution and industrial reform and the internal demand of manufacturing enterprises
to achieve high-quality economic development.
Innovative manufacturing enterprises have achieved remarkable results in digital
transformation, but the innovation process is not smooth. The resistance and difficulties
they face are mainly reflected in the lack of talents, insufficient innovation power, lack
of core technology, insufficient R & D investment, inadequate patent protection, and
insufficient combination of big data technology and data mining models and methods.
Therefore, the realization of digital intelligence of innovative manufacturing
enterprises should start from the following aspects:

(1) Strengthen the training of data analysis talents. Enterprises can join hands with
colleges and universities to cultivate high-quality data analysis talents, who can then
be deployed to all departments and links of the enterprise.
(2) Increase R & D investment; independent R & D can be combined with foreign
introduction to create core products and leading brands and occupy the market.
(3) Build a technological innovation system combining industry, universities, and
research institutes, so that the enterprise achieves growth and sustainable development.
(4) With the help of a new round of high-tech wave, combined with the big data plat-
form and data mining framework built in this paper, accelerate and expand the
application of big data technology and data mining methods, apply them in all links
of production, sales and supply, improve the production efficiency of enterprises
and enhance their industry competitiveness.
(5) Strengthen the protection of enterprise intellectual property rights.

5 Conclusions

This paper analyzes the digital intelligence transformation needs of innovative manufac-
turing enterprises in today's economic environment and studies their transformation
countermeasures. Starting from the characteristics of the innovative manufacturing
industry and its digital intelligence needs, it builds a data-driven digital intelligence
platform, provides a data mining framework, explores the digital intelligence
implementation countermeasures of innovative manufacturing enterprises from the
perspective of specific implementation, opens up the data management mechanism through
multi-party co-governance, and provides effective solutions for the successful scientific
transformation of innovative manufacturing enterprises.
There is a long way to go in upgrading to digital intelligence. Innovative manufacturing
enterprises should give full play to their leading role in transformation, take the lead
in realizing digital intelligence, and lead other manufacturing enterprises in jointly
promoting the development of the data-driven digital economy, turning intelligence into
productive energy, and building China into an intelligent manufacturing power as soon
as possible.

Acknowledgment. This work was supported by the grants of the Heilongjiang Social Science
(Evolution path and development countermeasures of big data industry ecosystem in Heilongjiang
Province: 20JYB037).

References
1. Chinese government network: The 14th five year plan for national economic and social devel-
opment of the People’s Republic of China and the outline of long-term objectives for 2035
(2021). http://www.gov.cn/xinwen/2021-03/13/content_5592681.htm
2. CCID think tank. Research on digital transformation of traditional enterprises in the devel-
opment of new business formats and models of digital economy. China Information Weekly
(2020)
3. Cao, Y., Jiang, H.: Comparative analysis of China’s equipment manufacturing enterprises
and world-class enterprises based on case study. In: 2021 2nd International Conference on
E-Commerce and Internet Technology (ECIT) (03), pp. 79–82 (2021)
4. Dumitrache, I., Caramihai, S., Stanescu, A.: From mass production to intelligent cyber-
enterprise. In: 19th International Conference on Control Systems and Computer Science
(CSCS) (05), pp. 399–404 (2013)
5. Mejía, G., Lefebvre, D.: Robust scheduling of flexible manufacturing systems with unreliable
operations and resources. Int. J. Prod. Res. 58(21), 6474–6492 (2020)
6. Serrasqueiro, M., Martins, R., Ferreira, H.: Modeling system based on machine learning
approaches for predictive maintenance applications. KnE Eng. 06, 857–871 (2020)
7. Wang, W.: Digital intelligence drives business innovation. Manager 5(322), 10–12 (2021)
8. Wang, J.: Analysis on the regular characteristics and development path of digital transforma-
tion. In: Huawei Ecological Conference, Shenzhen, 17–18 May 2021
9. Chen, J.: Big data and China’s economic transformation and upgrading. Econ. Forum 3(584),
26–32 (2019)
10. Zhang, L.: Seizing the opportunity of industrial technological change and accelerating the
digital transformation of manufacturing industry. China electronic news, 18 May 2021
11. Jiang, J.: Intelligent development mode and countermeasures of intelligent manufacturing
industry. Light Ind. Sci. Technol. 35(9), 149–150+152 (2019)
12. Li, L., Shi, A., Liu, J.: 40 years of China’s manufacturing industry: intelligent process and
prospect. China Soft Sci. (01), 1–9+30 (2019)
13. Pan, Y., Li, M., Li, X.: Analysis on association rules of influencing factors of shipbuilding
cost. J. Wuhan Univ. Technol. (Inf. Manage. Eng. Edn.) 39(03), 353–358 (2017)
14. Wei, W., Chen, Z., Yuan, J.: A product process adaptive design method based on manufacturing
big data. China Eng. Sci. 22(04), 42–49 (2020)
15. Guo, H., Zhang, R., Zhu, Y., et al.: Sustainable quality control mechanism of heavy truck
production process for Plant-wide production process. Int. J. Prod. Res. 58(24), 7548–7564
(2020)
16. Zhou, Y., Wu, K., Sun, Y., et al.: Fault diagnosis of refrigeration equipment based on data
mining and information fusion. Vib. Test Diagn. 41(02), 392–398+418 (2021)
17. Ma, M., Pan, S.: Overview of supplier portrait research based on data mining. Wirel. Commun.
Technol. 29(04), 55–60 (2020)
18. Xi, J.: Encyclopedia Dictionary of Scientific Outlook on Development. Shanghai Dictionary
Publishing House, Shanghai (2007)
Data Mining and Association Rules
A Robust Matting Method Combined
with Sparse-Coded Model

Guilin Yao1,2(B) , Huixin Yang1,2 , and Shirui Wu1,2


1 Harbin University of Commerce, Harbin 150028, China
glyao@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. Digital image matting is widely used in virtual reality, augmented real-
ity, and film and television production. It is mainly used to solve the problem of
separating foreground information from background information. Specifically, it
is to solve the value of the unknown pixels between foreground and background
information. Digital image matting methods mainly include sampling method and
affine method, this paper proposes a sampling method that combines Sparse-Coded
and Robust matting algorithm, named SRMatting. The robust algorithm uses a
local sparse single-point sampling method at the edge. Sparse-Coded sampling
has a lot less color redundancy than dense sampling. Still, local edge sampling
has the problem of insufficient depth and breadth color information collection.
Therefore, this article combines the Sparse-Coded algorithm (global depth sparse
method) with the sampling part of the Robust algorithm to enrich the diversity
of samples and adds the preprocessing steps to the new algorithm SRMatting to
improve the accuracy of the solution and reduce the amount of calculation.

Keywords: Digital image matting · Robust matting · Sparse-coded matting

1 Introduction
Digital image matting is different from image segmentation; it is a classic problem in
image editing and image processing [1]. Image matting adopts the concept of the opacity
α, and the unknown pixels are usually solved with the following formula:

Ii = αi Fi + (1 − αi )Bi , (1)

where α is the opacity of the pixel, with value range [0, 1]. If α = 1, we call the pixel
absolute foreground; if α = 0, we call it absolute background; the other points are
called mixed points. To solve this formula for α, seven unknowns need to be
estimated from three equations. The problem is ill-posed. Therefore, many algorithms
now solve this problem with user annotations in the form of a Trimap template, which is
divided into three areas: the absolute foreground F, the absolute background B, and the
unknown area U; the image matting problem has thus gradually evolved into solving the
Trimap problem.
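As a minimal, self-contained illustration of formula (1), the sketch below composites a synthetic foreground and background with a smooth matte; the colors and the matte are placeholders, not data from any benchmark.

```python
# A minimal sketch of the compositing formula (1): I = alpha*F + (1 - alpha)*B.
import numpy as np

h, w = 4, 4
F = np.full((h, w, 3), [1.0, 0.2, 0.2])                # a reddish foreground (placeholder)
B = np.full((h, w, 3), [0.1, 0.1, 0.9])                # a bluish background (placeholder)
alpha = np.linspace(0.0, 1.0, h * w).reshape(h, w, 1)  # a smooth synthetic matte

I = alpha * F + (1.0 - alpha) * B                      # formula (1), vectorized per pixel
print(I.shape, float(I.min()), float(I.max()))
```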

Common matting methods are mainly divided into three categories: 1. Sampling-
based methods, such as Bayesian [2], Shared algorithm [3], etc.; 2. Affine-based methods,
such as Closed-Form [4], KNN [5] algorithm, etc.; 3. The method of combining sampling
and affine, such as the Robust [6, 7] algorithm used in this article. The sampling-based
method, as the name implies, computes {Fi, Bi, αi} point by point. In this method,
the correlation between points is ignored and each point is treated in isolation, so
users can customize more optimized sampling according to their needs; some poor local
results can thus be ignored, which is conducive to future improvements. The
Affine-based method uses the correlation between points to recursively find the α value
in the unknown area. It can obtain a closed solution by modeling a mature quadratic
function, and it can avoid the problem of alpha discontinuity caused by the sampling
method. Still, the treatment effect is not good for some discontinuous objects (with
holes). Affine methods fall into two classes, local and non-local: propagation matting
algorithms based on local criteria, such as Closed-Form, propagate the related
constraints to the entire image by minimizing a quadratic cost function, while
non-local propagation matting algorithms, such as KNN, use the nearest K points of a
pixel to estimate its value.
Robust matting considers spatial proximity when collecting samples from the nearby
known foreground and background pixels and selects the three sample pairs with the
highest confidence among them to estimate the unknown pixels. Compared with pure
sampling methods, it incorporates the idea of the Closed-Form matting algorithm,
solving a linear system through the Laplacian matrix and the conjugate gradient method
to optimize α. Since the selection considers color distortion, it provides more robust
results than a rejection scheme. However, since sampling only considers the spatial
proximity to unknown pixels, the candidate set may lack true samples, which reduces
the quality of the matting.
The Sparse-Coded algorithm [8] is a matting algorithm based on global sampling.
It is different from the sampling method in the Robust algorithm. It builds dictionary
atoms by collecting the average color of superpixels at the boundary of the three-part
image and directly calculates α. The Sparse-Coded matting algorithm uses probability
segmentation as a clue to mark unknown pixels as high certainty and low certainty.
Samples that can be well separated from the foreground and background information
are marked as high certainty. Otherwise, they are marked as low certainty. The higher the
certainty, the smaller the dictionary. The sparse coding estimate at each pixel is combined
with its feature-space neighbors and spatial neighbors in a weighted graph model and
solved in closed form to obtain the final α. This design improves performance and
greatly outperforms previous work on the benchmark data set.

2 Previous Work
2.1 Classification of Sampling Methods
Sampling methods can be divided into four categories: 1. Local and global sampling
methods: local sampling is only allowed to sample near unknown points, such as
Bayesian, BP, Robust algorithms, etc., global sampling methods can be performed in
places far away from unknown points sampling, such as Global, SRLO, WCT algorithm,
etc. The local type is better than the global type for processing areas with heavy color
overlap, but it is weaker on images with little variation. 2. Sparse and dense sampling
methods: sparse sampling includes the Robust, Sparse-Coded, and SVR algorithms,
and dense sampling includes the BP, Shared, and Global algorithms. Sparse sampling
reduces color redundancy compared with dense sampling, but some key samples may be missed
if the collected samples are too scattered. 3. Edge and depth type sampling methods: edge
type has Robust, Shared, ISC algorithm, etc., depth type has BP, Sparse-Coded algo-
rithm, etc. The edge type can only sample at the edge of the unknown area, while the
depth type can sample inside the known area. The depth type can use the vertical “depth”
information of the edge of the known area and has more information than the edge type.
4. Parameters and single-point sampling methods: parameter methods include Bayesian,
Comprehensive, and CWCT algorithms and single-point methods include SVR, Robust,
and Sparse-Coded algorithms. The single-point type is better than the parametric type
to handle areas where the local color texture changes greatly, so the parameterization
method is not common.
Considering these factors comprehensively, this paper adopts a sparse single-point
sampling method mixing local and global sampling. Although Robust is an edge type and
Sparse-Coded is a depth type, the two do not conflict; combining them allows α to be
solved better.

2.2 Other Problems with Sampling


Regarding the setting of sample weights, the new algorithm uses color differences as
weights, emphasizing the color similarity between unknown and known points: the higher
the similarity, the better the sample. Secondly, regarding the color space, the new
algorithm inherits the six-dimensional color space of Sparse-Coded, which combines the
Lab and RGB color spaces for feature selection. The Lab mode makes up for the
deficiencies of the RGB and CMYK color modes; it is device-independent and fast to
process. It consists of three channels: the first channel is lightness, "L"; the "a"
channel runs from red to dark green, and the "b" channel runs from blue to yellow.
Finally, unlike many sampling methods,
Sparse-Coded does not use the matting formula. It collects samples globally and uses the
obtained samples as a dictionary. After that, it solves the Lasso regression problem and
obtains the sparse coding of the dictionary for each unknown point. Add all the codes
corresponding to the foreground to get the α value. Since non-zero encoding is aimed at
a few dictionaries, it is not good to handle areas with high color overlap, but Robust’s
local sampling method can neutralize this shortcoming. It is worth mentioning that the
concept of superpixels is used in Sparse-Coded. The superpixel algorithm combines
pixels into perceptually meaningful atomic regions. They capture image redundancy
and provide convenient primitives for calculating image features, and greatly reduces
the complexity of subsequent image processing tasks.

3 Algorithm Implementation
3.1 Preprocessing
Preprocessing is an important step before calculation. Its purpose is to divide some
points in the unknown area into the absolute foreground and background in advance
by analyzing the unknown area of the Trimap and setting spatial and color thresholds.
This narrows the range of the unknown area and makes the resulting α more accurate.
The preprocessing step of this algorithm uses the static threshold method. It determines
whether an unknown pixel i belongs to the foreground or the background by comparing it
with a known pixel j via the following formula [8]:

(D(i, j) < Ethr) ∧ (‖Ii − Ij‖ ≤ (Cthr − D(i, j)))    (2)

where D(i, j) is the Euclidean distance between pixels i and j, Ii is the color at
pixel i, and the thresholds Ethr and Cthr are set to 15 and 4 in this algorithm,
respectively. If pixel i satisfies the formula with a known foreground pixel, it can be
judged as absolute foreground (and analogously for the background), which greatly
improves the solution's efficiency and reduces the workload.
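A minimal sketch of this pre-processing rule is given below, assuming an unknown pixel is tested against a small set of known foreground pixels; the coordinates and colors are synthetic, and only the two thresholds (15 and 4) come from the text.

```python
# A minimal sketch of the static-threshold test in formula (2): relabel an unknown
# pixel i as absolute foreground if some known pixel j is close both spatially and
# in color. All pixel data below is synthetic.
import numpy as np

E_THR, C_THR = 15.0, 4.0  # spatial and color thresholds from the text

def expandable(pos_i, color_i, known_positions, known_colors):
    for pos_j, color_j in zip(known_positions, known_colors):
        d = np.linalg.norm(np.asarray(pos_i, float) - np.asarray(pos_j, float))
        if d < E_THR and np.linalg.norm(color_i - color_j) <= C_THR - d:
            return True
    return False

color_i = np.array([200.0, 120.0, 90.0])
fg_positions = [(10, 10), (12, 11)]
fg_colors = [np.array([201.0, 121.0, 90.0]), np.array([180.0, 90.0, 60.0])]
print(expandable((11, 10), color_i, fg_positions, fg_colors))  # True here
```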

3.2 α Solving

Robust Algorithm Sampling. The Robust algorithm [6] observed that dense sampling
tends to miss information along the edge width while bringing color redundancy in
depth, so it samples sparsely and locally along the foreground edge and the background
edge. It finds the points Fz and Bz closest in Euclidean distance to the current
unknown pixel z on the two boundaries, then walks along the foreground and background
edges from these two points, collecting a sample point every few steps; in this way, a
total of 20 foreground points and 20 background points are collected to form 400
sample pairs.
After sampling, high-quality sample points need to be selected. The color of the
unknown pixel should be a linear combination of a good foreground-background sample
pair. Specifically, for a candidate pair, the estimated αr should be:

αr = ((C − Bj) · (Fi − Bj)) / ‖Fi − Bj‖²    (3)

where C represents the pixel color of the unknown point, and (Fi, Bj) is a
foreground-background sample pair for the unknown pixel.
Then each sample pair is evaluated by defining a distance ratio:

Rd(Fi, Bj) = ‖C − (αr Fi + (1 − αr) Bj)‖ / ‖Fi − Bj‖    (4)

where the denominator is the color Euclidean distance between the foreground and
background samples, and the numerator is the error between the color of the unknown
point C and the estimated color of the sample.
In general images, points belonging to the absolute foreground and the absolute
background form the majority, so the following two weights are defined:

ω(Bj) = exp(−‖Bj − C‖² / DB²)    (5)
ω(Fi) = exp(−‖Fi − C‖² / DF²)    (6)

where DF and DB are the maximum color distances between the unknown pixel and the
foreground and background samples, respectively:

DF = max_i ‖Fi − C‖    (7)

DB = max_j ‖Bj − C‖    (8)

Taking these factors into account, the confidence of a sample pair is expressed as:

f(Fi, Bj) = exp(−Rd(Fi, Bj)² · ω(Fi) · ω(Bj) / σ²)    (9)

where σ is a global constant, usually set to 0.1.


Finally, three pairs of samples with the highest confidence are selected, and the
average value obtained is used as the initial value.
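The pair-selection step can be summarized in the short sketch below, which scores synthetic candidate pairs with formulas (3)-(9) as reconstructed above and averages the three most confident estimates; all colors are random placeholders.

```python
# A minimal sketch of sample-pair evaluation: estimate alpha per pair, score it by
# the distance ratio and the color-proximity weights, keep the top-3 pairs.
import numpy as np

rng = np.random.default_rng(1)
C = np.array([0.6, 0.4, 0.3])                  # color of the unknown pixel (synthetic)
Fs = rng.random((20, 3))                       # 20 candidate foreground colors
Bs = rng.random((20, 3))                       # 20 candidate background colors
sigma = 0.1                                    # global constant from the text

D_F = np.max(np.linalg.norm(Fs - C, axis=1))   # formula (7)
D_B = np.max(np.linalg.norm(Bs - C, axis=1))   # formula (8)

scored = []
for F in Fs:
    for B in Bs:
        fb = F - B
        alpha = np.clip(np.dot(C - B, fb) / (np.dot(fb, fb) + 1e-12), 0, 1)  # (3)
        Rd = np.linalg.norm(C - (alpha * F + (1 - alpha) * B)) / (np.linalg.norm(fb) + 1e-12)  # (4)
        wF = np.exp(-np.linalg.norm(F - C) ** 2 / D_F ** 2)                  # (6)
        wB = np.exp(-np.linalg.norm(B - C) ** 2 / D_B ** 2)                  # (5)
        conf = np.exp(-(Rd ** 2) * wF * wB / sigma ** 2)                     # (9)
        scored.append((conf, alpha))

scored.sort(key=lambda t: -t[0])
alpha_init = np.mean([a for _, a in scored[:3]])  # average of the top-3 pairs
print(alpha_init)
```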

Sparse-Coded Algorithm Sampling. The pairwise method alone cannot produce an
accurate matte. The Sparse-Coded algorithm instead uses multiple unpaired foreground
and background samples to process textured and smooth parts and obtain more accurate
results. It stores the sample information for each unknown point in a dictionary. If
the pixel of the unknown point can be well separated from the known foreground and
background, it is marked as high certainty, and the higher the certainty, the smaller
the dictionary; otherwise, complex and overlapping color distributions require a larger
dictionary to capture the variability of colors. For a given unknown pixel, multiple
samples that best reconstruct the pixel are selected from a subset of labeled samples,
and the sum of the foreground reconstruction coefficients directly gives α.
The feature vector used for encoding is the 6-dimensional vector [R, G, B, L, a, b]T,
formed by concatenating the RGB and CIELAB color spaces. In order to reduce the sample
space, Sparse-Coded adopts the concept of superpixels [9] and uses the Slic [10]
algorithm to cluster a band 40 pixels wide around the boundary of the unknown area,
dividing the pixels marked as foreground and background into superpixels. It is worth
mentioning that among the many segmentation algorithms, Slic has obvious advantages:
it significantly reduces the number of distance calculations in optimization by
limiting the search space to an area proportional to the superpixel size [11], so its
complexity is linear in the number of pixels and independent of the number of
superpixels. The average vector of each superpixel represents a foreground or
background sample, and these samples make up the full dictionary set. A set of
superpixel examples is shown in Fig. 1, and a brief extraction sketch follows the
figure:
Fig. 1. 1-a is the original image, and 1-b shows the superpixels we have marked. It can
be seen that the pixels within each superpixel block are very similar.
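For reference, the following sketch extracts superpixels with scikit-image's SLIC implementation and takes each superpixel's mean color as a dictionary atom; the file name and the parameter values are assumptions, not those of the original Sparse-Coded code.

```python
# A minimal sketch of superpixel extraction with the SLIC algorithm [10] via
# scikit-image, taking each superpixel's mean color as a dictionary atom.
import numpy as np
from skimage import io
from skimage.segmentation import slic

image = io.imread("input.png")[:, :, :3] / 255.0        # placeholder file name
labels = slic(image, n_segments=200, compactness=10, start_label=0)

# Mean RGB vector per superpixel (the Lab half of the 6-D feature is analogous).
atoms = np.stack([image[labels == k].mean(axis=0) for k in range(labels.max() + 1)])
print(atoms.shape)  # (num_superpixels, 3)
```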

Sparse-Coded [8] uses probabilistic segmentation to mark unknown pixels as high/low
certainty, and the probability of a given pixel i belonging to the foreground is:

p(i) = pf(i) / (pf(i) + pb(i))    (10)

where the foreground color similarity is calculated by the following formula:

pf(i) = exp(−Σ_{k=1}^{m} ‖c(i) − c(fk)‖² / (m · δ))    (11)

Here c(·) is the RGB color value, m is the number of foreground samples that are
spatially close to the pixel, and m and δ are two constants, usually set to m = 10 and
δ = 1. pb(i) is calculated in the same way. The samples here are the average vectors of
the spatially nearest superpixels.
In order to keep track of which atoms are foreground and which are background during
sparse coding, the F and B samples are horizontally spliced into the dictionary
[F1, F2, . . ., Fn, B1, B2, . . ., Bn]. Given the dictionary D, the sparse code of an
unknown pixel i is determined as

βi = argmin_β ‖vi − Dβ‖₂²  s.t. ‖β‖₁ ≤ 1, β ≥ 0    (12)

where vi is the feature vector at pixel i, and the sparse code βi is generated by the
Lasso algorithm. The tangent point between the contour lines of the objective and the
constraint domain is the optimal solution [12]. For the Lasso method, the constraint
domain is a square, so the tangent point often lies on a coordinate axis, which drives
the weights of some feature dimensions to 0 and easily produces sparse results [13].
Therefore, the Lasso method achieves the effect of variable selection, compressing
insignificant variable coefficients to 0.
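The sketch below approximates formula (12) with scikit-learn's Lasso solver, which handles the penalized rather than the constrained form of the problem; the dictionary, the pixel feature, and the regularization strength are synthetic assumptions.

```python
# A minimal sketch of formula (12): code an unknown pixel's feature vector over a
# dictionary of foreground/background atoms and read alpha off the coefficients.
# Note: sklearn's Lasso solves the penalized form min ||v - Db||^2 + lam*||b||_1
# with b >= 0, an approximation of the constrained form in (12).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n_f, n_b, dim = 15, 15, 6                     # atoms and 6-D [R,G,B,L,a,b] features
D = rng.random((dim, n_f + n_b))              # columns: F atoms then B atoms
v = 0.7 * D[:, 0] + 0.3 * D[:, n_f]           # a synthetic mixed pixel

lasso = Lasso(alpha=1e-3, positive=True, fit_intercept=False, max_iter=10000)
beta = lasso.fit(D, v).coef_

s = beta.sum()
alpha_s = beta[:n_f].sum() / s if s > 0 else 0.0  # foreground share of the code
print(round(alpha_s, 3))                           # roughly 0.7 for this pixel
```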

SRMatting. Finally, the α values estimated by Robust (αr) and Sparse-Coded (αs) are
combined with equal (1:1) weights:

α = 0.5 × αr + 0.5 × αs    (13)



3.3 Post-processing

Smoothing the results with an affine method is a step many algorithms use. The common
choices are the Matting Laplacian class and the Nonlocal class: the former builds its
matrix from spatial neighborhoods, the latter from color neighborhoods, and this
article uses the former to smooth the result.
The goal of post-processing is that absolute pixels should outnumber mixed pixels, so
α should be locally smoothed while respecting the label selected for each individual
pixel, that is, it should satisfy both the data and neighborhood constraints.
In a graphical model [7], the initial α and a smoothing term are combined, and a
sparse linear system is then solved in closed form. In the graph model shown below,
the virtual nodes F and B represent pure foreground and pure background, the white
nodes represent unknown pixels on the image grid, and the red and blue nodes are known
foreground and background pixels marked by the user (Fig. 2).

Fig. 2. Graph model

The data weights are the probabilities that a pixel belongs to the foreground and to
the background, defined as:

W(i, F) = γ · [f̂i · α̂i + (1 − f̂i) · δ(α̂i > 0.5)]    (14)

W(i, B) = γ · [f̂i · (1 − α̂i) + (1 − f̂i) · δ(α̂i < 0.5)]    (15)

where α̂i and f̂i are the estimated alpha and confidence values, δ(·) is a Boolean
function that returns 0 or 1, and γ is the parameter that balances the data weights,
i.e., the confidence value of the pixel. It can be defined as:

γ(i) = γsprec(i) · γcolrec(i)    (16)

where γsprec(i) = exp(−‖vi − ṽi‖²) measures the credibility of reconstructing the input
feature vector from the sparse coefficients (ṽi = Dβi), and
γcolrec(i) = exp(−‖Ii^rgb − [Fi Bi]^rgb αi‖²) measures the color distortion.
126 G. Yao et al.

In order to make the result smoother, we need to define the weights between nodes. The
Closed-Form algorithm solves α by establishing a local linear color model combined
with the matting formula, and we apply the same idea here to obtain a more optimized
solution [14]. In order to strengthen the smoothing constraint of the image, the
neighborhood term Wij is defined by summing over all 3 × 3 windows containing pixels
i and j:

W(i, j) = Σ_{k | (i,j) ∈ wk} (1/9) · (1 + (Ci − μk)ᵀ (Σk + (ε/9)·I)⁻¹ (Cj − μk)),    (17)

where wk represents the collection of 3 × 3 windows containing pixels i and j, and k
iterates over these windows. μk and Σk are the color mean and covariance of each
window, and ε is a regularization coefficient, set to 10⁻⁵ in this system.
In order to minimize the total image energy over real-valued α, we use a random walk
method to solve the graph labeling problem. The energy function for solving α is:

E = λ Σ_{i∈v} (αi − hi)² + Σ_{i=1}^{N} Σ_{j∈Ni} Wij (αi − αj)²,    (18)

where λ is fixed to 100 and N is the total number of nodes in the graph model. The
first term ensures that the final α is consistent with the constraints specified by
the user; v is the set of well-determined foreground and background pixels. The second
term ensures that adjacent pixels share similar values. hi is a user-defined
constraint, with 0 for a clear background and 1 for a clear foreground. Ni represents
the spatial and feature-space neighbors of pixel i together with the two virtual nodes.
The energy function is written in matrix form:

E = λ(α − H)ᵀ(α − H) + αᵀLᵀLα,    (19)

where H is an N × 1 vector of user constraint values, and the Laplacian matrix L is
defined as [15]:

Lij = { Wii, if i = j;  −Wij, if i and j are neighbors;  0, otherwise }    (20)

where Wii = Σj Wij. L is an N × N sparse symmetric positive definite matrix, and N is
the number of all nodes in the graph, i.e., all pixels in the image plus the two
virtual nodes B and F. Finally, the conjugate gradient method is used to solve the
linear system.
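A compact sketch of this final solve is given below, building the Laplacian of formula (20) from a synthetic weight matrix and minimizing the energy of formula (19) with conjugate gradients; the weights, constraints, and the small diagonal regularizer are assumptions for demonstration.

```python
# A minimal sketch: build L per formula (20) from pairwise weights W, then minimize
# the quadratic energy of formula (19) by solving (lam*C + L^T L) alpha = lam*C h
# with the conjugate gradient method. W, h, and the constraint mask are synthetic.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

rng = np.random.default_rng(3)
n = 100
W = sp.random(n, n, density=0.05, random_state=3)
W = W + W.T                                            # symmetric pairwise weights
L = sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W    # formula (20)

lam = 100.0
constrained = rng.random(n) < 0.2                      # pixels with user labels
C = sp.diags(constrained.astype(float))
h = (rng.random(n) > 0.5).astype(float)                # 0 = background, 1 = foreground

# Tiny diagonal term keeps the system well-conditioned in this toy setting.
A = lam * C + L.T @ L + 1e-8 * sp.identity(n)
alpha, info = cg(A, lam * (C @ h))
print(info, alpha.min(), alpha.max())                  # info == 0 means converged
```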

4 Experimental Results and Analysis


4.1 Experimental Environment
Experimental environment: the algorithm is written and tested on a PC with a 2.30 GHz
Intel Core i5 CPU and 8 GB of memory. The development environment is mixed programming
of VS2019 and Matlab2020a. Regarding the test set, 27 training images and 8 private
images from the online evaluation system [16] are used. Each training image provides
three Trimap templates, Huge/Large/Small, distinguished by the size of the unknown
area, and the true α value of each image is also given, which is convenient for
evaluating the effect of the algorithm. Since test images with true values are very
scarce, broader quantitative comparison is impossible; therefore, we did not introduce
more test images, but these 35 images contain most of the features found in natural
images and cover pixel transparencies ranging from opaque to completely transparent.
We believe that the test result of the new algorithm on this data set is convincing.

4.2 Comparison of Experimental Results

For simple images, the results of the algorithms differ little, so we choose typical
parts of some complex images for the final comparison. This does not mean that our
algorithm lacks universal applicability for image matting; the typical parts selected
here better illustrate that our algorithm can better separate the foreground from the
background.
The following is the comparison of Robust, Sparse-Coded, Closed-Form and SRMat-
ting algorithms for α solving. For the convenience of comparison, we have selected four
typical parts from the data set to illustrate the advantages of the new generation algorithm
(Fig. 3).

Fig. 3. Comparison of processing results of different algorithms on different parts
(columns: Original Picture, Ground Truth, Robust, Sparse-Coded, Closed-Form, SRMatting)



First, for image GT05 (the first), the first three algorithms handle this part rather
fuzzily and do not generate a clear boundary, while the new algorithm produces a better
result. Second, for image DT02 (the second), we select its hair area; the results of
the first three algorithms have certain defects, such as blurring, whereas the result
of the new algorithm is closer to the ground truth. Then, for image GT18 (the third),
its special feature is that its foreground is very close to the background pixels,
which easily makes some matting algorithms fail; still, our new algorithm SRMatting
can effectively identify the foreground information. Finally, image GT10 (the fourth)
was selected: part of the background is embedded in the foreground region and far from
the main background, and the new algorithm deals with this hole area better than the
first three algorithms.

4.3 Performance Evaluation

The above is a visual comparison of the processing results of different algorithms. In
order to evaluate the SRMatting algorithm more objectively, the four algorithms were
evaluated with the SAD and MSE criteria on the three types of Trimap over the 35
images, as shown in Tables 1 and 2:

Table 1. Four algorithms ranked according to MSE (Huge/Large/Small are Trimap types)

Rank  Algorithm     Huge  Large  Small  Average
1     SRMatting     6.50  4.27   2.10   4.29
2     Robust        6.85  4.16   2.53   4.51
3     Closed-Form   9.39  3.93   2.24   5.18
4     Sparse-Coded  9.26  6.02   3.43   6.23

Table 2. Four algorithms ranked according to SAD (Huge/Large/Small are Trimap types)

Rank  Algorithm     Huge   Large  Small  Average
1     SRMatting     5.83   3.96   2.70   4.16
2     Robust        6.86   4.40   2.96   4.74
3     Sparse-Coded  7.34   4.97   3.41   5.24
4     Closed-Form   10.36  4.60   2.90   5.95

It can be seen that the proposed algorithm achieves excellent results in both SAD and
MSE. Among the four algorithms, it has the smallest average value for both measures,
which shows that its result is closest to the true value. Closed-Form, as a pure affine
method, is prone to failure in complex areas. Although Robust uses Closed-Form to
optimize α, its local sampling easily loses some pixel information, resulting in
insufficient accuracy. Sparse-Coded performs global sampling, but it easily fails when
the foreground-background overlap is high. SRMatting relies on the innovation of
combining the Sparse-Coded and Robust sampling strategies; this combination makes the
obtained sample set more comprehensive, so the result is more accurate.

5 Conclusion

We propose a Robust matting algorithm combined with Sparse-Coded. This combination of
global and local sampling allows us to collect a more accurate sample set, and the
combination of depth and edge sampling also improves the accuracy of the results. To
increase the running speed, we group pixels into superpixel blocks and add
preprocessing and post-processing steps, so that our algorithm can generate accurate
mattes for complex images in a robust manner. Like most matting algorithms, the new
algorithm SRMatting can separate the foreground of a simple image from the background,
but it also obtains better results in specific scenes, such as background embedded in
the foreground area, transparent objects, hair areas, etc. Finally, we also tested the
running time: the average running speed of the new algorithm is faster than that of
the other three algorithms in this article. The quantitative and qualitative
evaluation on the benchmark data set likewise shows that SRMatting has great
capability in handling the matting problem. In future research, more data sets will be
sought to test it, and an extension to the video field will also be considered.

Acknowledgment. This work is supported by the Youth Innovation Talent Support Program of
Harbin University of Commerce (No. 2020CX39).

References
1. Yao, G., Zhao, Z., Liu, H.: A comprehensive survey on sampling-based image matting.
Comput. Graphics Forum 36(8), 1–17 (2017)
2. Chuang, Y.-Y., Curless, B., Szeliski, R.: A bayesian approach to digital matting. In:
Proceedings of IEEECVPR, pp. 264–271 (2001)
3. Gastal, E.S., Oliveira, M.: Shared sampling for real-time alpha matting. Comput. Graphics
Forum 29(2), 575–584 (2010)
4. Levin, A., Lischinski, D., Weiss, Y.: A Closed-form solution to natural image matting. IEEE
Trans. Pattern Anal. Mach. Intell. 30(2), 228–242 (2008)
5. Chen, Q., Li, D., Tang, C.-K.: KNN matting. In: Proceeding of IEEE CVPR, pp. 869–876
(2012)
6. Wang, J., Cohen, M.: Optimized color sampling for robust matting. In: Proceedings of IEEE
Conference on Computer Vision and Pattern Recognition, Minneapolis, USA, pp. 1–8 (2007)
7. Comaniciu, D., Meer, P.: A robust approach toward feature space analysis. IEEE Trans. Pattern
Anal. Mach. Intell. 24(5), 603–619 (2002)
8. Johnson, J., et al.: Sparse coding for alpha matting. IEEE Trans. Image Process. 25(7), 3032–
3043 (2016)
9. Fulkerson, B., Vedaldi, A., Soatto, S.: Class segmentation and object localization with
superpixel neighborhoods. In: International Conference on Computer Vision (ICCV) (2009)
10. Achanta, R., et al.: Slic superpixels compared to state-of-the-art superpixel methods. IEEE
Trans. Pattern Anal. Mach. Intell. 34(11), 2274–2282 (2012)
11. Yang, J., Wright, J., Huang, T.S., Ma, Y.: Image super-resolution via sparse representation.
IEEE Trans. Image Process. 19(11), 2861–2873 (2010)
12. Johnson, J., Rajan, D., Cholakkal, H.: Sparse codes as alpha matte. In: Proceeding of BMVC
(2014)
13. Gould, S., Rodgers, J., Cohen, D., Koller, D.: Multi-class segmentation with relative location
prior. Int. J. Comput. Vision (IJCV) 80(3), 300–316 (2008)
14. Levin, A., Lischinski, D., Weiss, Y.: Colorization using optimization. ACM Transactions on
Graphics (2004)
15. Grady, L.: Random walks for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell.
28(11), 1768–1783 (2006)
16. Rhemann, C., et al.: Alpha Matting Evaluation Website (2009). http://www.alphamatting.
com. Accessed Jun 2016
Segmentation of Cervical Cell Cluster
by Multiscale Graph Cut Algorithm

Tao Wang(B)

Network and Education Technology Center,


Harbin University of Commerce, Harbin 150028, China
wt@hrbcu.edu.cn

Abstract. The segmentation and recognition of cervical cell clusters is a major


challenge for the automatic screening of cervical cancer cells. This paper presents
a model based on a multiscale graph cut algorithm for automatically segmenting
cervical cell clusters. Global seed nodes are obtained by coarsely segmenting the
cervical cell sample image with a multiscale graph cut algorithm combined with the
confidence region method. Then the sample image is segmented according to the global
seed nodes and the global graph cut algorithm. The
experimental data shows that the proposed algorithm is better than the currently
widely used threshold watershed algorithm in DSC, accuracy, and recall measures.

Keywords: Cervical cell cluster · Multiscale segmentation · Graph cut ·


Confidence region

1 Introduction
Cervical cancer is one of the most common malignant tumors of the female reproductive
system, and it has a huge impact on women’s physical health [1]. In the world, new
cases of cervical cancer reach about 500,000 every year. The high incidence of cervical
cancer is mostly concentrated in the 40–60 age groups. However, in recent years, the
incidence of patients has been getting younger, and the average age has a downward
trend [2, 3]. Pap smear (PAPSMEAR) [4] is one of the earliest and most important
methods of cervical cancer screening to reduce the mortality of women with cervical
cancer. The H&E staining method provides a more stable and accurate image of the
cervical sample for the Pap smear. Currently, cervical screening relies too much on the
clinical experience of pathologists and requires a great deal of their time and energy.
Even pathologists with rich clinical experience are prone to misdiagnosis under the
influence of fatigue and subjective judgment [5].
In order to solve this problem, low-cost, high-efficiency computer-aided diagnosis
technology has begun to develop. Computer-aided diagnosis technology can reduce the
fatigue of manual inspection and reduce the rate of misdiagnosis and missed diagnosis.
Cervical cell detection is mainly divided into cervical sample cytoplasm detection and
nucleus detection. Existing methods have made some progress in the field of cervical
cell detection [9–14]. Algorithms that use cytoplasm as the segmentation target mainly
include morphological methods, k-means [11, 15, 16], edge detection [14], the threshold
method [18], active contours [16, 20], and adaptive watershed; these are usually
applied to single-cell segmentation. For images containing overlapping cells, methods
such as thresholding [12, 17–19] and level sets [20–22] are usually used. The level set
method is likely to converge to a local minimum and obtain a locally optimal solution,
which may be far from the global optimum. Zhang et al. [13] used the multi-threshold
Otsu algorithm to segment the cytoplasm, but it could not obtain better results due to
objective limitations such as uneven staining, lighting differences, and cell overlap.
The active contour model [16, 23], the morphological watershed method [12, 25, 26],
and the morphological erosion method can segment the cell nucleus. Zhang et al.
proposed an expectation-maximization algorithm combined with distance transformation,
an unsupervised algorithm that uses ellipse fitting for cell nucleus segmentation.
The graph cut method has attracted the interest of many experts and scholars in cell
segmentation. In binary cell nucleus image segmentation, the graph cut method obtains
more accurate results than global thresholding. However, the applicability to binary
images alone is very limited, so the early algorithm attracted little attention for
about a decade. In the late 1990s, new computer vision methods were proposed that
applied Min-cut/Max-flow algorithms to non-binary image problems. Applying prior
knowledge of cell nucleus shape, manual annotations (seed nodes), and local image
features within the graph segmentation framework yields more robust segmentation
results. Boykov et al. discussed the theoretical nature of graph construction in
computer vision and gave the conditions that an energy function minimized by the graph
cut algorithm should meet. Song et al. combined a multiscale convolutional neural
network with a graph cut algorithm and obtained relatively ideal results in overlapping
cell nucleus segmentation; at the same time, the concept of the superpixel was
introduced, which greatly reduced the computational complexity.
In most cases, since most abnormal cells are in clusters, pathologists need to pay
special attention to clumps. The cell clumps have the characteristics of complex composi-
tion, diverse shapes, and irregular boundaries. The characteristics bring great challenges
to computer-aided diagnosis methods. In the clinical samples, the complicated arrange-
ment of white blood cells, dust, impurities, uneven light, uneven staining make it more
difficult to segment the cell clusters. In the automatic segmentation and identification
of cervical cancer cells, the segmentation algorithm for cervical cell clusters is not yet
mature. Therefore, it is necessary to study the automatic and accurate segmentation
algorithm to segment cervical cell clusters.

2 Multi-scale Graph Cut Theory


2.1 Multiscale Theory

Multiscale technology expresses and processes images at different scales. The reason
is that features that are hard to observe or extract at one scale may be easy to find
or extract at another. The pyramid structure is a form of multiscale

expression of images. The pyramid obtained by the Gaussian convolution function is


called the Gaussian pyramid, and the convolution function is shown in Formula (1).

1 − (x−x0 )2 +2(y−y0 )2
G(x, y) = e 2σ (1)
2π σ 2
Gaussian Pyramid adds Gaussian filtering on the down-sampling of multi-resolution
pyramids. Gaussian blurring is performed with different parameters σ for each layer of
the pyramid so that each layer of the pyramid has multiple Gaussian blurred images, as
shown in Fig. 1.

Fig. 1. Gaussian pyramid image
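A minimal construction sketch is given below, using OpenCV's pyrDown, which fuses a fixed 5×5 Gaussian blur with 2x down-sampling rather than the tunable σ used later in this paper; the file name is a placeholder.

```python
# A minimal sketch of building a Gaussian pyramid: each level is Gaussian-blurred
# and then down-sampled by a factor of two.
import cv2

img = cv2.imread("cervical_sample.png", cv2.IMREAD_GRAYSCALE)  # placeholder file
pyramid = [img]
for _ in range(3):                       # three coarser scales
    pyramid.append(cv2.pyrDown(pyramid[-1]))

for level, im in enumerate(pyramid):
    print(level, im.shape)               # each level is roughly half the size
```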

2.2 Graph Cut Algorithm Theory


Optimal image segmentation amounts to minimizing an energy function, so the image
segmentation problem is a problem of energy function minimization. Greig et al. applied
the Min-cut/Max-flow algorithm to the computer vision field for the first time to find
the optimal solution of the energy function. The energy function solved by Greig et al.
is shown in Formula (2).

E(L) = Σ_{p∈P} Dp(Lp) + Σ_{{p,q}∈N} Vp,q(Lp, Lq)    (2)

L = {Lp | p ∈ P} represents the labeling of image P; Dp is the data term, representing
a data cost function; Vp,q(·) is the interaction term, representing a discontinuity
cost; N represents the collection of all pairwise neighborhoods. There are two common
energy functions: the Potts energy function and the interactive linear model function.
In general, the Potts energy model is used in image segmentation, target recognition,
and augmented reality, while the latter is generally used in the field of 3D
reconstruction. The Potts energy function used in this paper has the form of Formula (3):

E(I) = Σ_{p∈P} ‖Ip − Lp‖ + k Σ_{(p,q)∈N} K(p,q) · T(Ip ≠ Iq)    (3)

Ip represents the pixel value of pixel p in image P; the value k is the weight
coefficient between the data term and the interaction term; Lp represents the label
assigned to pixel p; K(p,q) represents the cost of discontinuity; T(Ip ≠ Iq) is the
indicator function. The specific forms of Lp, K(p,q), and T(Ip ≠ Iq) are shown in
Formulas (4), (5), and (6).

Lp = { 255, foreground;  0, background }    (4)

K(p,q) = exp(−(Ip − Iq)²)    (5)

T = { 1, if Ip ≠ Iq;  0, if Ip = Iq }    (6)
The value of Lp depends on the label (foreground or background) assigned to pixel p.
It can be seen that the data term of the energy function represents the cost of
assigning the corresponding label to a pixel, while the interaction term K(p,q) · T
represents the cost of assigning different labels to pixels p and q.
The graph cut algorithm can be applied to find the global minimum of the energy
function. First, define a directed weighted graph G = ⟨ν, ε⟩, composed of a set of
vertices ν connected by a set of directed edges ε. Normally, each vertex corresponds
to a pixel in the image. The directed graph also contains several special vertices
called terminal vertices, which correspond to the labels assigned to the image pixels.
In general, only a Source (S) terminal vertex and a Sink (T) terminal vertex are
considered; in other words, there are only two labels (foreground or background) for
each pixel.
There are two types of edges in G = ⟨ν, ε⟩: n-link edges and
t-link edges. An n-link connects neighboring (non-terminal) pixels
in the graph, so the weight of an n-link represents the cost of
discontinuity between pixels. A t-link connects a pixel to a terminal vertex,
so the weight of a t-link represents the cost of assigning the corresponding
label to that pixel. It follows that the t-link weights correspond
to the data term of the energy function, and the n-link weights correspond
to its interaction term.
Given the directed graph constructed above and its edge weights, the problem of
minimizing the energy function can be transformed into a minimum-cut problem
on the graph.
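As a toy illustration of this construction (not the paper's implementation), the sketch below builds a two-pixel graph with t-links and n-links and solves the minimum cut with networkx; all capacities are invented for demonstration, and a dedicated min-cut/max-flow library would be used in practice for speed.

```python
import networkx as nx

G = nx.DiGraph()
# t-links: each pixel is connected to Source S and Sink T; the capacities
# encode the data term (costs of the two labels; values invented here)
G.add_edge('S', 'p', capacity=0.9); G.add_edge('p', 'T', capacity=0.1)
G.add_edge('S', 'q', capacity=0.2); G.add_edge('q', 'T', capacity=0.8)
# n-link between the neighboring pixels p and q (discontinuity cost),
# added in both directions as in the paper
G.add_edge('p', 'q', capacity=0.3); G.add_edge('q', 'p', capacity=0.3)

cut_value, (s_side, t_side) = nx.minimum_cut(G, 'S', 'T')
print(cut_value)                       # total cost of the optimal cut
print('label of p:', 'fg' if 'p' in s_side else 'bg')
print('label of q:', 'fg' if 'q' in s_side else 'bg')
```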

3 Cervical Cell Cluster Segmentation Based on Multi-scale Graph Cut Algorithm
First, the full-size cervical sample image is transformed into images of different scales
through multiscale transformation, and the multiscale graph cut algorithm is used to
roughly segment the images at each scale. Then, according to the definition of the confidence
region proposed in this article, a confidence region is generated from the up-sampled
cervical sample image, and a probability distribution map of seed nodes is generated
from the confidence region. Finally, the seed nodes and the probability distribution map
are applied to the global cell cluster segmentation graph cut algorithm.

3.1 Pre-processing

The given cervical sample image is processed in the RGB color space, and a channel is
selected according to the strength of its contrast. The images of the individual channels of the
original image are shown in Fig. 2, in the order R channel, G channel,
and B channel. Based on the contrast in Fig. 2, this paper selects the G-channel image
for processing. Because the cervical sample image is affected by illumination, staining, and other
noise, the original image is pre-processed with histogram equalization to enhance the
contrast.

Fig. 2. Each channel image
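A minimal sketch of this pre-processing step with OpenCV follows; the file name is a placeholder, and note that OpenCV stores images in BGR order, so index 1 is the G channel.

```python
import cv2

img = cv2.imread('cervical_sample.png')  # placeholder path; OpenCV loads BGR
g = img[:, :, 1]                         # index 1 = G channel in BGR order
g_eq = cv2.equalizeHist(g)               # histogram equalization for contrast
```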

3.2 Seed Node Generation

The pre-processed G image is down-sampled based on the Gaussian pyramid. Based on
experience, the value of σ is set to 1.6 in this paper. The G-channel image I_G is down-sampled
by the Gaussian pyramid to obtain the multiscale pyramid images I_s, ∀s ∈ {1, ..., N},
where I_1 and I_G have the same size. The graph cut algorithm is used to roughly segment
each I_s, ∀s ∈ {1, ..., N}, to provide a basis for the generation of confidence regions.
The confidence region Ω = {M_1, M_2, ···, M_m, ···, M_L} is defined over the N images of the
same scale: a pixel p(i, j) belongs to the confidence region of label L_i if it is assigned the
same label L_i ∈ {L_1, L_2, ···, L_m, ···, L_L} in all N images, where L = {L_p | p ∈ P} represents
the labeling result of the image P. Normally, the confidence region is
divided into a foreground confidence region and a background confidence region.
After the multiscale images are segmented with the graph cut algorithm, the binary
mask image at each scale is up-sampled based on the Gaussian pyramid, so that every I_s, ∀s ∈
{1, ..., N}, has the same scale as I_1.
According to the definition of the confidence region, the foreground confidence
region and the background confidence region are marked on the original
image. The probability P(I_p | 'obj') or P(I_p | 'bkg') of a pixel value under the assigned
label (foreground or background) is obtained from the intensity histograms of the
pixels in the foreground and background confidence regions. Because
specifying seed nodes improves the graph cut result, this paper chooses the confidence regions
as the seed nodes; in this way, a better segmentation result can be obtained in the global
cell cluster segmentation.
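The following sketch illustrates, under simplifying assumptions, how the confidence regions and the likelihoods P(I_p | 'obj') and P(I_p | 'bkg') could be derived from the up-sampled binary masks; the inputs are assumed to be 0/255 masks of equal size and an 8-bit grayscale image.

```python
import numpy as np

def confidence_regions(masks):
    """masks: N up-sampled binary (0/255) masks of identical size."""
    stack = np.stack(masks)                  # shape (N, H, W)
    fg = np.all(stack == 255, axis=0)        # agreed foreground pixels
    bg = np.all(stack == 0, axis=0)          # agreed background pixels
    return fg, bg

def intensity_likelihood(image, region_mask):
    """Normalized intensity histogram over a confidence region, used as
    P(Ip | 'obj') or P(Ip | 'bkg') for an 8-bit image."""
    hist = np.bincount(image[region_mask], minlength=256)
    return hist / max(hist.sum(), 1)
```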

3.3 Global Cell Cluster Segmentation

The global cell cluster segmentation energy function is shown in Formula (7):

E(I) = \sum_{p \in P} R_p(A_p) + \lambda \sum_{\{p,q\} \in N} B_{\{p,q\}} \cdot \delta(A_p, A_q)   (7)

R_p(A_p) is the data term of the energy function; B_{\{p,q\}} is the interaction term;
λ represents the weight coefficient between the data term and the interaction
term; δ(A_p, A_q) is the indicator function; A_p is the label of pixel p. Based on the seed
node image obtained above, the pixel label probability values are converted
into the data term R_p(·) of the energy function, as shown in Formulas (8) and (9):

R_p('obj') = -\ln P(I_p \mid 'obj')   (8)

R_p('bkg') = -\ln P(I_p \mid 'bkg')   (9)

The interaction cost term is composed of discontinuous cost B{p,q} and indicator
function δ(Ap , Aq ), as shown in Formulas (10) and (11).

B_{\{p,q\}} = e^{-\frac{(I_p - I_q)^2}{2\sigma^2}} \cdot \frac{1}{dist(p, q)}   (10)

\delta(A_p, A_q) = \begin{cases} 1 & A_p \neq A_q \\ 0 & \text{otherwise} \end{cases}   (11)

For the discontinuity cost B_{\{p,q\}}, adjacent pixels with similar intensity
values (|I_p − I_q| < σ) incur a large discontinuity cost when assigned different labels.
Conversely, if the intensity difference between two adjacent pixels is large
(|I_p − I_q| > σ), the cost of assigning them different labels is small. In
this paper, the weights of the directed edges (p, q) and (q, p) are the same.
This paper uses the α-expansion algorithm to minimize the energy function;
the α-expansion algorithm obtains a globally approximate optimal solution through
multiple iterations of the Min-cut/Max-flow algorithm.
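For concreteness, a small sketch of the n-link weight of Formula (10) for horizontal neighbors (where dist(p, q) = 1 on a 4-connected grid) is given below; σ is a tuning parameter here, not a value fixed by the paper.

```python
import numpy as np

def horizontal_nlink_weights(img, sigma=10.0):
    """B{p,q} of Formula (10) for each pixel and its right-hand neighbor
    (dist(p, q) = 1 on a 4-connected grid)."""
    Ip = img[:, :-1].astype(np.float64)
    Iq = img[:, 1:].astype(np.float64)
    return np.exp(-((Ip - Iq) ** 2) / (2 * sigma ** 2))
```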

4 Experimental Results and Discussion

4.1 Data Collection

This paper collects 20 cervical sample images of size 2048 × 2048 and divides each
into four images of size 1024 × 1024, yielding 80 images of size 1024 × 1024 in total.
These cervical specimen images are used in the experiments of this article. All slides were
prepared by manual liquid-based cytology and stained with the H&E method. Among
the 80 sample images, 50 are used for algorithm parameter tuning, and the
remaining 30 serve as the test set.

4.2 Evaluation Method


This paper uses DSC and pixel-based evaluation criteria to evaluate the segmentation
results. The DSC criterion is defined in Formula (12):

DSC = \frac{2|R_{GT} \cap R_{seg}|}{|R_{GT}| + |R_{seg}|}   (12)

R_GT represents the pixel area of the clumped cells in the annotation data, and R_seg
represents the pixel area of the segmented cell clumps. For the pixel-based evaluation
criteria, precision (prec) and recall (rec) are used as indicators. They represent,
respectively, the proportion of correctly segmented clump pixels among all pixels
segmented as clumps and among all pixels labeled as clumps, as shown in Formulas (13) and (14):

prec = \frac{TP}{TP + FP}   (13)

rec = \frac{TP}{TP + FN}   (14)

TP represents the number of pixels correctly segmented as clump cells; FP represents the
number of pixels that are background but incorrectly segmented as clumps; FN represents
the number of pixels labeled as clumps but not segmented as clumps; TP + FP is
the total number of pixels segmented as clump cells; and TP + FN is the total number of
pixels labeled as clump cells.
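A short sketch of these three metrics on boolean masks (True marks a clump pixel) might look as follows; the inputs are assumed to be ground-truth and segmentation masks of equal shape.

```python
import numpy as np

def dsc(gt, seg):
    """Formula (12): Dice similarity coefficient of two boolean masks."""
    inter = np.logical_and(gt, seg).sum()
    return 2.0 * inter / (gt.sum() + seg.sum())

def precision_recall(gt, seg):
    """Formulas (13)-(14) counted over pixels."""
    tp = np.logical_and(gt, seg).sum()
    fp = np.logical_and(~gt, seg).sum()
    fn = np.logical_and(gt, ~seg).sum()
    return tp / (tp + fp), tp / (tp + fn)
```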

4.3 Parameter Setting


The experiments in this article are implemented in Python on a PC running the 64-bit
Ubuntu 16.04 operating system, with a 3.2 GHz CPU and 4 GB of memory. The
50 training images are mainly used to tune the λ parameter of the global
graph cut algorithm in the global clump cell segmentation stage. The DSC values obtained by
setting λ to 100, 200, 300, and 400 are 0.72, 0.74, 0.73, and 0.73,
respectively. The segmentation results on a cervical cell sample image are shown in
Fig. 3. Considering computational complexity, λ is set to 200 in this paper.
For the image scale, this article takes s = 3, which means that the original
image is processed at three scales: 1024 × 1024, 512 × 512, and 256 × 256.

Fig. 3. The effect of segmentation under different λ parameters: (a) λ = 100, (b) λ = 200, (c) λ = 300, (d) λ = 400



4.4 Comparison Results


This paper uses the threshold watershed algorithm described in the literature [36] to
evaluate our test set and compares it with our algorithm, mainly on three
indicators: DSC, precision, and recall. The
comparative results of the two algorithms are shown in Table 1.

Table 1. Comparison results

Algorithm        DSC    Prec   Rec
Algorithm [36]   0.67   0.76   0.62
Our algorithm    0.74   0.84   0.66

The algorithm in this paper outperforms the algorithm of literature [36] on all three
performance indicators: DSC, prec, and rec. Figure 4 shows an image from the test
set, together with the cluster cell segmentation results obtained by our method and
by the method of literature [36].

Fig. 4. Segmentation results: (a) threshold watershed, (b) our method

5 Conclusions
In this paper, the full-size cervical sample image is transformed into images of different
scales through multiscale transformation. After coarse segmentation, the confidence
region generates the seed nodes and a pixel-value probability distribution map, which are then
applied to the global cell cluster segmentation graph cut algorithm. Experiments
on clinical TCT cervical sample images show that the multiscale graph cut algorithm
proposed in this paper segments cervical cell clusters more accurately: all three
evaluation indicators, DSC, precision, and recall, are better than those of the threshold
watershed algorithm of literature [36].

Acknowledgment. This work was supported by the project of talented youth reserves funded by
the Harbin University of Commerce.

References
1. Song, J., Xiao, L., Lian, Z.: Contour-seed pairs learning-based framework for simultaneously
detecting and segmenting various overlapping cells/nuclei in microscopy images. IEEE Trans.
Image Process. 27(12), 5759–5774 (2018)
2. Arya, M., Mittal, N., Singh, G.: Texture-based feature extraction of smear images for the
detection of cervical cancer. IET Comput. Vision 12(8), 1049–1059 (2018)
3. Pradhan, D.: Clinical significance of atypical glandular cells in Pap tests: an analysis of more
than 3000 cases at a large academic women’s center. Cancer Cytopathol. 124(8), 589–595
(2016)
4. Yu, J., Tan, J., Wang, Y.: Ultrasound speckle reduction by a SUSAN-controlled anisotropic
diffusion method. Patt. Recogn. 43(9), 3083–3092 (2010)
5. Huang, J.J., Wang, T., Zheng, D.Q., He, Y.J.: Nucleus segmentation of cervical cytology
images based on multi-scale fuzzy clustering algorithm. Bioengineered 11(1), 484–501 (2020)
6. Ds, G., Jm, R.: Quantitative immunocytochemistry of hypothalamic and pituitary hormones:
validation of an automated, computerized image analysis system. J. Histochem. Cytochem.
Off. J. Histochem. Soc. 33(1), 11–20 (1985)
7. Bengtsson, E.: Recognizing signs of malignancy - the quest for computer assisted cancer
screening and diagnosis systems. In: Coimbatore: 2010 IEEE International Conference on
Computational Intelligence and Computing Research, pp. 1–6 (2010)
8. Zhang, L., Kong, H.: Segmentation of cytoplasm and nuclei of abnormal cells in cervical
cytology using global and local graph cuts. Comput. Med. Imaging Graph. 38(5), 369–380
(2014)
9. Marinakis, Y.: Pap smear diagnosis using a hybrid intelligent scheme focusing on genetic
algorithm based feature selection and nearest neighbor classification. Comput. Biol. Med.
39(1), 69–78 (2009)
10. Lezoray, O., Cardot, H.: Cooperation of color pixel classification schemes and color
watershed: a study for microscopic images. IEEE Trans. Image Process. 11(7), 783–789
(2002)
11. Meng, H.: Nucleus and cytoplast contour detector of cervical smear image. Patt. Recogn.
Lett. 29(9), 1441–1453 (2008)
12. Genc, M.: Unsupervised segmentation and classification of cervical cell images. Patt. Recogn.
45(12), 4151–4168 (2012)
13. Zhang, L.: Automation-assisted cervical cancer screening in manual liquid-based cytology
with hematoxylin and eosin staining. Cytometry Part A. 85(3), 214–230 (2014)
14. Mao, Y.: Edge enhancement nucleus and cytoplast contour detector of cervical smear images.
IEEE Trans. Syst. Man Cybern. 38(2), 253–366 (2008)
15. Huang, Q.K.: A method based on watershed algorithm for core particles image segmen-
tation. In: 2010 3rd IEEE International Conference on Computer Science and Information
Technology (ICCSIT), pp. 408–410 (2010)
16. Li, K., Lu, Z., Liu, W., Yin, J.: Cytoplasm and nucleus segmentation in cervical smear images
using radiating GVF snake. Patt. Recogn. 45(4), 1255–1264 (2012)
17. Mei, L.: Adaptive image segmentation algorithm under the constraint of edge posterior
probability. IET Comput. Vis. 11(8), 702–709 (2017)
18. Bergmeir, C.: Segmentation of cervical cell nuclei in high-resolution microscopic images:
a new algorithm and a web-based software framework. Comput. Meth. Programs Biomed.
107(3), 497–512 (2012)
19. Aniel, G.B.: Automated marker identification using the radon transform for watershed
segmentation. IET Image Proc. 11(3), 183–189 (2017)
20. Harandi, N.M.: An automated method for segmentation of epithelial cervical cells in images
of ThinPrep. J. Med. Syst. 34(6), 1043–1058 (2010)
21. Wang, T., Huang, J.J., Zheng, D.Q., He, Y.J.: Nucleus segmentation of cervical cytology
images based on depth information. IEEE Access. 8, 75846–75859 (2020)
22. Bamford, P., Lovell, B.: Unsupervised cell nucleus segmentation with active contours. Signal
Process. 71(2), 203–213 (1998)
23. Al-Kofahi, Y.: Improved automatic detection and segmentation of cell nuclei in histopathology
images. IEEE Trans Bio-Med Eng. 57, 841–852 (2010)
24. Plissiti, M.E., Nikou, C., Charchanti, A.: Automated detection of cell nuclei in pap smear
images using morphological reconstruction and clustering. IEEE Trans. Inf Technol. Biomed.
15(2), 233–241 (2011)
25. Plissiti, M.E., Nikou, C., Charchanti, A.: Combining shape, texture and intensity features for
cell nuclei extraction in pap smear images. Pattern Recogn. Lett. 32(6), 838–853 (2011)
26. Jung, C.: Unsupervised segmentation of overlapped nuclei using bayesian classification. IEEE
Trans. Biomed. Eng. 57(12), 2825–2832 (2010)
Source Code Author Identification Method
Combining Semantics and Statistical Features

Xu Sun1 , Yutong Sun1 , Leilei Kong2(B) , Yong Han2 , and Hui Ning3
1 Heilongjiang Institute of Technology, Jixi, Heilongjiang, China
2 Foshan University, Foshan, Guangdong, China
3 Harbin Engineering University, Harbin, Heilongjiang, China

Abstract. The globalization of information sharing has made copying ever
easier, and the endless stream of plagiarism has aroused wide attention in aca-
demic circles; related research on plagiarism detection has become a hot
topic in recent years. Taking deep learning-based plagiarism detection modeling
as the research object and improving the performance of the plagiarism detection
system as the research objective, this paper conducts an in-depth study on the task
of source code author identification in internal plagiarism detection. In the task
of source code author identification, text features based on pre-trained language
model can be used to model the semantic information of code fragments. How-
ever, it still lacks in the representation of complex full-text statistical features at
the level of text granularity. Therefore, this paper proposes a source code author
identification method that integrates semantic and full-text statistical features and
combines statistical and semantic features to build a source code writing style
model to realize accurate identification of author identity. Experimental results
on AI-SOCO datasets show that the proposed modeling method is superior to the
statistical and single semantic feature models.

Keywords: Plagiarism detection · Author identification · Pre-training language model

1 Introduction
This paper focuses on source code author identification in internal plagiarism detection.
The task is to identify and test the authorship of source code texts by constructing
an author writing style model, given a predefined set of source code texts and their authors
[1]. This paper focuses on the positive role of semantic features in constructing the
document representation and realizes a semantic representation of source code. However,
because of the particularity of the source code programming language, extracting
semantic features alone is not enough to model the author's writing style. Therefore, this paper
proposes a source code author identification method that combines semantic features and
complex full-text statistical features to accurately predict the identity of the source code
author within a deep learning algorithm. Deep learning aims to automatically analyze and
extract deep feature representations of the data under study. Based on the working prin-
ciple of deep learning, researchers proposed the pre-trained language model. The
emergence of the pre-trained language model has brought a milestone change to the NLP
research field, pushing the development of natural language processing technology into
a new era. The BERT pre-trained language model [2], released by Google at the end
of 2018, is the most noteworthy, showing remarkable performance on 11 kinds of NLP
tasks. The BERT pre-trained language model completes its training in advance on
a massive unsupervised text corpus. The model can capture the semantic, syntactic,
positional, and other structural information in natural language text
and can then be fine-tuned for a specific research task.
This article introduces the pre-trained language model for in-depth study to further
advance research tasks related to plagiarism detection and to make plagiarism detec-
tion models more rational, so that the research results of this paper can promote the
development of plagiarism detection research in the field of natural language processing. It has
certain practical significance.

2 Related Work

2.1 Relevant Methods of Source Code Author Identification

Given a predefined set of source code texts and their authors, the focus of the source code
recognition task is to identify the authorship of test set texts by building an
author style model. In experiments, the identification of source code authors is usually
formalized as a potential plagiarism detection problem, and plagiarism is identified by
extracting text features and analyzing the author's writing style and personal language
characteristics.
Current work uses different techniques to accurately identify authors by improv-
ing the quality of features captured from source code text. In terms of representation,
most methods still use the AST algorithm, N-gram word frequency statistics, and other
models to transform text into feature vectors and model the author's writing style with
a classification model or neural network. For example, Yang et al. used Word N-grams,
Character N-grams (N = 2–5), and a Word + Character combination as features and fed
them into a logistic regression classifier for modeling.
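A hedged sketch of this kind of n-gram pipeline with scikit-learn is shown below; the toy corpus, the character n-gram range, and the classifier settings are illustrative assumptions, not the cited authors' exact configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: two authors, one snippet each (placeholders, not AI-SOCO data)
codes = ["int main() { return 0; }", "for (int i = 0; i < n; ++i) s += i;"]
authors = ["author_A", "author_B"]

clf = make_pipeline(
    TfidfVectorizer(analyzer='char', ngram_range=(2, 5)),  # character n-grams
    LogisticRegression(max_iter=1000),
)
clf.fit(codes, authors)
print(clf.predict(["int main() { return 1; }"]))  # style-based prediction
```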
In addition, some researchers propose that feature extraction methods based on source
code text, coding style, and abstract syntax tree can also represent the author’s writing
characteristics. For example, Pellin et al. [3] used the model to train the abstract syntax
tree (AST) representation extracted from the code. They applied the support vector
machine classifier to determine the source code author. Hugo A et al. [4] chose to
extract the coding features in the source code to find information about the text author to
predict the author's real identity. Rehana et al. extracted style-based and
category-based features from the source code and used a support vector machine, a Gaussian function,
and the M5 algorithm for prediction.

2.2 Improvement of Traditional Feature Representation

However, feature representation methods based on traditional word frequency statis-
tics and text coding structure are weak at extracting the semantic features of source code
text and cannot capture its writing characteristics. With the deepening
of research, researchers gradually introduce deep learning into the source code author
recognition task and use the deep learning model to capture effective features based on
source code text independently. For example, Alsulami et al. [5] used LSTM and BiL-
STM deep learning models to learn and capture the grammatical features contained in
texts and information representations at different levels of granularity. Abuhamad et al.
[6] applied word frequency-inverse document frequency (TF-IDF) features and word
embedding representations with convolutional neural networks (CNNs) to identify the
author of source code text. However, in practice, the prediction
accuracy of the deep learning model is far lower than the expected value, indicating that
a simple deep learning representation model cannot solve the problem of low author
recognition performance based on source code text.
Given the shortcomings of the above methods, this paper starts from the
differences and particularities of source code text compared with natural language text
and re-models the source code author recognition task, achieving a substantial
improvement in the accuracy of author identification.

3 Method

The focus of source code author identification is, given a set of predefined source code texts
and their authors, to test the authorship of a text by building an author writing
style model [1]. At present, most mainstream methods use traditional statistical
models to transform text into feature vectors, but due to the particularity of
programming languages compared with natural language, the feature information
captured by such models is far from sufficient to describe an author's unique
writing style, so the experimental results fall far below expectations. In view of this,
this paper proposes a source code author identification method that integrates
semantic and full-text statistical features to increase the ability to extract
complex full-text statistical features at the text granularity level and accurately predict
the author identity of source code text.

3.1 Analysis of Source Code Author Identification

Source code author identification means that, given a set of source code documents
and their authors, a model of each author's unique writing style is constructed,
and the authorship of a source code text in the test data set is identified
according to the matching degree of writing styles. Based on the research method of
internal plagiarism detection, this paper formally defines source code author
identification as the problem of determining potential plagiarism: plagiarism is identified by
analyzing the similarities and differences of writing styles and individual
language characteristics in source code texts.
This paper first attempted to construct a writing style model using full-text semantic fea-
tures alone, but the experimental results were far below expectations. The reason is that the
programming language used in source code text differs from natural language in
several respects.

First of all, there are few keywords in a programming language, and the vocabulary
dictionary formed from it is small. However, the pre-trained language model
is usually trained on a massive unsupervised corpus that covers
almost all of the linguistic information in English.
Because the vocabulary of a programming language is far smaller than the
English vocabulary learned by the pre-trained language model, the semantic
features captured by the BERT model in the source code feature representation
cannot fully express the author's writing characteristics.
Second, programming languages contain a large number of fixed character sets and
expression rules, such as the While loop, which is always followed by a
Boolean expression (e.g., while (i != 100)). The masked language model (Masked LM) in the BERT
pre-training procedure randomly masks 15% of the words in the input corpus
and predicts the original masked words from the learned word meanings
and the context on the left and right sides. In fact, however, Boolean expressions
are somewhat random, and there is no semantic relationship between a Boolean expres-
sion and its While statement, so the model cannot recover a fully masked expression.
Compared with natural language, the semantic relationships between words in source
text are relatively weak.
The above analysis shows that although a single semantic feature model can
model the semantics of code fragments well, it ignores the importance of the
statistical features of source code text for establishing the author's unique writing style
model and cannot represent complex full-text statistical features at the text granularity.
This paper therefore also looks for feature information that can accurately identify
the writing style of source code text, solves the representation-selection problem
for source code text, and describes the author's writing style and personal language
characteristics in more detail.

3.2 Source Code Author Identification Model Integrating Semantic and Full-Text
Statistical Features
Feature Representation Method Integrating Semantics and Full-text Statistics. This paper
proposes that the semantic features, representation information, and full-text statistical
features of source code text should all be fully considered in feature construction. The
corresponding feature representation methods for source code text are described
below.
There are semantic connections between words in source code. The Trans-
former mechanism in the BERT pre-trained language model is used to learn embeddings of
the characters, words, and sentences in the source code, and its
powerful feature extraction ability captures the semantic features of the source code.
A SoftMax operation is performed on the output vector to calculate the probability
distribution and output the prediction probability for each source code author, and the
argmax function extracts the maximum probability to obtain the predicted author tag.
Since this paper integrates several kinds of features, including semantic information,
to build the author's writing style model, the author prediction probability that the model
outputs for each source code text is expressed as an N-dimensional probability vector,
where N is determined by the order of the given authors in the source code data set.
For all source code texts, these vectors form a prediction probability matrix, which
represents the semantic features of the source code texts.

Statistical Features of Full-Text Vocabulary Based on Word and Character N-gram.


Based on the traditional statistical learning method, this paper uses the combination of
Word N-grams and Character N-grams in the source code text as the lexical statistical
features and uses the TF-IDF model to filter out effective lexical statistical
features.
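A possible realization of this combined Word + Character TF-IDF representation with scikit-learn is sketched below; the n-gram ranges loosely follow the settings explored later in Table 2 but remain tunable assumptions, and the toy corpus is a placeholder.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion

codes = ["int main() { return 0; }", "while (x != 100) { x++; }"]  # placeholders

lexical_features = FeatureUnion([
    ('word', TfidfVectorizer(analyzer='word', ngram_range=(1, 4))),  # word n-grams
    ('char', TfidfVectorizer(analyzer='char', ngram_range=(2, 8))),  # char n-grams
])
X = lexical_features.fit_transform(codes)  # sparse matrix of TF-IDF features
```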

Full-Text Statistical Features Based on Source Code Structure. An author will
automatically bring personal language characteristics into the code and show certain
regularities in the use of sentences and symbols. For example, when writing loop statements,
one author may only use the for loop, and comments (time, name) may be added when writing
code. This paper conducts a quantitative analysis of the source code text using methods
based on mathematical statistics to identify quantitative regularities of
structure and style in the document [7], as shown in Table 1 (a small counting sketch follows the table).

Table 1. Structural features of source code

Length metric features                 Complexity measurement features
Number of source code lines            Number of for loops
Number of classes per file             Number of if clauses
Length of class names                  Number of if-else clauses
Average number of capitalized words    Number of identifiers
Number of times various parentheses    Measures of algorithm complexity
are used

3.3 Research Framework of Source Code Author Identification Task


The task of source code recognition can be described as follows: given a training set
D = {(T_1, T_2, ..., T_n), A_i}, where A_i represents the i-th author and (T_1, T_2, ..., T_n)
the n source documents written by that author, the goal is to analyze the n documents
of each author, construct a model of that author's unique writing style, and identify
the author of a text by matching the writing style reflected in its source code.
The key to solving the above problem is how to build a personal writing style model,
based on a feature representation method for source code text and the feature
representation information captured from the text. In the training data, the authorship
A_i of each document is known, so this paper needs to learn a model M that predicts the
authorship of source code from the authorship tags and the feature representation
of the source code text. Let T_i be the training text, A_i the author
identity tag, and Y_i the individual writing style model of each author. The framework
of the source code recognition algorithm is shown in Fig. 1.

Fig. 1. Algorithm framework diagram of source code recognition task

When a new set of source texts is given, the author identity of each source text is
predicted by the learned model M, which extracts the writing style embodied in the text.
The modeling process for the source code author identification task is described below.

3.4 Modeling of Source Code Author Identification Task

In this paper, full-text statistical features based on vocabulary, full-text statistical fea-
tures based on code structure, and deep semantic features based on the source code text are
captured as feature representation information to comprehensively describe the author's
writing style and personal language characteristics. An MLP can overcome the weakness
that single-layer perceptrons cannot handle linearly inseparable data and has strong gener-
alization ability in prediction. Therefore, this paper uses an MLP neural network
to construct the source code author identification model that integrates semantic features
and full-text statistical features.
To facilitate the mapping of text features in the neural network, this paper
simply concatenates the semantic features and full-text statistical features
into a vector of fixed dimension that represents the text.
In the experiment, this vector representation is fed into the
deep neural network built with the MLP to construct the source code author identification
model combining semantic and full-text statistical features.
In the MLP neural network used in this paper, the input layer receives
the input feature representation vector, and the hidden layer is fully connected to the
input layer. By learning from the input feature vector, an output feature representation vector
is produced, and the label probability distribution is then calculated by the Softmax
function of the output layer to obtain the final matching result. Each layer
adopts a different activation function. With the input represented by the vector x,
the hidden layer output is y(w_1 x + b_1), where w_1 is the connection weight and b_1
the bias, and the Sigmoid function is used as the activation function:

Sigmoid(x) = \frac{1}{1 + e^{-x}}   (1)

The output of the output layer is f(w_2 x_1 + b_2), where x_1 = Sigmoid(w_1 x + b_1) is the
output of the hidden layer. The function f can be the Softmax function, shown in Formula
(2), where x_i represents the output of the i-th hidden neuron and n the number of
neurons in the hidden layer:

Softmax(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}   (2)

Based on the above analysis, as shown in Fig. 2, a task model of source code author iden-
tification that incorporates semantic and full-text complex statistical features is estab-
lished. When given a new set of source code texts, the author identity of source code
texts can be predicted by judging the matching degree between the modeling author’s
writing style and the source code text’s writing style through MLP neural network.

Fig. 2. Model framework diagram of source code identification task
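A minimal sketch of this fused model with scikit-learn is given below; the random arrays stand in for the real semantic, lexical, and structural features, the hidden-layer size is an assumption, and the 'logistic' activation plays the role of the Sigmoid of Formula (1), with a softmax output over authors as in Formula (2).

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
semantic  = rng.random((10, 3))    # stand-in for the BERT probability matrix
lexical   = rng.random((10, 20))   # stand-in for TF-IDF n-gram features
structure = rng.random((10, 6))    # stand-in for normalized structural counts
author_labels = rng.integers(0, 3, size=10)

X = np.hstack([semantic, lexical, structure])  # one fixed-length vector per text
mlp = MLPClassifier(hidden_layer_sizes=(256,), activation='logistic', max_iter=500)
mlp.fit(X, author_labels)  # softmax over author labels at the output layer
```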

The specific steps of the experiment can be described as follows:


Extract the full-text statistical features of the source code text. For the given training
data composed of source code written by different authors, N-gram features based
on Word and Character are extracted to express the word-level full-text statistical
features of the source code text, and the author's personal characteristics and writing
habits in the process of writing source code are analyzed quantitatively.
Extract the deep semantic features of the source code. The source code text is fed into the
BERT pre-trained language model for feature learning and extraction. Softmax is
applied to the output feature representation vector, and the resulting author
prediction probability matrix represents the deep semantic features of the
text.
Model the author's writing style. The source code feature representations
and author identity sequences captured from the training and validation data sets are fed
into the MLP neural network, and the unique writing style model of each author
in the source code text is established.

Predict source code authorship. After feature extraction, the test data set is fed into
the MLP neural network, and the writing style obtained from the test source code
text is matched against the established author style models to predict the author
identity sequence of the source code.

4 Experimental Results and Analysis

4.1 Data Sets and Evaluation Indicators

The AI-SOCO data set provided by FIRE 2020 evaluation conference was used in the
source code identification experiment. In the data set, 1000 users in the Codeforces
online evaluation system were selected, and 100 compilable source codes submitted by
each user were collected. The source codes were written in C++ language. For each user
(UID), all the collected source code comes from a different problem. The total number
of source documents is 100,000, which are divided into Train (training data set), Dev
(validation data set: for tuning model parameters), and Test (Test data set).
The source code identification model will be evaluated and ranked according to the
Accuracy index, which is defined as follows:

Accuracy = Number of correctly classified documents / Total number of documents   (3)

4.2 Baseline Method Analysis


The FIRE 2020 evaluators provided baseline methods for reference.
The first baseline, RoBERTa Tiny, uses the source code text from the
training data set directly to pre-train a RoBERTa model with a single Transformer layer and
12 attention heads with default parameters; its accuracy on the test data set is 0.8746.
The RoBERTa Tiny-96 method increases the number of attention heads
from 12 to 96, with a batch size of 16.
The third baseline, the UoB method [8], won first place on the AI-SOCO test data released
by the FIRE 2020 evaluation. The team modeled n-gram features at the byte level: each
source code is encoded as a binary vector over the 20,000 most common Character
6-gram features in the
training data set, which is trained in a neural network classifier and used for the final
author identity prediction.

4.3 Experimental Results and Analysis

This section reports the experimental results of source code author identification and
analyzes them in detail, with accuracy as the evaluation index.
Table 2 shows the prediction results using
only the lexical information of the N-gram statistical model as the feature repre-
sentation vector. Table 3 describes the experimental results using different combinations
of text features.

Table 2. Accuracy of N-gram lexical statistical features

N-gram features              Accuracy   N-gram features                    Accuracy
Wo-1+2ngram + tfidf          0.8084     Char-2+3+4g + tfidf                0.9301
Wo-1+2+3ngram + tfidf        0.8230     Char-2+3+4+5g + tfidf              0.9327
Wo-1-4ngram + tfidf          0.8398     Char-2+3+4+5+6g + tfidf            0.9348
Char-2g + tfidf              0.9013     Ch-2+3+4+5+6+7g + tfidf            0.9356
Char-2+3g + tfidf            0.9240     Ch-2+3+……+7+8g + tfidf             0.9413

Table 3. Experimental results of the author identification task

Experimental method   Feature combination               Experimental result
AI-MLP1               Semantic features                 0.9222
AI-MLP2               Semantic + structure features     0.9370
AI-MLP3               Semantic + n-gram features        0.9543
AI-MLP4               Semantic + n-gram + structure     0.9720

Considering that the source code text of a programming language, with its small
vocabulary and fixed writing formats, is difficult to model with semantic features alone,
this paper expects the full-text statistical features at the text
granularity level to capture information about the author's writing characteristics
beyond the deep semantic features.
Therefore, the AI-MLP3 method combines the semantic features of the text with
n-gram features to investigate the effect of integrating semantic and lexical statistical
features of source code. The experimental results show that this method improves
on the AI-MLP1 method. AI-MLP2 combines semantic features with statistical
features of the code structure to verify whether these structural statistical features have
a positive effect on experimental performance. When writing code, an author exhibits
certain personal language characteristics; based on this observation, this paper extracts the
structural statistical features of the source code and normalizes the indexes. According
to the experimental results, AI-MLP2 is about 1.5% higher than AI-MLP1, which
proves the validity of the code structure statistical features. However,
the effect is weak, and deeper quantitative analysis of the text is still needed. The
AI-MLP4 method reached 0.9720, the highest among the four groups of experiments:
the combination of semantic features, lexical statistical features,
and code structure statistical features captures the author's personal writing
style information more fully and identifies the author more accurately.

Finally, Table 4 compares the experimental results of the best-performing method
proposed in this paper with the baseline methods.

Table 4. Comparison of experimental results between the AI-MLP4 method and the baseline methods

No.   Experimental method   Experimental result
1     RoBERTa Tiny          0.8746
2     RoBERTa Tiny-96       0.9288
3     UoB                   0.9511
4     AI-MLP4               0.9720

Table 4 compares the experimental results of the AI-MLP4 method with the three base-
line methods RoBERTa Tiny, RoBERTa Tiny-96, and UoB. RoBERTa Tiny and
RoBERTa Tiny-96 are the baselines given by the FIRE 2020 evaluators;
both adopt a single BERT-style semantic model and only adjust model parameters to
obtain the final predictions, ignoring the lexical and
structure-based statistical features of the source text.
The experimental results in Table 4 also verify the view that “the deep semantic
features and full-text statistical features in the source code text play a common role in
constructing the author’s writing style model and author identification”.

5 Conclusion
The semantic text features based on a pre-trained language model can model
the semantics of code fragments well but cannot represent complex full-text statistical
features at the text granularity.
In this paper, we propose a source code author recognition method that integrates
semantic and statistical features. We pay attention to the positive role of semantic features
in constructing source code representation and consider the importance of statistical
features in fixed character sets of source code for source code recognition according to the
particularity of source code compared with natural language. The author’s writing style
model is constructed by combining statistical information with semantic information, and
the author's identity of the source code can be accurately identified by the MLP neural network.
The experimental results show that the accuracy on the AI-SOCO test data set provided
by FIRE 2020 reaches 0.9720, about 2.1% higher than the strong baseline method, which
verifies the effectiveness of the approach. The proposed feature fusion model is statistically superior to
several advanced source code author recognition models, including traditional statistical
models and single semantic feature models based on deep learning.
The integration of complex full-text statistical features provides a new idea that
semantic features of modeling cannot fully extract text representation information and
advances the research on the basic issues of internal plagiarism detection.

Acknowledgment. This work is supported by the Social Science Foundation of Heilongjiang Province (No. 210120002).

References
1. Fadel, A., Musleh, H., Tuffaha, I.: Overview of the PAN@FIRE 2020 task on the author-
ship identification of SOurce COde. In: FIRE 2020–12th Forum for Information Retrieval
Evaluation, pp. 649–676 (2020)
2. Devlin, J., Chang, M.W., Lee, K.: BERT: pre-training of deep bidirectional transformers for
language understanding, pp. 4171–4186 (2018)
3. Pellin, B.: Using classification techniques to determine source code authorship. In: Department
of Computer Science (2000)
4. Hugo, A., Bogotá, C.: Personality recognition applying machine learning techniques on source
code metrics. In: FIRE Forum for Information Retrieval Evaluation, pp. 25–29, India (2016)
5. Alsulami, B., Dauber, E., Harang, R.: Source code authorship attribution using long short-
term memory based networks. In: European Symposium on Research in Computer Security,
pp. 65–82 (2017)
6. Abuhamad, M., Rhim, I., AbuHmed, T.: Source code authorship identification using convolu-
tional neural networks. Future Gener. Comput. Syst. 95, 104–115 (2019)
7. Alvi, F., Stevenson, M., Clough, P.D.: Hashing and merging heuristics for text reuse detection.
In: 2014 Cross Language Evaluation Forum Conference, pp. 939–946 (2014)
8. Crosby, A., Tayyar, H., Tayyar, M.H.: UoB at AI-SOCO 2020: approaches to source code
classification and the surprising power of N-grams. In: The 12th meeting of the Forum for
Information Retrieval Evaluation, pp. 677–693 (2020)
The Improvement of Attribute Reduction
Algorithm Based on Information Gain Ratio
in Rough Set Theory

Wenjing Wang1 , Min Guo1 , Tongtong Han1 , and Shiyong Ning1,2(B)


1 Harbin University of Commerce, Harbin 150028, China
101102@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. Because data sets are varied and data types are cumbersome and
diverse, they inevitably contain many redundant attributes, which greatly increases
classification time in the context of rough set theory. In this paper, we
improve the attribute reduction algorithm using the information gain ratio. The data sets
obtained after attribute reduction by this method are used for classification, and
the results are compared with directly classifying the original data sets using other common
classification methods. The experimental data verify that the improved algorithm in
this article is effective: it greatly improves classification speed and shortens
the time spent.

Keywords: Attribute reduction · Fuzzy rough sets · Information gain ratio

1 Introduction

Attribute reduction occupies a very important position in intelligent
information and data processing and is one of the core contents of rough set theory. As an
efficient mathematical method, it is very useful for handling inaccurate and incomplete data,
especially in the field of data mining, where it has achieved great success [1].
In the field of rough sets [2], attribute reduction is a hot research problem. It is
generally used in the preprocessing stage of data mining and has become one of the
most important feature selection methods. Its main idea is to delete irrelevant features
and attributes without reducing the classification ability of the data, removing the
large amount of redundant information contained in the data so as to reduce the classification
time [3].
As an example, consider a doctor confirming whether a patient has cancer. A patient
may have dozens or even hundreds of physical examination indicators, but
not all of these indicators are related to cancer; we need to select the key physical
examination indicators to determine whether the patient has cancer.


2 Basic Theory
In the field of rough sets, knowledge is represented by an information system. Knowledge
is the sum of the knowledge and experience acquired by human beings in the practice of
transforming the objective world [4]. There are mainly two types of knowledge repre-
sentation systems: one is the information table, which is the knowledge representation
system without decision attributes, and the other is the decision table, which is the knowl-
edge representation system with decision attributes. Generally, the columns in the table
mark different attributes, the rows in the table mark the objects of the universe. Decision
table reduction is also called relative reduction of knowledge. Now we will introduce
some important related concepts in some rough sets.

Definition 1 (Sub-cluster). Given a set X with power set 2^X, a sub-cluster is a collection
of elements of the power set; that is, each member C of a sub-cluster is
a subset of X. In rough set theory, we are only interested in those sub-clusters that
form a partition or cover of the universe U.

Definition 2 (Equivalence relations and equivalence classes). If the relation R on the
set A satisfies (1) reflexivity, (2) symmetry, and (3) transitivity,
then R is said to be an equivalence relation on the set A [5].

Definition 3 (Partition) [6]. Given a universe U and a family of subsets ζ = {A_1, ···, A_m}
of U, if ① A_i ≠ Ø and ② U = \bigcup_{i=1}^{m} A_i are satisfied, then ζ is called a
partition of the set U, denoted Par_π(U) or π, and each element A_i is called a block of π.

Definition 4 (Relation matrix). Suppose X is a non-empty finite set and R is an
equivalence relation. We define the relation matrix M(R):

M(R) = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{pmatrix}

The following similarity function is used to calculate the equivalence relation:

r_{ij} = \begin{cases} 1 - 4 \cdot \frac{|x_i - x_j|}{|a_{max} - a_{min}|}, & \frac{|x_i - x_j|}{|a_{max} - a_{min}|} \le 0.25 \\ 0, & \text{otherwise} \end{cases}   (1)
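A small numpy sketch of this relation matrix, assuming a single numeric attribute column, is given below; it reproduces the matrix R1 of the worked example in Sect. 3.2.

```python
import numpy as np

def relation_matrix(x):
    """Relation matrix M(R) for one numeric attribute, per Formula (1)."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()                     # a_max - a_min
    d = np.abs(x[:, None] - x[None, :]) / span   # normalized differences
    return np.where(d <= 0.25, 1 - 4 * d, 0.0)

# reproduces R1 of Sect. 3.2 for attribute c1 = (18, 13, 22, 13, 16)
print(np.round(relation_matrix([18, 13, 22, 13, 16]), 4))
```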

Definition 5 (Reduction of knowledge). Given a knowledge base [7] K = (U, S) and
a family of equivalence relations P ⊆ S, a subset G ⊆ P is called a reduction of P if G
satisfies the following two conditions:

(1) every R ∈ G is necessary in G;

(2) IND(G) = IND(P).

Obviously, any reduction of knowledge is equivalent to the knowledge itself in its ability
to express any category in the knowledge base; that is, they classify the universe
equally well. Generally speaking, the reduction of knowledge is not unique; there can
be multiple reductions.

Definition 6 (Core of knowledge). Given a knowledge base [8] K = (U, S) and P ⊆ S,
a relation R ∈ P is necessary in P if IND(P − {R}) ≠ IND(P). The set of all necessary
knowledge in P is called the core of P, denoted CORE(P).
The concept of the core has two functions. First, the core can serve as
the computational basis for all reductions, because the core is included in
every reduction of the knowledge and can be computed directly. Second,
the core can be interpreted as the most important part of the knowledge:
it cannot be deleted during knowledge reduction, otherwise the
classification ability of the knowledge would be weakened.

Definition 7 (Information entropy). Given knowledge P and its probability distribution,
the information entropy of knowledge P is defined as:

H(P) = -\sum_{i=1}^{n} p(X_i) \log p(X_i) = -\sum_{i=1}^{n} \frac{|X_i|}{|U|} \log \frac{|X_i|}{|U|}   (2)

Definition 8 (Conditional entropy). Given knowledge P and Q, their respective proba-
bility distributions, and their conditional probability distributions [9], the conditional
entropy of knowledge Q relative to knowledge P is defined as:

H(D|B) = -\sum_{i=1}^{n} p(X_i) \sum_{j=1}^{m} p(Y_j \mid X_i) \log p(Y_j \mid X_i)   (3)

Definition 9 (Mutual information). Given knowledge P and Q, their respective proba-
bility distributions, and their conditional probability distributions [10], the mutual
information between knowledge P and Q is defined as:

I(B; D) = H(D) - H(D|B)   (4)

Definition 10 (Information gain).

Gain(a, B, D) = I(B \cup \{a\}; D) - I(B; D) = H(D|B) - H(D|B \cup \{a\})   (5)

Definition 11 (Information gain ratio).

Gain\_Ratio(a, B, D) = \frac{Gain(a, B, D)}{H(\{a\})} = \frac{I(B \cup \{a\}; D) - I(B; D)}{H(\{a\})}   (6)

3 Improved Algorithm
3.1 Algorithm Principle and Description

Step 1. Initialize B = Ø.
Step 2. Determine the data type of each attribute a: if a is of character type, convert
it to its ASCII value; if a is Boolean, convert it to 0 or 1.
Step 3. For each attribute a not in B, calculate Gain_Ratio(a, B, D).
Step 4. Select the attribute a with the highest Gain_Ratio(a, B, D) and add it to B,
that is, B ← B ∪ {a}.
Step 5. If the maximum Gain_Ratio(a, B, D) is greater than 0, return to Step 3;
otherwise, go to Step 6.
Step 6. Output the set B.
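A crisp, simplified sketch of this greedy procedure follows; it replaces the fuzzy relation matrices of Definition 4 with exact value-equality partitions, so its entropies (and hence the selected reduct) can differ from the fuzzy worked example in Sect. 3.2.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def cond_entropy(keys, labels):
    """H(D|B): weighted entropy of labels inside each equivalence class."""
    n, groups = len(labels), {}
    for k, y in zip(keys, labels):
        groups.setdefault(k, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())

def gain_ratio_reduct(rows, d):
    """Greedy Steps 1-6 with crisp (value-equality) equivalence classes."""
    attrs, B = range(len(rows[0])), []
    key = lambda r, S: tuple(r[a] for a in S)
    while True:
        base = cond_entropy([key(r, B) for r in rows], d)  # H(D|B); H(D) if B empty
        best, best_gr = None, 0.0
        for a in set(attrs) - set(B):
            gain = base - cond_entropy([key(r, B + [a]) for r in rows], d)
            h_a = entropy([r[a] for r in rows])
            gr = gain / h_a if h_a > 0 else 0.0
            if gr > best_gr:                               # Step 4: best ratio
                best, best_gr = a, gr
        if best is None:                                   # Step 5: all ratios 0
            return B                                       # Step 6: output B
        B.append(best)

# Table 1 data; the crisp result may differ from the fuzzy example's {c1, c4}
rows = [(18, False, 'A', 7.6), (13, True, 'B', 7.6), (22, True, 'C', 1.0),
        (13, True, 'D', 6.9), (16, False, 'E', 14.2)]
print(gain_ratio_reduct(rows, [0, 1, 1, 1, 1]))
```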

3.2 An Illustrative Example

In this part, we give a data case for the algorithm in Table 1, where c1 is an integer
attribute, c2 Boolean, c3 character, and c4 float. All four are conditional attributes,
and D = {d} is the decision attribute.

Table 1. A data set

c1 c2 c3 c4 d
x1 18 False A 7.6 0
x2 13 True B 7.6 1
x3 22 True C 1.0 1
x4 13 True D 6.9 1
x5 16 False E 14.2 1

First, we calculate the relation matrix for each attribute. The results are as follows:

R1 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0.1111 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0.1111 & 0 & 0 & 0 & 1 \end{pmatrix}

R2 = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \end{pmatrix}

R3 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}

R4 = \begin{pmatrix} 1 & 1 & 0 & 0.7878 & 0 \\ 1 & 1 & 0 & 0.7878 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0.7878 & 0.7878 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}

d = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 1 \end{pmatrix}

Then, we will select the first attribute. We need to calculate the Gain(a, B, D) of
each attribute. Note that B = Ø in this step.

Gain(c1 , Ø, D) = I ({c1 }; D) = 0.6611

Gain(c2 , Ø, D) = I ({c2 }; D) = 0.3219

Gain(c3 , Ø, D) = I ({c3 }; D) = 0.7219

Gain(c4 , Ø, D) = I ({c4 }; D) = 0.1926

Then, we need to calculate the Gain_Ratio(a, B, D) of each attribute. The gain ratio
of each attribute is:
Gain_Ratio(c1, Ø, D) = Gain(c1, Ø, D) / H({c1}) = 0.6611 / 1.8611 = 0.3552
Gain_Ratio(c2, Ø, D) = Gain(c2, Ø, D) / H({c2}) = 0.3219 / 0.9709 = 0.3315
Gain_Ratio(c3, Ø, D) = Gain(c3, Ø, D) / H({c3}) = 0.7219 / 2.3219 = 0.3109
Gain_Ratio(c4, Ø, D) = Gain(c4, Ø, D) / H({c4}) = 0.1926 / 1.4573 = 0.1322

We need to select the maximum value of the information gain ratio from it, and select
the corresponding attribute. In this example, we should choose c1 and get B = {c1 }. We
continue to choose another attribute.
Gain_Ratio(c2, {c1}, D) = Gain(c2, {c1}, D) / H({c2}) = 0.0 / 0.9709 = 0.0
Gain_Ratio(c3, {c1}, D) = Gain(c3, {c1}, D) / H({c3}) = 0.0608 / 2.3219 = 0.0262
Gain_Ratio(c4, {c1}, D) = Gain(c4, {c1}, D) / H({c4}) = 0.0608 / 1.4573 = 0.0417
We need to select the maximum value of the information gain ratio from it, and select
the corresponding attribute. In this example, we should choose c4 and get B = {c1 , c4 }.
We continue to choose another attribute.
Gain_Ratio(c2, {c1, c4}, D) = Gain(c2, {c1, c4}, D) / H({c2}) = 0.0 / 0.9709 = 0.0
Gain_Ratio(c3, {c1, c4}, D) = Gain(c3, {c1, c4}, D) / H({c3}) = 0.0 / 2.3219 = 0.0
Because the information gain ratio of every remaining attribute is 0, we output
the set B directly; the final result is {c1, c4}.

4 Experimental Analysis
This experiment uses four data sets from the UCI machine learning repository. Each
data set is processed with the above attribute reduction algorithm, and the reduction
results and classification accuracies are compared for the two cases [11].
Table 2 records the size of each data set, the number of attributes, and the
number of decision categories.

Table 2. UCI data samples

Sample data                            Data size   Attributes   Decision categories
Sobar                                  73          20           2
Algerian_forest_fires_dataset_update   123         13           2
Breast_cancer                          569         31           2
Default of credit card clients         30000       24           2

The experimental results are shown in Table 3. The Amounts column records the number of
attributes in each data set after attribute reduction [12]; the Accuracy_1 and Accuracy_2
columns list the SVM classification accuracy before and after attribute reduction,
respectively, and Time_1 and Time_2 the corresponding classification times.

Table 3. Experimental results

Sample data                            Amounts   Accuracy_1   Accuracy_2   Time_1   Time_2
Sobar                                  15        92%          92%          4 s      2 s
Algerian_forest_fires_dataset_update   11        90%          91%          8 s      5 s
Breast_cancer                          26        94%          95%          22 s     16 s
Default of credit card clients         17        92%          94%          1068 s   771 s

5 Conclusion
Generally speaking, the attributes in a knowledge base are not equally important, and
some of the knowledge is unnecessary or redundant [13]. Through
the improvement of the algorithm in this article, a variety of data types can be handled,
which greatly improves the running efficiency of the algorithm.
Attribute reduction is a hot issue worthy of the joint research and efforts of scholars at
home and abroad [14]. We believe that a more elegant algorithm can be proposed to
guide practical applications. As a next step, we will continue to apply this method to data
sets with missing values and further improve the algorithm's efficiency in distributed
environments.

Acknowledgment. The author sincerely thanks the members of the research team for their joint
efforts and the editors and reviewers for their valuable comments.

References
1. Yang, X., Zhang, Q., Sheng, G.: Research on classification of LBS service facilities based
on rough sets neural network. In: Proceedings of the 30th China Conference on Control and
Decision-Making 30(3), 76–81 (2018)
2. Xiao, Z., Chen, L., Zhong, B.: A model based on rough set theory combined with algebraic
structure and its application: bridges maintenance management evaluation. Expert Syst. Appl.
37(7), 5295–5299 (2010)
3. Wang, M., Deng, S.: Research to e-commerce customers losing predict based on rough set.
Appl. Mech. Mater. 1287(5), 164–170 (2011)
4. Afsoon, M., Vahid, S.: A fuzzy-rough approach for finding various minimal data reductions
using ant colony optimization. J. Intell. Fuzzy Syst. 26(5), 2505–2513 (2014)
5. Li, Z., Liu, Y., Li, Q., Qin, B.: Relationships between knowledge bases and related results.
Knowl. Inf. Syst. 49(1), 171–195 (2016)
6. Liang, F., et al.: Recognition algorithm based on improved FCM and rough sets for meibomian
gland morphology. Appl. Sci. 7(2), 192–208 (2017)
7. Dai, J., Xu, Q.: Attribute selection based on information gain ratio in fuzzy rough set theory
with application to tumor classification. Appl. Soft Comput. J. 13(1), 211–221 (2013)
8. Li, Y., Deng, Q.: Research of rough set theory on forecast of power generation for grid-
connected photovoltaic system. Appl. Mech. Mater. 3147(3147), 885–889 (2014)
9. Liu, J., Hu, Q., Yu, D.: A weighted rough set based method developed for class imbalance
learning. Inf. Sci. 178(4), 1235–1256 (2007)
10. Zhang, Y., Ding, S., Xu, X., Zhao, H., Xing, W.: An algorithm research for prediction of
extreme learning machines based on rough sets. J. Comput. 8(5), 1335–1342 (2013)
11. Shu, W., Qian, W., Tang, Z.: An efficient uncertainty measure-based attribute reduction app-
roach for interval-valued data with missing values. Int. J. Uncertain. Fuzziness Knowl.-Based
Syst. 27(6), 931–947 (2019)
12. Li, X., Lu, A.: A rough set theory based approach for massive data mining. Int. J. Signal
Process. Image Process. Pattern Recogn. 11(1), 23–26 (2018)
13. Yan, C., Zhang, H., Nan, T.: Attribute reduction based on generalized orthogonal fuzzy rough
sets. Fuzzy Syst. Math. 34(6), 130–139 (2020)
14. Sheng, K., Wang, W., Dong, H., Ma, J.: Incremental attribute reduction algorithm based on
neighborhood discrimination of mixed data. Electron. J. 48(4), 682–696 (2020)
Accurate Teaching Data Analysis of “Special
Effects Production for Digital Films and TV
Programmes” Based on Big Data

Fan Jing(B)

Department of Information Engineering, Heilongjiang International University, Harbin, Heilongjiang, China

Abstract. With the promotion and implementation of student-centered teaching concepts, traditional teaching methods no longer meet the teaching requirements of the new era. To address the lack of pertinence and accuracy in traditional teaching activities, this paper proposes precision teaching based on big data through the analysis of educational data. It proposes methods to reform teaching in terms of teaching design, teaching practice, and teaching assessment. In addition, data about students, courses, other learning resources, and course assessment are collected and examined through normal distribution analysis, Pearson correlation analysis, and linear regression analysis. It is found that personalized test scores are the main factor affecting students' final learning effect. The interactive teaching factors are not yet perfected, so it is necessary to strengthen the teaching design and embody individualized development.

Keywords: Precision teaching · Normal analysis · Pearson correlation analysis · Linear regression

1 Big Data and Precision Teaching


As a part of big data, big education data refers to a collection of data collected according
to needs during the entire process of educational activities and used for educational
development with great value [1]. Precision Teaching, proposed by Dr. Lindsley in the U.S. in the 1960s, mainly means recording learners' behavior frequency, response speed, and other observable learning behaviors, drawing standardized charts, and then adjusting teaching strategies according to changes in the frequency data on the charts [2, 3].
Although studies on precision teaching abroad have a history of nearly a century, most focus on evaluating the effect of precision teaching through teaching experiments, and research progress has been slow. For instance, studies by Downer [4] and Griffin and Murtagh [1] show that precision teaching can significantly improve students' reading ability. Gallagher [5] and Strømgren et al. [6] study precision teaching in mathematics teaching, and their experimental results show that precision teaching can significantly improve the performance of students with slow learning.
Domestic research on precision teaching is in its infancy. As of August 2021, a search
has been conducted on CNKI with “precision teaching” as the title and keywords, and a


total of 884 documents have been obtained. These documents can be divided into four categories in terms of research direction. The first category analyzes the basic theory and model design of precision teaching from the perspective of informatization teaching reform combined with smart learning, and proposes a method for precise target determination based on recursive thinking [7]; the second category introduces the basic concepts, operating procedures, and application value of precision teaching from the perspective of theoretical introduction [8]; the third category proposes a precision teaching model and its practical strategies under the flipped classroom [9]; and the fourth category uses digital media technology to focus accurately on students by extracting facial expressions and posture features, a technical solution derived from precision teaching [10]. Through this review, it is found that current research on precision teaching has the following two problems:

(1) There are many theoretical analyses of precision teaching but few analyses of precision teaching through big data;
(2) There are many methods for teaching reconstruction, but few analyses of whether there are links between the various teaching designs.

2 Precision Teaching Mode Based on Big Data

“Special Effects Production for Digital Films and TV Programmes” is a compulsory professional course for digital media technology majors, aiming to train students to master the core skills of post-production special effects. Through seminars, comprehensive research, independent learning, blended learning, and other teaching methods, students develop practical problem-solving skills, critical and creative thinking, and other higher-order thinking skills. In addition, they master self-control and autonomous learning in the online learning environment, as well as communication methods for non-face-to-face cooperative learning. This teaching reform is task-driven and carries out mixed teaching, using the Wisdom Tree teaching platform to supervise the teaching process before, during, and after class. It also analyzes the data of process evaluation and overall evaluation to accurately find the factors that affect the teaching effect and the deficiencies in the teaching design, thereby improving the quality of teaching (Fig. 1).

Fig. 1. Precision teaching mode based on big data



3 Learning Analysis
This course is designed for sophomore students majoring in digital media technology. On the whole, these students have a solid foundation of basic professional knowledge, strong motivation for professional learning, and independent learning ability, and they urgently need comprehensive practical knowledge and skills for future employment and industrial development. At the individual level, the students have varying levels of curriculum knowledge: some have relatively high levels, while most have almost no operational foundation. From the perspective of the online teaching environment, pre-class surveys show that students are equipped with basic online learning conditions such as laptops, mobile phones, software, and wireless networks, so mixed online and offline teaching can be carried out successfully.

4 Teaching Contents
According to the set teaching goals, the author reconstructs the teaching knowledge system based on years of teaching experience and on the content taught at other colleges and universities. Based on the idea of taking students as the center and cultivating learning ability as the main goal, the teaching content is divided into the following structure (Fig. 2):

Fig. 2. Teaching contents of the “Special Effects Production for Digital Films and TV Programmes” course

5 Construction of Teaching Resources


Based on the reconstructed curriculum knowledge points, the author meticulously prepares diversified teaching and learning resources, and the organization of teaching

resources mainly reflects the characteristics of integrated teaching. This is conducive to mixed teaching that integrates classroom teaching and online learning, and it pays attention to resource support for students' independent learning after class. The resources of this course include textbook resources and Wisdom Tree self-built resources; the latter comprise teachers' lecture videos, homework production videos, and resource expansion videos. Abundant video resources can broaden students' horizons and improve their ability to learn independently.

6 Teaching Methods
Task-driven teaching: by letting students watch the effect video, the task completion effect is reflected intuitively; through the teacher's questions, students take the initiative to construct a learning cycle of inquiry, practice, thinking, application, and problem-solving.
Problem-oriented inquiry and discussion: students continuously explore and innovate while solving problems posed by both teachers and students.
Group teaching: through group tasks, students' teamwork, language, and innovative thinking abilities are improved.

7 Course Characteristics and Innovation


7.1 Diversified Process Evaluation Assists to Improve Students’ Learning
Incorporating students' cognition, emotions, values, and other content reflects the humanity and diversity of the evaluation. It is necessary to comprehensively adopt results evaluation, process evaluation, dynamic evaluation, and other methods to formulate more refined and systematic evaluation indicators, so as to promptly reflect the growth and success of students and the degree of integration of knowledge transfer and value guidance in the curriculum.

7.2 Extend Learning Outside Class to Achieve Interconnection


(1) Encourage students to participate in relevant contests to promote their knowledge, skills, and thinking level.
(2) Flexibly apply learned knowledge in extracurricular practice, give full play to technical advantages, and serve the school's media promotion, thus improving students' practical ability and thinking level.

8 Teaching Design
The real subject of “student-centered” teaching process management is the students. Student behavior occurs not only in the classroom but throughout the entire teaching process before, during, and after class. Students should have more time to complete independent learning and knowledge system construction through practice before and after class, and to cultivate their creative thinking, teamwork, and problem-solving ability in extracurricular practice. As the supervisor of student behavior, teachers should formulate a complete implementation strategy and monitoring system to escort students through all aspects of the teaching tasks.

9 Effect Analysis
The teaching evaluation of this course comprises the usual grades (50%) and end-term performance (50%). The usual grades include attendance (5%), independent study (15%), and homework (30%); end-term performance includes group work (40%) and a personality test (60%). The usual grades reflect the result of process evaluation, and end-term performance reflects personal ability and teamwork ability. The data for this analysis come from the learning records on the Wisdom Tree teaching platform and the offline performance of 43 students majoring in digital media technology in the class of 2019.

9.1 Normal Distribution Analysis

In the statistical analysis of the data, the P-P chart can be used to visually check whether the data are normally distributed. The usual grades, end-term performance, and total grades of this course all satisfy the normal distribution, as shown in Figs. 3, 4, and 5.

Fig. 3. Normal P-P chart of normal performance

Fig. 4. Normal P-P chart of end-term performance



Fig. 5. Normal P-P chart of total performance
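
For readers who wish to reproduce this kind of check, the sketch below shows one way to draw a normal P-P chart and run a companion normality test in Python; the grades array is hypothetical sample data, not the course data used in this paper.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from statsmodels.graphics.gofplots import ProbPlot

# Hypothetical sample of grades (assumed data, for illustration only).
grades = np.array([78, 85, 69, 72, 90, 81, 66, 74, 88, 77], dtype=float)

# P-P plot: cumulative probabilities of the data against a fitted normal.
ProbPlot(grades, fit=True).ppplot(line='45')
plt.title('Normal P-P chart')
plt.show()

# A formal normality test as a numeric complement to the visual check.
stat, p = stats.shapiro(grades)
print(f'Shapiro-Wilk: W={stat:.3f}, p={p:.3f}')  # p > 0.05 suggests normality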

9.2 Pearson Correlation Analysis


Pearson correlation analysis is used to study whether there is a relationship between quantitative variables and how close that relationship is.

(1) Correlation between normal performance and end-term performance

On the basis of the normality analysis, this paper analyzes the correlation between normal performance and end-term performance to demonstrate the importance of process evaluation (Tables 1 and 2).

Table 1. The Pearson correlation between normal performance and end-term performance

                      Average value  Standard deviation  Normal performance  End-term performance
Normal performance    78.186         9.189               1
End-term performance  72.930         5.930               0.339*              1

* p < 0.05  ** p < 0.01

As can be seen from the table above, correlation analysis is used to study the relationship between usual performance and end-term performance, with the Pearson correlation coefficient indicating the strength of the correlation. The coefficient between usual performance and end-term performance is 0.339, significant at the 0.05 level, which shows a significant positive correlation between usual and end-term performance.
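
As a hedged illustration of this step, the snippet below computes a Pearson coefficient and its p value with SciPy; the two score lists are invented sample data, not the 43 students' records.

from scipy import stats

normal_perf = [80, 75, 90, 68, 85, 72, 78, 88]    # assumed sample values
endterm_perf = [74, 70, 82, 65, 78, 69, 73, 80]   # assumed sample values

r, p = stats.pearsonr(normal_perf, endterm_perf)
print(f'r = {r:.3f}, p = {p:.3f}')  # r > 0 with p < 0.05 -> significant positive correlation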

(2) Correlation between group work and individual test in end-term performance

Table 2. Correlation between group work and individual tests in end-term performance

                   End-term performance
Group task         0.263
Individual test 1  0.720**
Individual test 2  0.870**

* p < 0.05  ** p < 0.01

As can be seen from the table above, correlation analysis is used to study the relationship between end-term performance and three items: group work, personality test 1, and personality test 2, with the Pearson correlation coefficient indicating the strength of the correlation. The coefficient between end-term performance and group work is 0.263, close to 0, with a p value of 0.080 > 0.05, which shows no correlation between end-term performance and group work. The coefficient between end-term performance and personality test 1 is 0.720, significant at the 0.01 level, which shows a significant positive correlation. The coefficient between end-term performance and personality test 2 is 0.870, also significant at the 0.01 level, again showing a significant positive correlation.

9.3 Linear Regression Analysis


Linear regression is used to analyze the influence of attendance scores, homework (task) scores, and interaction scores on the end-term scores (Table 3).

Table 3. Linear regression analysis

Results of linear regression analysis (n = 43); B: non-standardized coefficient, Beta: standardized coefficient

                   B        Standard error  Beta    t       p        VIF
Constant           157.464  58.028          -       2.714   0.010**  -
Attendance         -19.904  11.398          -0.256  -1.746  0.089    1.025
Task score         0.498    0.207           0.356   2.404   0.021*   1.043
Interaction score  0.346    0.704           0.072   0.491   0.626    1.028

R2 = 0.182, adjusted R2 = 0.119, F(3,39) = 2.898, p = 0.047
Dependent variable: end-term performance; D-W value: 2.570; * p < 0.05, ** p < 0.01

It can be seen from the table that the attendance score, homework score, and interaction score are used as independent variables, with end-term performance as the dependent variable for the linear regression analysis. As can be seen from the

table, the model formula is: end-term performance = 157.464 − 19.904 × attendance score + 0.498 × homework score + 0.346 × interaction score, and the model R-squared value is 0.182. This means that attendance, homework, and interaction scores explain 18.2% of the variation in end-term performance. The model passes the F test (F = 2.898, p = 0.047 < 0.05), which means that at least one of the attendance, homework, and interaction scores has an impact on end-term performance. The regression coefficient of the attendance score is −19.904 (t = −1.746, p = 0.089 > 0.05), which means that the attendance score does not affect end-term performance. The regression coefficient of the homework score is 0.498 (t = 2.404, p = 0.021 < 0.05), which means that the homework score has a significant positive influence on end-term performance. The regression coefficient of the interaction score is 0.346 (t = 0.491, p = 0.626 > 0.05), which means that the interaction score does not affect end-term performance.
In summary, homework grades have a significant positive impact on end-term performance, whereas attendance scores and interaction scores do not affect it.
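
A minimal sketch of this kind of regression with statsmodels is given below; the data frame holds invented sample values, not the study's data, and the printed summary reports the same quantities as Table 3 (coefficients, t, p, R-squared, F, Durbin-Watson).

import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    'attendance':  [4.8, 5.0, 4.5, 5.0, 4.9, 4.7],   # assumed sample values
    'task':        [85, 90, 70, 95, 88, 75],
    'interaction': [7, 9, 5, 10, 8, 6],
    'endterm':     [72, 80, 65, 85, 78, 68],
})

# Ordinary least squares with an intercept term.
X = sm.add_constant(df[['attendance', 'task', 'interaction']])
model = sm.OLS(df['endterm'], X).fit()
print(model.summary())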

10 Conclusion

This paper uses normal distribution analysis, Pearson correlation analysis, and linear regression analysis to analyze the usual grades and end-term performance. The results show that items with personalized guidance in the teaching process, such as the usual homework and the final personality tests, have a significant impact on the teaching effect. From the self-learning process data, it can be seen that students complete the self-learning content, but the lack of assessment in teaching management makes it impossible to determine the effect of their self-learning. Therefore, diversified assessment of independent learning content should be added to the teaching process to further enrich the data and complete the precision teaching design.

Acknowledgments. This paper is a research result of the Heilongjiang Province Education Science Planning 2020 Key Project, “Research and practice on the construction of ‘gold courses’ for computer majors based on achievement orientation” (Subject Approval No. GJB1320330).

References
1. Griffin, C.P., Murtagh, L.: Increasing the sight vocabulary and reading fluency of children
requiring reading support: the use of a precision teaching approach. Educ. Psychol. Pract. 2,
186–209 (2015)
2. Binder, C., Watkins, C.L.: Precision teaching and direct instruction: measurably superior
instructional technology in schools. Perform. Improv. Q. 3, 74–96 (1990)
3. Athabasca University: Precision teaching: concept definition and guiding principles. https://psych.athabascau.ca/open/lindsley/concept.php
4. Downer, A.C.: The national literacy strategy sight recognition programme implemented by
teaching assistants: a precision teaching approach. Educ. Psychol. Pract. 2, 129–143 (2007)

5. Gallagher, E.: Improving a mathematical key skill using precision teaching. Irish Educ. Stud.
3, 303–319 (2006)
6. Strømgren, B., Berg-Mortensen, C., Tangen, L.: The use of precision teaching to teach basic
math facts. Eur. J. Behav. Anal. 2, 225–240 (2014)
7. Zhu, Z.T., Peng, H.C.: Efficient knowledge teaching supported by information technology:
energizing precision teaching. China’s Audio-Vis. Educ. 01, 18–25 (2016)
8. Liang, M.F.: Probing into Precision Teaching. Basic Educ. 06, 4–7 (2016)
9. Zhang, L.Z.: A study on the accurate teaching model in the flipped classroom. J. Wuhan
Metall. Manage. Cadre Coll. 2, 50–52 (2016)
10. Zheng, Y.W., Chen, H.X., Bai, Y.H.: An experimental study on the accuracy of students' attention in classroom teaching based on big data. Mod. Educ. Sci. 2, 54–57 (2016)
An Improved Clustering Routing Algorithm
Based on Leach

Jintao Yu(B) and Yu Bai

School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, Heilongjiang, China
hityjt@sina.com

Abstract. Energy consumption has been an important focus of wireless sensor network research in recent years, and node energy consumption can be effectively reduced by optimizing routing algorithms. Aiming at the random and uneven clustering of the LEACH algorithm, which leads to unbalanced energy consumption, a uniform-clustering routing algorithm is proposed. The network is evenly clustered, reasonable cluster heads are selected by competition in the clustering stage, and the data transmission path is optimized. A polling control mechanism is introduced into intra-cluster communication during the data communication stage, which combines single-hop and multi-hop transmission. Simulation results show that the algorithm can effectively reduce network energy consumption, extend network lifetime, and improve throughput.

Keywords: Wireless sensor network · LEACH · Cluster routing algorithm · Network lifetime

1 Introduction
In an era of information and intelligence, the application value and potential of Wireless Sensor Networks (WSNs) have attracted the attention of academia and industry, and routing algorithms have become an important research topic for WSNs. Normally, wireless sensor network nodes are randomly distributed in harsh environments that may even be difficult to enter. Exhausted node energy cannot be replenished, and the failure of some nodes can cause the network to collapse, so energy is an important factor affecting the further development of wireless sensor networks. Among the several modules of a node, energy consumption is mainly concentrated in the communication module. Therefore, it is important to study efficient, energy-saving routing algorithms that extend the network life cycle [1].
According to the network topology, routing algorithms are mainly divided into flat and hierarchical types. The hierarchical routing algorithm is also called the cluster routing algorithm: the entire network is divided into multiple clusters, each cluster has exactly one backbone node, and the remaining nodes are ordinary nodes. Compared with flat routing algorithms, the network lifetime of clustering routing algorithms is longer, and they have become the most widely used routing algorithms [2].


The LEACH algorithm [3] is the earliest cluster routing algorithm and is of great significance among hierarchical routing algorithms. LEACH adopts periodic rotation of cluster heads to avoid the premature failure caused by the excessive energy loss of fixed cluster heads. However, once a node with low remaining energy, or a node far from the base station, serves as a cluster head, its death is accelerated. The algorithm proposed in [4] improves the LEACH threshold calculation formula and selects cluster heads by comprehensively considering node energy and surrounding information, but uneven clustering may still occur. In [5], a cost function is calculated from node information to determine the cluster head scientifically. In [6], the particle swarm algorithm divides the network area, clustering first and then selecting cluster heads. The literature [7] proposes KUCR, which uses a clustering algorithm to cluster evenly and prolongs the network life cycle, but does not consider later communication consumption. The literature [8] introduces multi-hop communication based on non-uniform clustering, effectively balancing network energy consumption. The algorithm in [9] proposes a communication method that combines single-hop and multi-hop transmission, which reduces energy loss. So far, most optimized routing algorithms proposed by researchers are based on the basic idea of the LEACH algorithm; the common goal is to enhance the energy utilization of nodes and extend the life cycle of the network.

2 System Model
2.1 Network Model
The network model of this article [10] has the following characteristics:

(1) All nodes have a certain network identification (ID).


(2) The network nodes are randomly deployed and will not move after deployment.
(3) All sensor nodes are homogeneous and have limited energy, so all sensor nodes
have the same properties.
(4) All nodes can adjust the transmission power according to the communication
distance.
(5) All nodes can transmit data with the base station.

2.2 Energy Consumption Model


The model selected here is the first-order radio energy consumption model [11]. The distance between the sender and the receiver determines the energy consumed when a node transmits data. When a sender transmits k bits of data over a distance d, the energy it consumes is:

$$E_{TX}(k,d)=\begin{cases}kE_{elec}+k\varepsilon_{fs}d^{2}, & d<d_{0}\\ kE_{elec}+k\varepsilon_{mp}d^{4}, & d\geq d_{0}\end{cases}\qquad(1)$$

$$d_{0}=\sqrt{\varepsilon_{fs}/\varepsilon_{mp}}\qquad(2)$$

where $E_{elec}$ is the energy consumed per bit of data received or sent by the node, $\varepsilon_{fs}$ and $\varepsilon_{mp}$ are the power amplification energy loss coefficients in the free space model and the multipath attenuation model, respectively, and $d_0$ is the transmission distance threshold.
The energy required by the receiver to receive k bits of data is:

$$E_{RX}(k,d)=kE_{elec}\qquad(3)$$

The energy required for a node to fuse k bits of data is:

$$E_{DA}=kE_{da}\qquad(4)$$
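
A minimal sketch of this radio model in Python follows, using the parameter values that appear later in the simulation (Table 1); the data-fusion energy constant is an assumption, since the text does not give a numeric value for it.

import math

E_ELEC = 50e-9       # J/bit, electronics energy per bit (from Table 1)
EPS_FS = 10e-12      # J/bit/m^2, free-space amplifier coefficient (from Table 1)
EPS_MP = 0.0013e-12  # J/bit/m^4, multipath amplifier coefficient (from Table 1)
E_DA = 5e-9          # J/bit, data-fusion energy (assumed value, not in the text)

D0 = math.sqrt(EPS_FS / EPS_MP)  # distance threshold, Eq. (2)

def e_tx(k_bits, d):
    """Energy to transmit k bits over distance d, Eq. (1)."""
    if d < D0:
        return k_bits * E_ELEC + k_bits * EPS_FS * d ** 2
    return k_bits * E_ELEC + k_bits * EPS_MP * d ** 4

def e_rx(k_bits):
    """Energy to receive k bits, Eq. (3)."""
    return k_bits * E_ELEC

def e_da(k_bits):
    """Energy to fuse k bits, Eq. (4)."""
    return k_bits * E_DA

print(f'd0 = {D0:.1f} m, E_tx(4000 bits, 60 m) = {e_tx(4000, 60):.2e} J')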

3 Improved Cluster Routing Algorithm

It is very important to design a reasonable routing algorithm to balance the network load in a WSN. The improved algorithm first clusters the network, then selects the cluster heads, and finally performs data communication. This paper designs an energy-saving clustering algorithm from the following three aspects: uniform network clustering, cluster head selection, and data communication. The research framework of this article is shown in Fig. 1.

Network Clustering Process Data Communication Process


The improved K-means
Communication between
algorithm clustering the
cluster
network evenly

Communication within the


A competitive neural cluster
network model is used to
select cluster heads

Fig. 1. The research framework.

3.1 Network Clustering Process

Network Evenly Clustering. There are still many problems to be solved in wireless sensor networks, among which prolonging network life is an important one that can be achieved by reducing energy consumption. In recent years, researchers have experimentally demonstrated that clustering algorithms can improve wireless sensor network survival time. Clustering is the process of classifying objects according to

certain criteria [12]. As a typical clustering algorithm, K-Means produces uniform clusters. The random clustering of the traditional LEACH algorithm may produce very large and very small clusters, and the density of cluster heads is also affected. Therefore, a clustering algorithm is introduced into the clustering stage of the LEACH algorithm.

Since nodes are deployed randomly in real-life wireless sensor networks, there will inevitably be some deviating nodes, which affect the clustering results of the K-means algorithm. To prevent these isolated nodes from degrading the clustering effect, the concept of a truncated mean is introduced here. First, a distance range is set around the cluster center; if a node is not within this range, it is regarded as an isolated node. To keep it from affecting the judgment of the cluster center, the algorithm temporarily removes it when calculating the cluster center. In the LEACH algorithm, the overall energy consumption of the network is proportional both to the sum of the squared distances from the cluster head to the nodes in the cluster and to the fourth power of the distance from the cluster head to the base station [13]. Correspondingly, the standard measure function is set here as the sum of the squared Euclidean distances between in-cluster nodes and the cluster center, plus the fourth power of the Euclidean distance between the cluster center and the base station, so as to minimize network energy consumption. The improved standard measure function is as follows:
$$E=\sum_{i=1}^{k}\Big(\sum_{p\in C_{i}}\varepsilon_{fs}\,D_{i}^{2}(p,\mu_{i})+\varepsilon_{mp}\,D_{i}^{4}(BS,\mu_{i})\Big)\qquad(5)$$

where $BS$ is the base station, $\mu_i$ is the cluster center of cluster $i$, $p$ is a member node of the cluster, $D_i^2(p,\mu_i)$ is the square of the distance between node $p$ and cluster center $\mu_i$, and $D_i^4(BS,\mu_i)$ is the fourth power of the distance from cluster center $\mu_i$ to the base station.
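
A hedged sketch of evaluating Eq. (5) in Python is given below; the cluster assignments, center coordinates, and base-station position are assumed inputs produced by the (not shown) modified K-means step.

import numpy as np

EPS_FS = 10e-12      # free-space coefficient (from Table 1)
EPS_MP = 0.0013e-12  # multipath coefficient (from Table 1)

def measure(clusters, centers, bs):
    """Standard measure function E of Eq. (5).
    clusters: list of (n_i, 2) coordinate arrays, one per cluster;
    centers:  (k, 2) array of cluster centers; bs: (2,) base-station position."""
    total = 0.0
    for pts, mu in zip(clusters, centers):
        d2 = np.sum((pts - mu) ** 2, axis=1)   # squared node-to-center distances
        d_bs4 = np.sum((bs - mu) ** 2) ** 2    # fourth power of center-to-BS distance
        total += EPS_FS * d2.sum() + EPS_MP * d_bs4
    return total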

Determination of the Number of Cluster Heads. In the clustering routing algorithms of wireless sensor networks, an important parameter to determine before clustering is the number of cluster heads. The proportion of cluster heads directly affects the utilization of network resources, so an appropriate number of cluster heads distributes them more evenly. Among all nodes, the cluster head plays an important role: within a certain range, the data perceived by nearby nodes are similar and repetitive, and the cluster head is responsible for data fusion and forwarding within the cluster, which significantly decreases data transmission in the network. As a result, the cluster head loses more energy and is more likely to die. If there are too few cluster heads, the backbone nodes die prematurely under the heavy burden; if there are too many, the network's energy consumption increases and the network life cycle is shortened. Selecting a reasonable number of cluster heads helps to cluster evenly and balance the node load.
The energy consumed by a cluster head node is:

$$E_{CH}=LE_{elec}\Big(\frac{N}{K}-1\Big)+LE_{DA}\frac{N}{K}+LE_{elec}+L\varepsilon_{mp}d_{toBS}^{4}\qquad(6)$$

The energy consumed by a non-cluster-head node is:

$$E_{non\text{-}CH}=LE_{elec}+L\varepsilon_{fs}d_{toCH}^{2}\qquad(7)$$

N sensor nodes are distributed in an area of M × M. Assuming that the network is uniformly clustered and each cluster is an ideal circle, the area of each cluster is $M^{2}/K$, the radius of the circle is $R=M/\sqrt{\pi K}$, and the node density $\rho$ is $K/M^{2}$. Then the following formula can be obtained:

$$E\big[d_{toCH}^{2}\big]=\int_{0}^{2\pi}\!\int_{0}^{M/\sqrt{\pi K}} r^{2}\,\rho(r,\theta)\,r\,dr\,d\theta=\frac{K}{M^{2}}\int_{0}^{2\pi}\!\int_{0}^{M/\sqrt{\pi K}} r^{3}\,dr\,d\theta=\frac{M^{2}}{2\pi K}\qquad(8)$$
The energy consumption of the whole network is as follows:

$$E_{net}=KE_{CH}+(N-K)E_{non\text{-}CH}=L\Big(2NE_{elec}+NE_{DA}+K\varepsilon_{mp}d_{toBS}^{4}+N\varepsilon_{fs}\frac{M^{2}}{2\pi K}\Big)\qquad(9)$$

Setting the derivative of the above equation with respect to K to zero yields the optimal number of cluster heads K, as shown in the following formula:

$$K=\sqrt{\frac{N\varepsilon_{fs}}{2\pi\varepsilon_{mp}}}\cdot\frac{M}{d_{toBS}^{2}}\qquad(10)$$
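
As a quick numeric check of Eq. (10), the sketch below evaluates K for the simulation settings used later (N = 200 nodes, M = 100 m field); the base-station distance of 100 m is an assumed example value.

import math

def optimal_k(n_nodes, m_side, d_to_bs, eps_fs=10e-12, eps_mp=0.0013e-12):
    """Optimal number of cluster heads, Eq. (10)."""
    return math.sqrt(n_nodes * eps_fs / (2 * math.pi * eps_mp)) * m_side / d_to_bs ** 2

print(round(optimal_k(200, 100, d_to_bs=100)))  # ~5 cluster heads for an assumed 100 m BS distance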

3.2 Cluster Head Election

The choice of the cluster head node is the core of the clustering routing algorithm; the quality of the cluster heads directly determines the prospects of the entire network. The LEACH algorithm rotates cluster heads randomly, so the energy of a selected node may not be enough for it to complete its heavy tasks. This accelerates the death of the selected node, which harms the operation of the whole network and may even cause it to collapse. When competing for cluster head within a cluster, multiple factors should be considered comprehensively to ensure that the selected cluster heads can sustain the normal operation of the network. The cluster head that collects and merges data carries a relatively large load, whereas the load of the ordinary nodes in the cluster is much smaller. Cluster head selection can therefore use the Hamming network algorithm from neural networks to balance the network load. During the cluster head campaign, each node takes its relative remaining energy, its distance from the base station, the number of times it has served as cluster head, and the number of its adjacent nodes as the input of the competitive network. After continuous learning of the neural network, a node with more residual energy, nearer the base station, and with more neighbor nodes finally becomes the cluster head. The relative remaining energy of a sensor node is calculated as follows:
$$E=\frac{E_{c}}{E_{m}}\qquad(11)$$

where $E_c$ is the current energy of the node and $E_m$ is its initial energy.
The specific process of network learning is as follows:
(1) Set network parameters and variables
Take $X(n)=[x_{1}(n),x_{2}(n),\ldots,x_{N}(n)]^{T}$ as the input vector of the network, where $X(n)$ describes a sensor node participating in the election; the elements of $X_i(n)$ respectively represent the node's distance to the base station, its relative remaining energy, its number of neighboring nodes, and the number of times it has served as cluster head.
Let $W_{i}^{1}(n)=[w_{i1}^{1}(n),w_{i2}^{1}(n),\ldots,w_{iN}^{1}(n)]^{T}$ be the weight vector of the forward subnetwork, and $w_{kl}^{2}$ the weight of the competitive subnetwork.
The update formula of the weight vector is:

$$\omega_{i}^{1}(n_{1}+1)=\omega_{i}^{1}(n)+\eta\big[X_{i}(n)-\omega_{i}^{1}(n)\big]\qquad(12)$$

where $\eta$ is the adaptive learning rate, $0<\eta<1$, and $n_1$ is the number of iterations of the competitive subnetwork.
Take $Y(n)=[y_{1}(n),y_{2}(n),\ldots,y_{N}(n)]^{T}$ as the output vector.
(2) Initialization
Initializing the forward subnetwork weights is a random assignment satisfying the following formula:

$$\sum_{j=1}^{N}w_{ij}^{1}=1\qquad(13)$$

The weights of the initialized competitive subnetwork are given by the following formula, where k and l index the k-th and l-th sensor nodes in the network:

$$w_{kl}^{2}=\begin{cases}1, & k=l\\ -\delta, & k\neq l\end{cases}\qquad(14)$$
The output function of the forward subnetwork neurons is piecewise linear:

$$y=\begin{cases}0, & x<\zeta\\ x-\zeta, & \zeta<x<\mu_{c}\\ C, & x>\mu_{c}\end{cases}\qquad(15)$$
(3) Select training samples
After the sensor nodes are deployed, each node sends its relevant parameter information to the base station; this information forms the input vectors of the training samples.
(4) The output of the forward subnetwork is:

$$y_{k}(0)=V_{i}^{1}=f_{1}\Big(\sum_{j=1}^{N}w_{ij}^{1}x_{j}\Big),\quad i=1,2,\ldots,M\qquad(16)$$

(5) The iterative process of the competitive subnetwork is:

$$y_{k}(n_{1}+1)=f_{2}\Big(y_{k}(n_{1})-\delta\sum_{l\neq k}y_{l}(n_{1})\Big),\quad k=1,2,\ldots,M\qquad(17)$$

(6) Judge the output of the competitive subnetwork
If the output of the competitive subnetwork meets the stopping condition, proceed to Step (7); otherwise go to Step (5) for the next iteration, with $n_1 = n_1 + 1$.
(7) Select the winning neuron
The neuron with the largest output in the competitive network is the winner, that is, the cluster head node for this round.
(8) Update the weight vector
The update formula is: $\omega_i^1(n_1+1)=\omega_i^1(n)+\eta[X_i(n)-\omega_i^1(n)]$.
(9) Judge the number of training rounds
If the number of training rounds meets the condition, training ends; otherwise, continue to train the network.
After the above cluster head selection process, the algorithm enters the next stage.
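
The core of the competitive stage is a winner-take-all iteration. Below is a schematic sketch of Eqs. (14)-(17) in Python: each candidate's output is suppressed by its rivals until a single winner remains. The initial scores and the inhibition constant delta are assumed example values, not parameters given in the paper.

import numpy as np

def select_cluster_head(scores, delta=0.05, max_iter=100):
    """MAXNET-style competition: scores are the forward-subnetwork outputs,
    one per candidate node; returns the index of the winning node."""
    y = np.asarray(scores, dtype=float)
    for _ in range(max_iter):
        # Eq. (17): subtract delta times the sum of the other outputs, clip at 0.
        y_new = np.maximum(0.0, y - delta * (y.sum() - y))
        if np.count_nonzero(y_new) <= 1:   # a single winner survives
            return int(np.argmax(y_new))
        y = y_new
    return int(np.argmax(y))

# Candidates scored by residual energy, closeness to the BS, neighbours, etc.
print(select_cluster_head([0.62, 0.80, 0.55, 0.71]))  # -> index of the winner (1)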

3.3 Data Transmission Stage


Improvement of the Intra-cluster Communication Mode. In many WSN clustering routing algorithms, time division multiple access (TDMA) is used for data communication [14]. After network clustering is completed, the nodes in a cluster are awakened in their allocated TDMA time slots. A large amount of the data collected by sensor nodes is redundant and repetitive; when the cluster head collects repeated node data, it first performs data fusion and then transmits the result to the base station. Because the cluster head establishes the TDMA schedule, a node still occupies its slot even when no valid data is collected during its working period, which may waste time slots. In response to this problem, a polling mechanism is used for intra-cluster communication [15]. In each cluster, the cluster head sets up a polling table based on the information of the nodes in the cluster. The polling table records the correspondence between the IDs of the in-cluster nodes and the polling sequence, and each node is served according to the table. When a node's energy is exhausted, or a node is in a dormant state and unable to transmit data, the cluster head removes it from the polling table. This avoids wasted time slots, reduces unnecessary network energy loss, and improves network performance. If a member node is close to the sink node, it communicates with the sink directly; otherwise, the member node transmits its data to the cluster head, which fuses the data and sends it to the sink node. This reduces the pressure on the cluster head and saves energy in the network.
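
A minimal sketch of such a polling table follows; the node-state hook and the bookkeeping details are assumptions, since the paper describes the mechanism only at the protocol level.

from collections import deque

class ClusterHead:
    def __init__(self, member_ids):
        self.polling_table = deque(member_ids)  # node IDs in polling order

    def poll_round(self, get_state):
        """get_state(node_id) -> 'active', 'dormant', or 'dead' (assumed hook)."""
        served = []
        for _ in range(len(self.polling_table)):
            node = self.polling_table.popleft()
            if get_state(node) in ('dead', 'dormant'):
                continue                     # drop the node from the polling table
            served.append(node)              # collect this node's data in its turn
            self.polling_table.append(node)  # keep live nodes for the next round
        return served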

Improvement of Communication Between Clusters. Many algorithms based on the idea of clustering balance the energy consumption within clusters but ignore the energy consumption between clusters. In the LEACH algorithm, data is transmitted by single-hop communication, so the farther a cluster head is from the base station, the more energy it consumes and the faster it dies, resulting in partial network collapse. This communication mode increases the energy consumption of the cluster heads and lacks consideration of the energy
consumption of the whole communication network. At the same time, the transmission rate of single-hop communication decreases, and the network's lifetime is inevitably shortened. All nodes in the M-Leach protocol [16] use multi-hop communication to transmit data; although the energy consumption of distant nodes is improved, the power consumption of nodes close to the base station increases under this mode.
Therefore, the improved algorithm in this paper uses both single-hop and multi-hop communication to send data to the base station. First, the paper compares the energy required to transmit data in the single-hop and multi-hop communication modes. Assuming that the node's transmission energy follows the free space channel model, a cluster head at a fixed distance of $n\times r$ from the base station can reach it through hops of length r, i.e., via n − 1 relay nodes [17]. The energy consumption of single-hop communication for transmitting k bits by the cluster head is:

$$E_{dir}=E_{TX}(k,n\times r)=kE_{elec}+k\varepsilon_{fs}(nr)^{2}=k\big(E_{elec}+\varepsilon_{fs}n^{2}r^{2}\big)\qquad(18)$$

The energy consumption of multi-hop communication is:

$$E_{mul}=nE_{TX}(k,r)+(n-1)E_{RX}(k)=nk\big(E_{elec}+\varepsilon_{fs}r^{2}\big)+(n-1)kE_{elec}=k\big((2n-1)E_{elec}+\varepsilon_{fs}nr^{2}\big)\qquad(19)$$
If the energy consumption of multi-hop transmission is less than that of single-hop transmission, that is:

$$k\big((2n-1)E_{elec}+\varepsilon_{fs}nr^{2}\big)<k\big(E_{elec}+\varepsilon_{fs}n^{2}r^{2}\big)\qquad(20)$$

then from the above formula we can get:

$$r>\sqrt{\frac{2E_{elec}}{n\varepsilon_{fs}}}\qquad(21)$$

Since $E_{elec}=50$ nJ/bit and $\varepsilon_{fs}=10$ pJ/(bit·m²) are known, when n takes its minimum value of 2, the threshold is r > 70 m; beyond this distance, multi-hop transmission consumes less energy.
Based on the distance from the cluster head to the base station, the data transmission mode is selected by this threshold: if the distance is greater than 70 m, the cluster head uses multi-hop communication; otherwise, it uses single-hop communication. This saves energy in the network communication process.
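
The sketch below verifies the threshold of Eqs. (18)-(21) numerically: with $E_{elec}=50$ nJ/bit and $\varepsilon_{fs}=10$ pJ/(bit·m²), the per-hop threshold for n = 2 hops is $\sqrt{2E_{elec}/(n\varepsilon_{fs})}\approx 70.7$ m. The mode-selection helper is an illustrative sketch, not code from the paper.

import math

E_ELEC, EPS_FS = 50e-9, 10e-12

def hop_threshold(n_hops=2):
    """Distance threshold of Eq. (21)."""
    return math.sqrt(2 * E_ELEC / (n_hops * EPS_FS))

def choose_mode(d_to_bs):
    """Pick the transmission mode from the cluster-head-to-BS distance."""
    return 'multi-hop' if d_to_bs > hop_threshold() else 'single-hop'

print(f'{hop_threshold():.1f} m, 90 m -> {choose_mode(90)}')  # 70.7 m, multi-hop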

4 Simulation Results and Analysis


4.1 Simulation Parameters
In this paper, the proposed algorithm and the two comparison algorithms are simulated on the Matlab R2016b platform. The parameters of the simulation environment are shown in Table 1:

Table 1. The parameters of the simulation.

Parameter                Value
Sensor field             100 m × 100 m
Number of nodes          200
Packet size              500 bytes
Initial energy of nodes  0.5 J
Eelec                    50 nJ/bit
εfs                      10 pJ/(bit·m²)
εmp                      0.0013 pJ/(bit·m⁴)

4.2 Network Life Cycle


The changes in surviving nodes for the three routing algorithms are shown in Fig. 2. Due to uneven node energy consumption, the LEACH algorithm causes node death earliest: node deaths begin at round 455, 626, and 718 for the LEACH algorithm, the KUCR algorithm, and the improved LEACH algorithm, respectively. Compared with the LEACH and KUCR algorithms, the improved algorithm in this paper is better by 58% and 15%, respectively. This is mainly because the improved algorithm clusters the network evenly, considers multiple factors in cluster head election, and improves the data communication method. Compared with LEACH and KUCR, the improved algorithm significantly extends the network life cycle.

Fig. 2. Changes of surviving nodes.



4.3 The Remaining Energy of the Network

As the number of simulation rounds increases, the change in the remaining energy of the whole network is shown in Fig. 3. For the same amount of energy consumed, the algorithm in this paper runs more rounds than the LEACH and KUCR algorithms. This is because the data transmission stage consumes a lot of energy, and the proposed algorithm improves the data communication method and saves energy. In the clustering stage, the clustering algorithm avoids extremely large and extremely small clusters. Therefore, the energy consumption of the nodes in the network is effectively balanced, and the energy utilization rate is improved.

Fig. 3. Remaining energy of the network.

4.4 Network Throughput

Throughput is an important reference index for measuring network performance, so the experiment simulates the throughput of the three algorithms. The change curve of network throughput over simulation time is shown in Fig. 4. The network throughput of the proposed algorithm is significantly higher than that of the two comparison algorithms. This is due to the introduction of the polling mechanism into the intra-cluster data communication phase, which effectively utilizes the network's time slots.

Fig. 4. Network throughput.

5 Conclusion
This paper proposes an improved algorithm to solve the problems existing in LEACH. Network clustering uses an improved K-means-based algorithm to make the clusters uniform. In the cluster head selection process, various factors are considered comprehensively, which is more conducive to balancing network energy consumption. In data communication, the data transmission mode is changed, and the energy utilization rate is improved. Simulation results show that, compared with the other two algorithms, the proposed algorithm alleviates premature node death, effectively balances network energy consumption, improves network throughput, and prolongs the network life cycle.
The algorithm presented in this article was tested on a simulation platform. Due to limited experimental conditions, the nodes could not be deployed in a real environment, and unexpected factors of actual working environments were ignored. Future work will focus on this area and consider more practical factors to further improve the designed algorithm.

Acknowledgment. This work was supported by the Plateau Discipline Innovation Team Project
of the Harbin University of Commerce.

References
1. Wang, T.S., et al.: Genetic algorithm for energy-efficient clustering and routing in wireless
sensor networks. Syst. Softw. 146, 196–214 (2018)

2. Pan, J.Q., Feng, Y.Z.: Improved LEACH sensor network cluster routing algorithm. J. Jilin
Univ. (Sci. Edition) 56(6), 1476–1482 (2018)
3. Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication
protocol for wireless microsensor networks. In: Proceedings of the 33rd Annual Hawaii
International Conference on System Sciences, Hawaii, pp. 3005–3014. IEEE (2000)
4. Huang, L.X., et al.: Improved LEACH protocol algorithm based on energy balanced and
efficient WSN. J. Commun. 38(S2), 164–169 (2017)
5. Zhu, S.X., Ma, H.F., Sun, G.L.: An energy-efficient wireless sensor network improved LEACH
protocol. J. Harbin Univ. Sci. Technol. 26(3), 91–98 (2021)
6. Singh, S.P., Sharma, S.C.: An improved cluster-based routing algorithm for energy opti-
misation in wireless sensor networks. Int. J. Wireless Mobile Comput. 14(1), 82–89
(2018)
7. Zhang, Y.Q.: Research on uniform clustering routing algorithm for wireless sensor networks
based on K-means. Control Eng. 22(6), 1181–1185 (2015)
8. Senthilkumar, C., Manickam, J.P.: A path-aware clustering mechanism for energy-efficient
routing protocol in wireless sensor networks. J. Comput. Theor. Nanosci. 14(11), 5478–5483
(2017)
9. Li, Y.N., Xu, F.T., Chen, J.X.: Clustering optimization strategy for WSNs Based on LEACH.
Chinese J. Sens. Actuators 27(5), 670–674 (2014)
10. Wu, X.N., et al.: Clustering Routing protocol based on improved particle swarm optimization
algorithm in WSN. J. Commun. 40(12), 114–123 (2019)
11. Huang, W., Ling, Y., Zhou, W.: An improved LEACH routing algorithm for wireless sensor
network. Int. J. Wireless Inf. Netw. 25(3), 323–331 (2018). https://doi.org/10.1007/s10776-
018-0405-4
12. Bano, S., Khan, M.: A survey of data clustering methods. Int. J. Adv. Sci. Technol. 113,
133–142 (2018)
13. Zhang, X.L.: Improved IoT Energy Consumption Balance Routing Algorithm Based on
LEACH Protocol. Jilin University, Jilin (2016)
14. Al-Baz, A., El-Sayed, A.: A new algorithm for cluster head selection in LEACH protocol for
wireless sensor networks. Int. J. Commun. Syst. 31(1), 1–13 (2018)
15. Liu, L.J., et al.: Research on FPGA WSN polling access control protocol. J. Commun. 37(10),
181–187 (2016)
16. Sony, C.T., Sangeetha, C.P., Suriyake, C.D.: Multi-hop LEACH protocol with modified clus-
ter head selection and TDMA schedule for wireless sensor networks. In: Communication
Technologies (GCCT), pp. 539–543. IEEE (2015)
17. Wang, B., Fu, D.S.: Improvement of LEACH routing protocol for wireless sensor networks.
Instrum. Technique Sens. 8, 71–74 (2016)
Association Rules Mining Algorithm Based
on Information Gain Ratio Attribute Reduction

Tongtong Han1 , Wenjing Wang1 , Min Guo1 , and Shiyong Ning2(B)


1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,
Harbin 150028, China
101102@hrbcu.edu.cn

Abstract. In actual association rule mining, data sets collected from enterprises or real life often have problems such as large amounts of missing data or data redundancy, which greatly increases the spatial complexity of mining association rules and makes mining inefficient. Moreover, some actual data sets contain hundreds of attributes or more; not only does mining association rules take too long, but too many rules are obtained, making it difficult for users to distinguish the more valuable information in practical applications and hard to apply these data in actual enterprises for greater benefit. In response to these problems, this paper proposes an association rule algorithm based on FP-Growth with information gain ratio attribute reduction, to extract more valuable information and improve the efficiency of association rule mining. Finally, experiments and comparisons verify that the proposed algorithm can effectively mine association rule information from multi-attribute data sets.

Keywords: Fuzzy rough sets · Information gain ratio · Attribute reduction · Association rules · FP-Growth

1 Introduction
Association rules are a rule-based machine learning method used to discover valuable internal connections between transaction items in a data set. In 1993, Agrawal proposed the Apriori algorithm [1], an association rule algorithm aiming to discover interesting connections between products in customers' shopping baskets, thereby creating a way to find association rules. Association rules have since received extensive attention and been studied by many scholars. During this research, scholars discovered that the Apriori algorithm needs to scan the database multiple times before generating the complete set of frequent patterns, and many candidate sets are generated in the process, which makes the mining of association rules inefficient and difficult to use when the amount of data is large. In 2000, J. Han [2] proposed the FP-Growth algorithm, whose mining efficiency is very high and which solved some problems of the Apriori algorithm. Scholars have continued this research and put forward many


improved methods, but when the data set has many attributes, the execution efficiency of these algorithms becomes very low.
The appearance of rough set theory provides favorable conditions for mining association rules. Rough sets [3–6] are a theory proposed by Professor Z. Pawlak of the Warsaw University of Technology. Attribute reduction [7, 8] is one of the core contents of rough set theory; its role is to eliminate redundant attributes while keeping the nature of the original decision table unchanged, so as to obtain a reduced decision table. As a typical attribute reduction algorithm, information gain attribute reduction can reduce a data set's attributes well, but when the data set is too large, its effect is not ideal. In 2012, Jianhua Dai et al. proposed an attribute reduction algorithm based on the information gain ratio [9], which removes many attributes when the data set is large and greatly improves reduction efficiency.
In the development of association rules research, no scholar has combined attribute reduction with the FP-Growth algorithm. Therefore, to deal with the problems of association rule data sets having too many attributes and of low algorithm execution efficiency, this paper preprocesses the data by applying information gain ratio attribute reduction to the data set, cleaning out the attributes that are irrelevant or unimportant to the research content, and then combines this with the FP-Growth algorithm to realize efficient and accurate mining of association rules. This is a new research direction and provides new research ideas.

2 About Attribute Reduction


2.1 Basic Knowledge

Definition 1: [7] Information system: an ordered four-tuple IS = {U, R, V, f}, where U is the set of objects and R is the set of all attributes of the objects. $V=\bigcup_{r\in R}V_{r}$, where $V_r$ is the value domain of attribute r. f: U × R → V is the information function that assigns each object a specific value from the attribute domain.
Definition 2: [7] Decision system: an information system with decision attributes, i.e., DS = {U, C ∪ D, V, f} with C ∩ D = ∅, where U is a finite non-empty universe, C = {c1, c2, …, cm} is the set of conditional attributes, and D ≠ ∅ is the set of decision attributes.
Definition 3: [7] Equivalence relation: R is a binary relation on the set X; an equivalence relation R must satisfy the following three properties at the same time:
Reflexivity: ∀a ∈ X, the ordered pair (a, a) satisfies the relation R;
Transitivity: ∀d, e, f ∈ X, if (d, e) ∈ R and (e, f) ∈ R, then (d, f) ∈ R;
Symmetry: ∀d, e ∈ X, if (d, e) ∈ R, then (e, d) ∈ R.
Definition 4: [9, 11] The indiscernibility relation:

$$IND(B)=\{(x,y)\mid \forall a\in B,\ f(a,x)=f(a,y)\}\qquad(1)$$



Definition 5: [9] For any X ⊆ U, the set X can be characterized by the lower approximation $\underline{R}(X)$ and the upper approximation $\overline{R}(X)$, respectively:

$$\underline{R}(X)=\{x_{i}\mid [x_{i}]_{R}\subseteq X\}\qquad(2)$$

$$\overline{R}(X)=\{x_{i}\mid [x_{i}]_{R}\cap X\neq\emptyset\}\qquad(3)$$

Definition 6: [9, 12, 13] Let the information system IS = {U, R, V, f}, P ⊆ R, U/P = {X₁, X₂, …, Xₙ}. The Shannon entropy H(P) is:

$$H(P)=-\sum_{i=1}^{n}p(X_{i})\log p(X_{i})=-\sum_{i=1}^{n}\frac{|X_{i}|}{|U|}\log\frac{|X_{i}|}{|U|}\qquad(4)$$

Definition 7: [9, 12, 13] Let U/B = {X₁, X₂, …, Xₙ} and U/D = {Y₁, Y₂, …, Yₘ}. Taking the condition attribute subset B as the condition, the conditional entropy of the decision attribute D is:

$$H(D|B)=-\sum_{i=1}^{n}\sum_{j=1}^{m}\frac{|X_{i}\cap Y_{j}|}{|U|}\log\frac{|X_{i}\cap Y_{j}|}{|X_{i}|}\qquad(5)$$

2.2 Information Gain Ratio Attribute Reduction Algorithm

Information gain indicates the degree to which an attribute reduces information entropy; its calculation formula is given in Definition 8.
Definition 8: [9, 14] In the decision system DS, with B = ∅, the gain of attribute a is:

$$Gain(a,B,D)=H(D)-H(D|\{a\})=I(\{a\};D)\qquad(6)$$

Definition 9: [9, 15] When B = ∅, the information gain ratio of attribute a is:

$$GainRatio(a)=\frac{Gain(a,B,D)}{H(\{a\})}=\frac{I(\{a\};D)}{H(\{a\})}\qquad(7)$$
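
A minimal sketch of Eqs. (4)-(7) over a decision table held in a pandas DataFrame follows; the data layout and the sample table are assumptions for illustration.

import math
import pandas as pd

def entropy(df, cols):
    """Shannon entropy H over the partition U/cols, Eq. (4)."""
    freq = df.groupby(list(cols)).size() / len(df)
    return -sum(p * math.log2(p) for p in freq)

def cond_entropy(df, cols, decision):
    """Conditional entropy H(D|B), Eq. (5), via H(B ∪ D) - H(B)."""
    return entropy(df, list(cols) + [decision]) - entropy(df, cols)

def gain_ratio(df, attr, decision):
    """Information gain ratio of a single attribute, Eqs. (6)-(7)."""
    gain = entropy(df, [decision]) - cond_entropy(df, [attr], decision)
    h_a = entropy(df, [attr])
    return gain / h_a if h_a > 0 else 0.0

# Hypothetical toy table: Temp fully determines Class, so the ratio is 1.0.
df = pd.DataFrame({'Temp': ['hot', 'hot', 'cool', 'cool'],
                   'Class': ['fire', 'fire', 'not fire', 'not fire']})
print(gain_ratio(df, 'Temp', 'Class'))  # 1.0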
Therefore, the steps of the information gain ratio attribute reduction algorithm [9] are as follows:

(1) Calculate the relation matrix List1 of all attributes in the data set, and calculate the incidence matrix D of the decision attributes;
(2) B = ∅;
(3) Calculate the information gain;
(4) For each a ∈ C − B, calculate the gain ratio GainRatio of the conditional attribute a;
(5) Choose the attribute k that maximizes the gain ratio;
(6) If GainRatio > 0, then B ← B ∪ {k} and go to step (4); otherwise go to step (7);
(7) Return B, the reduced attribute set.

2.3 Information Gain Ratio Attribute Reduction Case


The data set in Table 1 comes from the Algerian forest fires data set [16] on the UCI website. It has 122 instances and contains 12 condition attributes plus a class label: day (D), month (M), temperature (T), relative humidity (RH), wind speed (Ws), rainfall (Rain), the fine fuel moisture code of the FWI system (FFMC, where FWI stands for Forest Fire Weather Index), the duff moisture code of the FWI system (DMC), the drought code of the FWI system (DC), the initial spread index of the FWI system (ISI), the buildup index of the FWI system (BUI), the fire weather index (FWI), and the class (C): "fire" or "not fire".

Table 1. A dataset of forest fires in Algeria

D M T RH Ws Rain FFMC DMC DC ISI BUI FWI C


1 6 29 57 18 0 65.7 3.4 7.6 1.3 3.4 0.5 0
2 6 29 61 13 1.3 64.4 4.1 7.6 1 3.9 0.4 0
3 6 26 82 22 13.1 47.1 2.5 7.1 0.3 2.7 0.1 0
4 6 25 89 13 2.5 28.6 1.3 6.9 0 1.7 0 0
5 6 27 77 16 0 64.8 3 14.2 1.2 3.9 0.5 0
… … … … … … … … … … … …

Using the information gain ratio attribute reduction algorithm, implemented in Python, the 9 attributes retained after reduction are: 'ISI', 'FFMC', 'RH', 'month', 'day', 'Ws', 'Temperature', 'Rain', and 'DC'. The reduction result is shown in Fig. 1.

3 Association Rules Related Definitions and Related Algorithms


3.1 Support
Support is one of the important evaluation indicators of association rules and is usually used to delete meaningless rules.
The support of X ⇒ Y is:

$$Support(X\Rightarrow Y)=P(X\cup Y)=\frac{Count(X\cup Y)}{|D|}\qquad(8)$$

The minimum importance of an association rule is represented by the minimum support threshold of the item set [17], denoted min_sup.

3.2 Confidence
For a given rule X ⇒ Y, the higher the confidence, the greater the probability that Y is contained in transactions that contain X. Confidence [18] indicates the strength of the association, or the reliability of the rule:

$$Confidence(X\Rightarrow Y)=P(Y|X)=\frac{P(XY)}{P(X)}=\frac{support(X\cup Y)}{support(X)}=\frac{count(X\cup Y)}{count(X)}\qquad(9)$$

Fig. 1. Data set after attribute reduction

The minimum confidence [18], denoted min_conf, represents the minimum confidence of the association rules.
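
A minimal sketch of Eqs. (8)-(9) over a list of transactions is given below; the sample transactions are invented for illustration.

def support(transactions, itemset):
    """Support of an itemset, Eq. (8): fraction of transactions containing it."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, x, y):
    """Confidence of the rule X => Y, Eq. (9)."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)

T = [{'a', 'b'}, {'a', 'c'}, {'a', 'b', 'c'}, {'b', 'c'}]  # assumed sample data
print(support(T, {'a', 'b'}))        # 0.5
print(confidence(T, {'a'}, {'b'}))   # 0.666...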

3.3 FP-Growth Algorithm

In 2000, Han Jiawei and others proposed the FP-Growth association rule algorithm [19]. The algorithm first generates an FP-tree for the data set and then continuously mines frequent itemsets from the generated FP-tree [20].
The FP-tree [20] is a special prefix tree structure in which each itemset and its frequency are stored along the prefix tree. Constructing an FP-tree requires building an item header table and traversing the data set twice. After constructing the FP-tree, frequent itemsets can be mined from the previously processed data set: first, obtain the conditional pattern base from the constructed FP-tree and build the conditional FP-tree; then repeat these two steps until the tree contains a single element item. Candidate association rules are then generated from the obtained frequent itemsets, the minimum confidence is set, and the candidate rules are further filtered to finally obtain the association rules.
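
As a hedged usage sketch, the snippet below mines frequent itemsets with an FP-Growth implementation and derives rules by minimum confidence. It assumes the third-party mlxtend library and uses invented sample transactions; it illustrates the workflow rather than the paper's own implementation.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth, association_rules

transactions = [['milk', 'bread'], ['milk', 'eggs'], ['bread', 'eggs', 'milk']]  # assumed data
te = TransactionEncoder()
df = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

frequent = fpgrowth(df, min_support=0.5, use_colnames=True)          # min_sup
rules = association_rules(frequent, metric='confidence', min_threshold=0.7)  # min_conf
print(rules[['antecedents', 'consequents', 'support', 'confidence']])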

4 FP-Growth Mining Algorithm Based on Information Gain Ratio Attribute Reduction

The FP-Growth mining algorithm based on information gain ratio attribute reduction [21] consists of two main steps:
Step 1: Use the information gain ratio attribute reduction algorithm to clean the data and remove redundant or unimportant attributes.

Step 2: Use the FP-Growth algorithm to extract valuable association rule information from the reduced data set, which improves mining efficiency and rule accuracy and helps users interpret the results.
Algorithm 1 Information gain ratio attribute reduction algorithm:

Input: The original data set Dataset.


Output: Data set B after attribute reduction.
(1) Calculate the relation matrix List1 of all attributes in the data set, including the decision attribute;
(2) Calculate the incidence matrix D of the decision attribute;
(3) B = ∅;
(4) Let a = C − B, and calculate GainRatio for each conditional attribute a;
(5) Choose the attribute k that maximizes GainRatio, and set B ← B ∪ {k};
(6) If GainRatio > 0, go to (4); otherwise go to (7);
(7) Output B, the reduced attribute set.
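
As a rough illustration of Algorithm 1 only, the sketch below greedily keeps attributes by information gain ratio computed from plain entropy estimates; it simplifies the paper's relation-matrix formulation, and all names and the stopping rule are assumptions:

    import math
    from collections import Counter

    def entropy(values):
        # Shannon entropy of a list of symbols.
        n = len(values)
        return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

    def gain_ratio(rows, attr, target):
        # Information gain of `attr` w.r.t. `target`, normalized by split information.
        base = entropy([r[target] for r in rows])
        n, cond, sizes = len(rows), 0.0, []
        for v in set(r[attr] for r in rows):
            part = [r[target] for r in rows if r[attr] == v]
            cond += len(part) / n * entropy(part)
            sizes.append(len(part))
        split_info = -sum(c / n * math.log2(c / n) for c in sizes)
        return (base - cond) / split_info if split_info > 0 else 0.0

    def reduce_attributes(rows, attrs, target):
        # Greedy reduction: repeatedly add the attribute with the largest
        # positive gain ratio; stop when no attribute improves.
        selected, remaining = [], list(attrs)
        while remaining:
            score, best = max((gain_ratio(rows, a, target), a) for a in remaining)
            if score <= 0:
                break
            selected.append(best)
            remaining.remove(best)
        return selected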

Algorithm 2 FP-Growth algorithm [19]:

Input: Data set B; min_sup.

Output: Complete set of frequent patterns.
(1) Traverse data set B once, calculate the support of each data item, and arrange the items in descending order;
(2) According to min_sup, remove the data items that do not meet the minimum support, obtaining List1;
(3) Create the FP-tree root node N and create the frequent-item header table;
(4) Traverse the data set a second time and build the FP-tree;
(5) Find the conditional pattern base of each item from top to bottom;
(6) Recursively process the tree structure, removing data items that do not meet min_sup;
(7) Get the complete set of frequent patterns.

After the two algorithms have run, candidate association rules are generated from the obtained frequent itemsets; the minimum confidence is set, and the candidate rules are further filtered to obtain the final association rules.

5 Experiment Results and Analysis


The experiment environment is a computer with an AMD Ryzen 7 5800H CPU, 16 GB RAM, and the Windows operating system, with Python as the implementation language.
The experimental data consists of three data sets: the Catalog Cross-Sell data set, and the Breast Cancer [22] and Bank Marketing [23] data sets from the UCI website.
The attribute reduction [24] was performed on the three data sets based on the
information gain ratio. See Table 2.

Table 2. Experimental data set and reduction results

Data set  Instances  Number of original attributes  Number of attributes after reduction


Catalog 4998 10 8
Breast 569 32 20
Bank 45211 17 10

The Catalog Cross-Sell data set comes from a multi-department catalog company. It is a random customer sample with 4998 rows and 10 attributes: customer number, clothing department, household goods department, health products department, automobile department, personal electronics department, computer department, gardening department, novelty gift department, and jewelry department. In the data set, "1" means that products were purchased from a certain catalog of the department, and "0" means that no purchases were made.
The algorithm proposed in this paper and the traditional FP-Growth algorithm were both used to mine association rules on the Catalog Cross-Sell data set, and their tree-construction time and mining time were compared. The results are shown in Fig. 2 and Fig. 3.
From Fig. 2, the traditional FP-Growth algorithm takes longer to build its tree than the proposed algorithm, and the gap between the two construction times widens as the data set grows. From Fig. 3, the mining time of the proposed algorithm is shorter, and the larger the data set, the more obvious its advantage.

[Bar chart: building time (s), 0–60, vs. actual number of examples (1000–5000), comparing FP-Growth with the algorithm proposed in this article.]

Fig. 2. Comparison of the construction time on the Catalog Cross-Sell data set

[Bar chart: mining time (s), 0–60, vs. actual number of examples (1000–5000), comparing FP-Growth with the algorithm proposed in this article.]

Fig. 3. Comparison of the mining time on the Catalog Cross-Sell data set

6 Conclusion
This article offers a useful research idea. Faced with data sets that have too many attributes, which inflates space complexity and makes mining association rules too slow, this paper proposes an association rule algorithm based on information gain ratio attribute selection. It improves the accuracy of the mined association rules, reduces space complexity, makes the obtained rules more targeted, easier for users to understand, and more useful for merchants, and greatly improves mining efficiency. The effectiveness of the algorithm is demonstrated through experiments. However, because the data set is static, changes in association rules on a growing data set cannot be mined in real time, and further improvement is needed in this respect. The next step is to extend the algorithm to dynamic association rule mining.

References
1. Rakesh, A., Tomasz, I., Arun, S.: Mining association rules between sets of items in large databases. ACM SIGMOD Rec. 22(2), 207–216 (1993)
2. Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proceedings
of the 2000 ACM SIGMOD International Conference on Management of Data, 16–18 May
(2000)
3. Pawlak, Z.: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers,
Dordrecht (1991)
4. Pawlak, Z.: A rough set perspective. Comput. Intell. 11, 227–232 (1995)
5. Pawlak, Z.: Rough set approach to knowledge-based decision support. Eur. J. Oper. Res. 99,
48–57 (1997)
6. Pawlak, Z.: Rough set theory and its applications to data analysis. Cybernet. Syst. Int. J. 29,
661–688 (1998)
7. Xun, J., Xu, L., Qi, L.: Association rules mining algorithm based on rough set. In: Proceedings
2012 IEEE International Symposium on Information Technology in Medicine and Education,
pp. 361−364 (2012)

8. Ma, J., Ge, Y., Pu, H.: Survey of attribute reduction methods. Data Anal. Knowl. Discovery
4(1), 40–50 (2020)
9. Dai, J., Xu, Q.: Attribute selection based on information gain ratio in fuzzy rough set theory
with application to tumor classification. Appl. Soft Comput. J. 13(1), 211–221 (2013)
10. Pawlak, Z.: Rough sets. Int. J. Comput. Inf. Sci. 11(5), 341–356 (1982)
11. Liu, D., Li, T., Miao, D.: Three-Way Decision and Granular Computing. Science Press, Beijing
(2013)
12. Hu, Q., Yu, D., Xie, Z., Liu, J.: Fuzzy probabilistic approximation spaces and their information
measures. IEEE Trans. Fuzzy Syst. 14, 191–201 (2006)
13. Lee, T.: An information-theoretic analysis of relational databases-part I: data dependencies
and information metric. IEEE Trans. Softw. Eng. 13, 1049–1061 (1987)
14. Miao, D., Hu, G.: A heuristic algorithm for knowledge reduction. Comput. Res. Dev. 36,
681–684 (1999)
15. Jia, P., Dai, J., Pan, Y., Zhu, M.: Novel algorithm for attribute reduction based on mutual-
information gain ratio. J. Zhejiang Univ. (Engineering Edition) 40, 1041–1044 (2005)
16. Zhang, Q., Yang, J., Yao, L.: Attribute reduction based on rough approximation set in algebra
and information views. IEEE Access 4, 5399–5407 (2016)
17. Wang, Q., Xie, F., Zhao, M.: Analysis of cross-selling model of digital tv package based on
data mining. Radio Telev. Inf. 12, 57–59 (2016)
18. Yue, S.: Research on Association Rule Mining Algorithm Based on Frequent Pattern Tree.
Tianjin Polytechnic University, Tianjin (2019)
19. Ranjith, K., Yang, Z., Caytiles, R., Iyengar, N.: Comparative analysis of association rule
mining algorithms for the distributed data. Int. J. Adv. Sci. Technol. 102, 49–60 (2017)
20. Ma, R., Wu, H.: Research and application of association rules mining based on FP_growth algorithm. J. Taiyuan Normal Univ. (Natural Science Edition) 20(01), 19–22 (2021)
21. Yang, Z., Geng, X.: Research on mining association rules based on multi-granularity attribute
reduction. Comput. Eng. Appl. 55(6), 133–139 (2019)
22. Nick, S., Wolberg, W., Mangasarian, O.: Nuclear feature extraction for breast tumor diagnosis.
In: Biomedical Image Processing and Biomedical Visualization, International Society for
Optics and Photonics, vol. 1905 (1993)
23. Moro, S., Cortez, P., Rita, P.: A data-driven approach to predict the success of bank
telemarketing. Decis. Support Syst. Elsevier 62, 22–31 (2014)
24. Lin, T.Y., Yin, P.: Heuristically fast finding of the shortest reducts. In: Tsumoto, S., Słowiński,
R., Komorowski, J., Grzymała-Busse, J.W. (eds.) RSCTC 2004. LNCS (LNAI), vol. 3066,
pp. 465–470. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25929-9_55
Improving VGG16 Based on Inception and ResNet Modules to Classify Commodity Images

Yuru Zhang1,2(B) and Yang Shen1,2


1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. With the advent of the new retail era, intelligent identification of the goods in shelf images has become an important technology for managing unmanned supermarkets. In recent years, classification methods based on convolutional neural networks have been widely used in image classification. In this paper, a deep neural network, VGG16-IR, is designed to improve the classification accuracy of low-resolution commodity images. First, the Inception and residual ideas are integrated, and four modules are designed: Inception ResNet module 1, Inception ResNet module 2, ResNet module 3, and ResNet module 4. These modules replace the convolution stages of the original VGG16, and the new network is named VGG16-IR. Secondly, a small-batch gradient descent algorithm with momentum is used. Finally, the ELU activation function is used to avoid the phenomenon of partial neuron necrosis. The network is trained and tested on the Microsoft PI100 product image data set. The experimental results show that the model's accuracy reaches 99.69% on the training set and 90.63% on the test set. Compared with the 89.10% of VGG16, the model in this paper improves by 1.5% and increases the generalization ability to a certain extent.

Keywords: VGG16 · Inception ResNet module · VGG16-IR · PI100 commodity image data set · SGDM optimizer

1 Introduction

At present, the unmanned supermarket has become a trend in the new economy. To realize the unmanned supermarket's functions of querying and updating commodity information, it is necessary to analyze commodity information with image processing technology and identify the commodity types in commodity images. Automatic acquisition of commodity categories has therefore become a hot research direction combining image processing technology with the retail industry. Traditional image classification algorithms are mainly divided into two stages: the first stage adopts hand-designed feature algorithms such as the SIFT [1–3] and HOG [1–3] feature operators to extract image features; in the second stage, support vector machines [4] (SVM) and


other classifiers are used for classification. Ke Xiao et al. [5] realized the classification and retrieval of flower images using multiple image features and a support vector machine. Liu Chenxi et al. [6] used multi-feature fusion to judge image saliency. Still, these shallow learning models cannot represent complex functions and cannot effectively describe the data itself.
In view of the problems of traditional image classification, deep convolutional neural networks have gradually come to dominate the image classification field. In 2014, the Visual Geometry Group of the University of Oxford proposed VGGNet [7], which, building on AlexNet [8], replaces the 7 × 7 convolution kernels in the original convolution layers with 3 × 3 kernels. Also in 2014, Google proposed GoogLeNet (Inception V1) [9], followed by several further Inception versions (V2–V4); Inception increased the width and depth of the whole network and replaced the final fully connected layer with a pooling layer, reducing the number of parameters. In 2015, He Kaiming's team published ResNet [10], which transmits the feature map of one layer to a later layer through residual connections, allowing very deep networks while maintaining a good classification effect.
To further improve the classification performance of VGG16 and solve the problem that the parameters of deep VGG16 layers are not updated effectively, this paper designs an improved network based on VGG16. In terms of model improvement, the advantages of the Inception and residual ideas are integrated, and four new modules are designed: Inception ResNet module 1, Inception ResNet module 2, ResNet module 3, and ResNet module 4. Compared with the original modules, these new modules contain convolution kernels with different receptive field sizes, a channel merging layer, and residual connections, so the acquisition, fusion, and transmission of image features in forward propagation are stronger. These modules then replace the convolution stages of the original VGG16, and the new network is named VGG16-IR, which solves the problem of VGG16 layer parameters not being updated. The model's generalization ability and classification accuracy are judged by comparing the test accuracy of VGG16 and VGG16-IR. In terms of algorithm selection, the small-batch gradient descent algorithm [11] is adopted to avoid traversing all training data in each update; batching reduces training time and improves the generalization ability of the model. At the same time, the momentum method is used to prevent the network from falling into a local minimum when the learning rate is low. ELU is chosen as the activation function to solve the problem that the ReLU activation function cannot update parameters in the negative half interval. In terms of experiments, all experiments are carried out on the Microsoft PI100 commodity images. VGG16 and VGG16-IR are each trained and tested 3 times with a training/test split of 8:2, and the training and test sets are re-sampled for each experiment. The loss value is reduced by 0.1093 and the test accuracy improved by 1.50%, showing that VGG16-IR is better than VGG16 in terms of both accuracy and loss.

2 Network Structure Design


2.1 Inception ResNet Module Design
Among existing image classification methods, GoogLeNet's Inception module adopts a series-parallel arrangement of convolutional layers, which can effectively connect low-level and high-level features. But when GoogLeNet's architecture becomes deep, the parameters cannot be updated in a timely and effective way [12]. The reasons are as follows:
Let $\frac{\partial J(W,b)}{\partial z_j^{(l)}}$ denote the gradient of the loss at the $j$th node of the $l$th layer, where $J(W,b)$ is the total loss function and $z_j^{(l)}$ is the output of the $j$th node in the $l$th layer. The gradient at the $l$th layer is shown in formula (1):

$$\sigma_j^{(l)} = \Bigg(\sum_{k=1}^{s_{l+1}} W_{kj}^{(l+1)}\,\sigma_k^{(l+1)}\Bigg) f'\big(z_j^{(l)}\big) \qquad (1)$$

Here $s_{l+1}$ is the number of nodes in layer $l+1$, $f(\cdot)$ is the activation function, and $W_{kj}^{(l+1)}$ is the weight parameter between the $k$th node in layer $l+1$ and the $j$th node in layer $l$. Expanding formula (1) down one more layer gives formula (2):

$$\sigma_j^{(l)} = \sum_{k=1}^{s_{l+1}} \sum_{i=1}^{s_{l+2}} W_{kj}^{(l+1)}\,W_{ik}^{(l+2)}\,\sigma_i^{(l+2)}\,f'\big(z_k^{(l+1)}\big)\,f'\big(z_j^{(l)}\big) \qquad (2)$$

It can be seen that the propagated error is a product of the factors $W_{kj}^{(l+1)}$, $W_{ik}^{(l+2)}$, $f'\big(z_k^{(l+1)}\big)$ and $f'\big(z_j^{(l)}\big)$. The gradient therefore easily vanishes or explodes, which affects the parameter learning of the layer. Therefore, the
to disappear or explode, which affects the parameter learning of this layer. Therefore, the
fitting and generalization ability of deep neural network is poor. To solve this problem,
He Kaiming’s team proposed the ResNet network structure. ResNet’s residual structure
can effectively improve the learning problem of deep neural networks, making it possible
to train deeper networks.
Combining the idea of Inception feature superposition with residual connections, this paper designs four basic modules. Each module functions as an independent convolution stage in the whole network, effectively improving the network's capability for feature selection and feature fusion, ensuring timely updating of network parameters, and improving model accuracy. Multiple parallel NiN [13] structures replace the original convolution layers in the ResNet module, giving Inception ResNet module 1, shown in Fig. 1. Module 1 acts on the second convolution stage of VGG16.
Fig. 1. Inception ResNet module 1

The advantages of this approach are as follows. First, feature maps extracted by different convolution kernels are superimposed, combining low-level features with high-level semantic features, so the feature information is richer; through training, the network has more opportunities to select features, improving model accuracy; and since the convolution layers are connected in series, features are extracted multiple times, more delicately than by a single convolution layer. Second, without increasing network parameters, the network width and depth are increased, and the generalization
ability of the model is improved. Third, a 1 × 1 convolution layer, Conv1, is added before the parallel structure; the 1 × 1 convolution realizes cross-channel information interaction and integration, automatically increases or decreases the number of feature channels, and largely preserves the original input features. Fourth, the number of output channels of Conv1 is twice the number of input channels, in order to expand the number of input feature maps and better extract image features; the channel merging layer strengthens the connection between low-level and high-level features and improves model accuracy. Fifth, module 1 continues to use small convolution kernels, so the network passes through multiple nonlinear activation functions, which strengthens its nonlinear capability and improves training accuracy. Sixth, the ResNet module makes back-propagation faster and more accurate [14] and updates the weight parameters, so the network depth can be increased without harming accuracy.
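
The exact layer layout is given in Fig. 1. Purely to illustrate the pattern described above (a 1 × 1 expansion, parallel branches, channel merging, a 1 × 1 projection, and a residual shortcut), here is a minimal PyTorch sketch; the paper's experiments use MATLAB, and the branch and channel choices below are assumptions, not the authors' exact configuration:

    import torch
    import torch.nn as nn

    class InceptionResNetBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            # 1x1 expansion: twice the number of input channels, as described above.
            self.expand = nn.Conv2d(channels, 2 * channels, kernel_size=1)
            # Parallel branches: serial 3x3 convolutions and a 1x1 convolution.
            self.branch3 = nn.Sequential(
                nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ELU(),
                nn.Conv2d(channels, channels, 3, padding=1))
            self.branch1 = nn.Conv2d(2 * channels, channels, kernel_size=1)
            # 1x1 projection back to the input width so the residual add is valid.
            self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)
            self.act = nn.ELU()

        def forward(self, x):
            h = self.act(self.expand(x))
            h = torch.cat([self.branch3(h), self.branch1(h)], dim=1)  # channel merge
            return self.act(self.merge(h) + x)                        # residual connection

    # Shape check on a dummy feature map:
    y = InceptionResNetBlock(64)(torch.randn(1, 64, 56, 56))
    print(y.shape)  # torch.Size([1, 64, 56, 56])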
Figure 2 shows Inception ResNet module 2. This module, an improved version of module 1, is used in the third convolution stage of VGG16. Building on module 1, it splits each 3 × 3 convolution kernel into a series of 1 × 3 and 3 × 1 kernels, which reduces the number of parameters and increases the network's nonlinearity while capturing horizontal and vertical features, enriching the extracted features and further improving the accuracy of the model.

Fig. 2. Inception ResNet module 2

Figure 3 shows ResNet module 3, which is designed for the fourth convolution stage of VGG16 using the ResNet idea and 1 × 1 convolutions. Module 3 exploits the fact that a stack of three 3 × 3 convolution kernels has the same receptive field as a single 7 × 7 kernel. When extracting higher-level semantic features, image categories can be better differentiated, so the network can classify images more effectively.

Fig. 3. ResNet module 3



Figure 4 shows ResNet module 4, an improved version of module 3. Compared with module 3 it has fewer parameters, and it acts on the fifth convolution stage of VGG16. Its design concept is the same as that of module 2: while obtaining high-level semantic features, it can still capture the horizontal and vertical features in the feature map, enhancing the nonlinearity of the network and making it better suited for classification.

Fig. 4. ResNet module 4

2.2 Overall Network Structure

VGGNet performs outstandingly in classification tasks, which proves that using small 3 × 3 convolution kernels, 2 × 2 max pooling, and increased network depth can effectively improve a model. However, VGG16 has a large number of parameters, consumes disk space, and is deep, so its parameters cannot be updated in a timely, effective way and the model is not ideal.
Therefore, on the basis of VGG16, this paper uses the designed Inception ResNet modules and ResNet modules to replace the convolution stages of the original VGG16, solving the problems of ineffective parameter updates, a large number of parameters, and insufficient feature extraction. The new network is named VGG16-IR, as shown in Fig. 5.
For a fair comparison between VGG16-IR and VGG16, the feature extraction network in this paper is also composed of five convolution stages. The first stage is unchanged; the other convolution stages are replaced by the designed modules and

Fig. 5. VGG16-IR

1 × 1 convolutions. The 1 × 1 convolution here ensures that the output channels of each convolution stage are consistent with VGG16. Each stage is followed by a max-pooling layer, so that the output feature map size and number of channels after pooling are the same in the two networks.
In the specific convolution path of the network, each convolution module is composed of multiple 1 × 1 and 3 × 3 convolution layers. All 3 × 3 convolution kernels in the network use a stride of 1 and padding of 1, so features can be extracted repeatedly and their delicacy and accuracy improved. The 1 × 3 and 3 × 1 convolution kernels use "same" padding, and the 1 × 1 convolution kernels use a stride of 1 and padding of 0, ensuring that features extracted within a module have the same size. The ELU activation function is used after each convolution layer to prevent the gradient disappearance that may occur during training, and the max-pooling layer retains the most significant features of the image. Given the diversity of commodity images, in the Inception ResNet and ResNet modules the forward propagation connects lower-layer and higher-layer features well and promotes the fusion of features from different network layers, while the backward propagation speeds up parameter updates, improving the model. The comparison of VGG16-IR and VGG16 parameters is shown in Table 1.
Among them, VGG16 has 14,710,464 convolution parameters and VGG16-IR has 8,943,360, a reduction of 5,767,104.
The reduced parameter count lowers the risk of overfitting to some extent. There are other ways to reduce overfitting: obtaining more data lets the model learn more effective features, and L2 regularization [15] adds regularization constraints on the model parameters by adding the weight values to the loss function.

Table 1. Number of VGG16-IR and VGG16 parameters

Network model  Conv phase 1 params  Conv phase 2 params  Conv phase 3 params  Conv phase 4 params  Conv phase 5 params
VGG16  38592  221184  1474560  5898240  7077888
VGG16-IR  38592  78912  175104  2621440  6029312

Ensemble learning can also reduce the risk of overfitting a single model. However, since this paper studies the influence of the new model on the experimental results, these overfitting-reduction methods are not used, in order to limit other influencing factors.

3 Optimization of Network
3.1 Selection of Optimization Algorithm
The essence of neural network training is optimization: making the trained parameters fit the images as well as possible. The most widely used method in neural networks is stochastic gradient descent [16], whose principle is that a function decreases fastest along the negative gradient direction. In this paper, the cross-entropy loss is computed from the softmax outputs, and the small-batch gradient descent algorithm is used.
Assume a batch is a set of $m$ training samples $\{(x^{(1)}, y^{(1)}), \cdots, (x^{(m)}, y^{(m)})\}$. The loss function is shown in formula (3):

$$J(W, b) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{n} y_k^{(i)} \ln o_k^{(i)} \qquad (3)$$

Here $o_k^{(i)}$ is the predicted probability that sample $i$ belongs to category $k$, computed with the softmax function as in formula (4), where $f_k$ denotes the $k$th component of the last-layer output:

$$o_k^{(i)} = \frac{e^{f_k(W,b,x^{(i)})}}{\sum_{j=1}^{n} e^{f_j(W,b,x^{(i)})}} \qquad (4)$$

$y_k^{(i)}$ is the actual probability ($y_k^{(i)} = 1$ if the true category of the $i$th sample is $k$, otherwise $y_k^{(i)} = 0$), $x^{(i)}$ is the network input, $f(W, b, x^{(i)})$ is the output of the last fully connected layer, $W$ is the weight parameter, and $b$ is the bias parameter.

To solve for the optimal parameters $(W, b)$, the gradient descent method takes the partial derivatives $\frac{\partial J(W,b)}{\partial W}$ and $\frac{\partial J(W,b)}{\partial b}$ of $J(W, b)$ with respect to $W$ and $b$. The parameters $W_t$ and $b_t$ are then updated along their negative gradient directions to obtain $W_{t+1}$ and $b_{t+1}$, with learning rate $a$, as shown in formulas (5) and (6):

$$W_{t+1} = W_t - a\,\frac{\partial J(W_t, b_t)}{\partial W_t} \qquad (5)$$

$$b_{t+1} = b_t - a\,\frac{\partial J(W_t, b_t)}{\partial b_t} \qquad (6)$$

Although gradient descent is also called the steepest descent method, each iteration always advances toward a local minimum, so it easily yields a locally optimal solution. To solve this problem, a momentum term [17] is introduced to accelerate convergence and stabilize the convergence curve. The weight parameter $W_{t+1}$ with momentum is updated as in formulas (7) and (8):

$$V_t = \mu V_{t-1} + a\,\frac{\partial J(W_t, b_t)}{\partial W_t} \qquad (7)$$

$$W_{t+1} = W_t - V_t \qquad (8)$$

For the update of the bias parameter $b_{t+1}$, see formulas (9) and (10):

$$U_t = \mu U_{t-1} + a\,\frac{\partial J(W_t, b_t)}{\partial b_t} \qquad (9)$$

$$b_{t+1} = b_t - U_t \qquad (10)$$

$\mu$ is the momentum, ranging from 0 to 1; its value is 0.9 in this paper. $V_t$ is the update direction of the weight parameter $W_t$ at the next moment, and $U_t$ is the update direction of the bias parameter $b_t$ at the next moment. The descent direction is mainly the accumulated previous descent direction, slightly biased toward the current one; this is the SGDM optimization algorithm.
At the same time, the learning rate $a$ also determines the performance of the model. To accelerate convergence and improve accuracy, a decaying learning rate is usually adopted: when the error curve reaches a plateau, reducing the learning rate allows a more refined adjustment.
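
A minimal Python sketch of the update rules (7)–(10), assuming gradients are supplied by the caller; the toy loss below is only for demonstration:

    def sgdm_step(W, b, grad_W, grad_b, V, U, lr=0.001, mu=0.9):
        # Accumulate velocity, formulas (7) and (9), then descend, (8) and (10).
        V = mu * V + lr * grad_W
        W = W - V
        U = mu * U + lr * grad_b
        b = b - U
        return W, b, V, U

    # Toy quadratic loss J = 0.5*(W^2 + b^2), whose gradients are W and b themselves.
    W, b, V, U = 5.0, 3.0, 0.0, 0.0
    for _ in range(1000):
        W, b, V, U = sgdm_step(W, b, W, b, V, U)
    print(W, b)  # both approach 0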

3.2 Selection of Activation Function


In the original VGG16, ReLU is chosen as the nonlinear activation function after each convolutional layer, but gradient disappearance occurs in actual training: since the function is identically zero on the negative half axis, its gradient there is always zero and the parameters cannot be updated, so using the ReLU activation function can completely deactivate many neurons in the network.
To solve the gradient disappearance problem of ReLU during training, this paper selects the ELU [18] activation function, shown in formula (11).

$$\mathrm{ELU}(x) = \begin{cases} x, & x \ge 0 \\ a(e^{x} - 1), & x < 0 \end{cases} \qquad (11)$$

Compared with ReLU, the gradient of the ELU activation function on the negative half interval is $a e^{x}$, so the function contains no dead zone. When the input is greater than or equal to 0, the gradient is the same as ReLU's. The parameter $a$ is set to 1 to ensure the gradient at $x = 0$ is 1. For $x < 0$, the gradient lies between 0 and 1 and the parameters can still be updated, which solves the problem that ReLU outputs no negative values and its neurons die.
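
For reference, formula (11) with $a = 1$ and its gradient can be written in a few lines of NumPy (a hypothetical helper, not the paper's code):

    import numpy as np

    def elu(x, a=1.0):
        # Formula (11): identity for x >= 0, a*(e^x - 1) for x < 0.
        return np.where(x >= 0, x, a * (np.exp(x) - 1.0))

    x = np.array([-2.0, -0.5, 0.0, 1.5])
    print(elu(x))                           # smooth negative saturation, no dead zone
    print(np.where(x < 0, np.exp(x), 1.0))  # gradient: e^x for x < 0 (a = 1), 1 otherwise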

4 Experiment and Analysis


4.1 Data Set Processing
This paper studies the classification of commodity images. The experimental data comes from the PI100 commodity image set established by Microsoft Research Asia: 100 categories, 100 images per category, each 100 × 100 pixels. In each experiment, a total of 4,000 images are used as the training set and the remaining 1,000 as the test set.
Commodity images have the following characteristics: the background is mostly white with little interference; the target product is in the middle of the image, occupying most of the space; and most shots are frontal. These features help image classification, but there is still a gap to practical application, so the actual images need to be preprocessed.

Table 2. Sample PI100 data set

[Table of sample images for the commodity categories cup, electric drill, eardrop, and guitar; images omitted.]

Data preprocessing in this paper mainly consists of manually labeling the data set and changing the image resolution to 224 × 224. Table 2 shows part of the modified data set.
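
The resizing step could be done, for example, with Pillow; a one-off sketch with an illustrative file path:

    from PIL import Image

    img = Image.open("pi100/cup/0001.jpg")  # illustrative path, not the authors' layout
    img = img.resize((224, 224))            # modify the resolution to 224 x 224
    img.save("pi100_224/cup/0001.jpg")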

4.2 Network Model Parameter Setting


The initial weights are initialized with Xavier initialization [19], with mean 0 and variance 2/(numIn + numOut), where numIn = k × k × C_in and numOut = k × k × k_num; k is the convolution kernel size, C_in is the number of input channels, and k_num is the number of convolution kernels. Biases are 0. Momentum is set to 0.9. The initial learning rate is 0.001, with a learning rate decay factor of 0.5. With a batch size of 25, the 4000 training images require 160 iterations, which are taken as one round. The number of training rounds is set to 20, giving 3200 training iterations. The learning rate decays once every 5 rounds, and a test is carried out once every 10 iterations.
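
Were these settings reproduced in PyTorch rather than MATLAB, the configuration would look roughly like the sketch below; the placeholder model and the omitted data pipeline are assumptions:

    import torch

    model = torch.nn.Linear(10, 2)  # placeholder standing in for VGG16-IR

    # Sect. 4.2 settings: lr 0.001, momentum 0.9, decay factor 0.5 every 5 rounds.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)

    batch_size, train_images, rounds = 25, 4000, 20
    iters_per_round = train_images // batch_size  # 160 iterations = one round
    for _ in range(rounds):                       # 20 rounds = 3200 iterations
        # ... 160 mini-batch forward/backward/optimizer.step() calls go here ...
        scheduler.step()                          # halves the learning rate every 5 rounds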

4.3 Experimental Results and Analysis


This paper uses MATLAB as the deep learning framework. To analyze the reliability of the proposed VGG16-IR model, a total of 6 experiments are carried out: on the PI100 data set, VGG16 and VGG16-IR are each trained and tested 3 times, and the experimental results are averaged. The final results are shown in Table 3, comparing product image classification by the classic VGG16 network with the VGG16-IR results of this paper.

Table 3. Comparison of VGG16 and VGG16-IR results

Training accuracy Test accuracy Training loss Test loss


VGG16 experiment 1 88.00% 89.90% 0.2175 0.4952
VGG16 experiment 2 100.00% 88.20% 0.0210 0.5841
VGG16 experiment 3 96.00% 89.20% 0.0581 0.4936
VGG16-IR experiment 1 100.00% 91.40% 0.0053 0.3853
VGG16-IR experiment 2 100.00% 90.70% 0.0027 0.3969
VGG16-IR experiment 3 100.00% 89.80% 0.0037 0.4627

Train-accuracy $= n_{\text{train-correct}}/25$, where $n_{\text{train-correct}}$ is the number of correctly classified samples in a training batch, and train-loss $= -\frac{1}{25}\sum_{i=1}^{m}\sum_{k=1}^{n} y_k^{(i)} \ln o_k^{(i)}$. Test-accuracy $= n_{\text{test-correct}}/1000$, where $n_{\text{test-correct}}$ is the number of correctly classified test-set samples, and test-loss $= -\frac{1}{1000}\sum_{i=1}^{m}\sum_{k=1}^{n} y_k^{(i)} \ln o_k^{(i)}$. The average training accuracy of VGG16
is 94.67%. The average test accuracy is 89.10%. The average training loss is 0.0989, and
the average test loss is 0.5243. The average training accuracy of VGG16-IR is 99.69%.
The average test accuracy is 90.63%. The average training loss is 0.0039, and the average
test loss is 0.4150.
It can be seen from Table 3 that the VGG16-IR network used in this paper is better
than VGG16 in terms of accuracy and loss index, indicating that the model and algorithm
adopted in this paper are improved compared with VGG16. This is mainly due to the

feature extraction capability of Inception residual module, which increases the network’s
ability of feature selection and utilization, and then increases the ability of commodity
image recognition. In addition, through SGDM optimizer and ELU activation function,
the model accuracy is further improved, and the test accuracy is increased by 1.50%,
indicating that the generalization ability of the method in this paper is increased, and the
model performance is improved. The training accuracy increased by 5.02%.
Figure 6 shows the training accuracy over the first 1000 iterations for the three VGG16 runs.

[Line chart: training accuracy (0–100%) vs. iteration (1–1000) for VGG16 experiments 1–3.]

Fig. 6. VGG16 training progress chart for the first 1000 iterations

Figure 7 shows the test accuracy over the first 1000 iterations for the three VGG16 runs.

[Line chart: test accuracy (0–90%) vs. iteration (1–1000) for VGG16 experiments 1–3.]

Fig. 7. VGG16 testing progress chart for the first 1000 iterations

Figure 8 shows the training accuracy over the first 1000 iterations for the three VGG16-IR runs.

[Line chart: training accuracy (0–100%) vs. iteration (1–1000) for VGG16-IR experiments 1–3.]

Fig. 8. VGG16-IR training progress chart for the first 1000 iterations

Figure 9 shows the test accuracy over the first 1000 iterations for the three VGG16-IR runs.

[Line chart: test accuracy (0–100%) vs. iteration (1–1000) for VGG16-IR experiments 1–3.]

Fig. 9. VGG16-IR testing progress chart for the first 1000 iterations

The x-axis represents the number of iterations and the y-axis the accuracy. Comparing the training and test progress charts of VGG16 with those of VGG16-IR, the training curve of VGG16-IR is steeper and fewer iterations are needed to reach a stable state, mainly because VGG16-IR uses a wider, deeper architecture with residual connections: the residual structure and the momentum gradient descent algorithm accelerate parameter updates, producing a steep training curve. For the same number of iterations, the test accuracy is higher than that of VGG16. Comparing the training and test progress charts of VGG16-IR also shows intuitively that its generalization performance is improved. Table 4 reports the test accuracy at different iteration counts.

Table 4. Test accuracy of VGG16 and VGG16-IR at the nth iteration

Iterations  100  200  300  400  500  600  700  800

VGG16  6.63%  29.47%  48.73%  55.70%  64.73%  69.00%  72.73%  78.90%
VGG16-IR  50.67%  68.73%  75.63%  79.50%  81.47%  83.70%  85.60%  86.50%

Figure 10 shows the prediction effect of the VGG16-IR model on PI100 products. Each prediction title consists of the predicted label and the prediction confidence: curtain 89.74%, electric drill 90.50%, and pendant 89.70%. The model predicts commodity categories with an accuracy of about 90%.

Fig. 10. Prediction effect diagram

5 Conclusion
This paper’s main contributions and advantages are as follows: (1) Inception ResNet
module and the ResNet module are adopted in the overall network model, which strength-
ens the network’s ability to capture and utilize image features and improves the ability
to classify commodity images accurately. (2) An ELU activation function is added after
each convolution layer to avoid the phenomenon of gradient disappearance and enable
all neurons to exert effects in the network. (3) In order to avoid the network falling into
the local minimum point, a small batch random gradient descent algorithm with momen-
tum term is used to maintain the true descent direction and accelerate the convergence
speed.
The model in this paper is tested on the Microsoft PI100 commodity image data set, reaching a test accuracy of 90.63%, which significantly improves the classification of commodity images. Compared with the existing general convolutional network model VGG16, the VGG16-IR of this paper has higher classification accuracy: the proposed Inception ResNet modules reduce the training parameters, effectively accelerate training, and improve model accuracy. The next research focus is to continue exploring more efficient feature-learning modules.

Acknowledgment. This work is supported by the Natural Science Foundation of Heilongjiang


Province of China (No. LH2020F007).

References
1. Parashivamurthy, R., Naveena, C., Sharath, K.Y.H.: SIFT and HOG features for the retrieval
of ancient Kannada epigraphs. IET Image Process. 14(17), 4657–4662 (2020)
2. Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of the 7th
IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157(1999)
3. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings
of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition,
vol. 1, no. 1, pp. 886–893(2005)

4. Akshit, K., Pavan, D., Aarya, V., Manan, S.: A comprehensive comparative study of artificial neural network and support vector machine on stock forecasting. Ann. Data Sci. 1–26 (2021). https://doi.org/10.1007/s40745-021-00344-x
5. Ke, X., Chen, X.F., Li, S.Z.: Flower image retrieval based on multi-feature fusion. Comput.
Sci. 37(11), 282–286 (2010)
6. Liu, C.X., Chu, J.H., Lu, W.: Image salient judgment based on multi-feature fusion. Comput.
Eng. Appl. 50(9), 150–154 (2014)
7. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. Comput. Sci. (2014)
8. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
9. Szegedy, C., Liu, W., Jia, Y.Q.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, Boston, pp. 1–9 (2015)
10. He, K.M., Zhang, X.Y., Ren, S.Q., Sun, J.: Spatial pyramid pooling in deep convolutional
network for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1904–1911
(2015)
11. Zhang, H., Fusheng, Y., Sun, J., Shen, X., Li, K.: Deep learning for sea cucumber detection
using stochastic gradient descent algorithm. Eur. J. Remote Sens. 53(1), 53–62 (2020)
12. He, K.M., Zhang, X.Y., Ren, S.Q.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
13. Lin, M., Chen, Q., Yan, S.C.: Network in network. In: Proceedings of International Conference
on Learning Representations (2014)
14. Mehmet, A., Özgür, K., Tefaruk, H.: Suspended sediment prediction using two different
feed-forward back-propagation algorithms. Can. J. Civ. Eng. 34(1), 120–125 (2007)
15. Lv, W., Chen, Y.G., Shen, C.: Convergence analysis of backward iterative neural network
algorithm with L2 regularization term. Inf. Technol. Inf. 6, 183–186 (2015)
16. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407
(1951)
17. Qian, N.: On the momentum term in gradient descent learning algorithms. Neural Netw. 12(1),
145–151 (1999)
18. Bai, Y., Jiang, D.M.: Application of improved ELU convolutional Neural network in SAR
image ship detection. Bull. Surveying Mapp. 1, 125–128 (2018)
19. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. J. Mach. Learn. Res. 9, 249–256 (2010)
Improved Algorithm of Multiple Minimum
Support Association Rules Based on Can Tree

Xiao Zhao(B) and Shi Yong Ning

Harbin University of Commerce, Harbin, Hei Longjiang, China


101102@hrbcu.edu.cn

Abstract. To improve the mining efficiency of the Can Tree for association rules, this study observes that the header table in the traditional Can Tree mining algorithm traverses paths repeatedly, causing excessive overhead, and that a single support threshold leads to invalid rules. The improved Can Tree mining algorithm simplifies the structure and removes the header table, making it better match the Can Tree construction idea: each path is traversed only once, avoiding multiple traversals, and adding multiple minimum supports does not introduce much overhead. Experiments show that the improved Can Tree mining outperforms the traditional mining algorithm.

Keywords: Data mining · Can Tree · Association rule mining · Frequent


itemsets · Multiple minimum support

1 Introduction

Association rules have been an important field of data mining since they were proposed by
Agrawal [1]. With fixed data sets and support, scholars continue to mine static association
rules. The classical algorithms include Apriori [2], FP-growth [3], Eclat [4], etc.
Over time, people have become concerned with how to mine data better and update existing mining results at the least cost, namely dynamic association rule mining. Dynamic association rules refer to mining association rules when the database is modified or the minimum support changes. Incremental mining has been a research topic in dynamic mining since Cheung et al. first proposed the incremental mining algorithm FUP [5, 6], built on the Apriori framework, to solve the incremental updating problem. Afterward, under the tree-structure framework, Koh et al. proposed AFPIM [7], an incremental mining algorithm based on the FP tree. To achieve the effect of building once and mining many times, and to overcome the shortcoming of repeatedly scanning the database, Cheung et al. improved the tree structure and proposed the compressed and arranged transaction sequence tree (CATS tree) [8], with its mining algorithm FELINE, which improves compressed storage through continuous sorting. Aiming at the CATS tree's shortcomings, Leung et al. proposed the canonical-order tree (Can Tree), sorted according to a fixed canonical order [9]. Compared for incremental mining, the CATS tree is found to be more suitable for interactive mining


because the insertion of transaction items needs to adjust the position according to the
frequency. Can Tree is especially suitable for incremental mining because of its stability.
In practical applications, the single support-confidence framework cannot fully express the differences among the components of the mining object in their characteristics and actual occurrence frequencies, so commonplace or useless information is often mined. To respect the characteristics of each item and mine each item fully, researchers have proposed association rule mining with multiple minimum supports [10–13].

2 Can Tree
2.1 Related Theories of Can Tree Construction
The Can Tree (canonical-order tree) adopts the prefix tree model and arranges the transaction items in a user-defined order, generally lexicographic, alphabetical, or by the user's interest. The sorted transaction items are inserted directly into the tree, whose root node is "null"; the Can Tree can have multiple subtrees. Once the order is determined, the transaction tree is unique, and inserting a transaction item does not require a large-scale search for the insertion position. Figure 1 shows the construction process. The minimum support does not participate in the construction of the Can Tree, only in the mining process after the tree is built, so the algorithm can perform frequent pattern mining under different supports without rebuilding the tree structure.

Fig. 1. The process of building a can tree

As for mining the tree, the proposers of the Can Tree did not describe it specifically but suggested one-way mining with a divide-and-conquer idea like that of the FP tree, as shown in Fig. 2. Because of the Can Tree's unique properties, scholars have continued in-depth research. In 2008, Zou Li-Kun et al. [14] used the Can Tree structure to make up for deficiencies of the FP tree and proposed an algorithm for rapidly generating a conditional pattern tree, which improved efficiency. In 2014, Chen Gang et al. proposed a fast construction algorithm based on the Can Tree (FCCAN) [15], which adds an auxiliary hash-table-based storage structure on top of the rapid generation of conditional pattern trees to reduce the search time for items. Also in 2014, Hu Junquan proposed a Can Tree [16] sorted by data quantity, which reduces the size of the Can Tree.

Fig. 2. Can Tree mining under the idea of FP mining

2.2 Minimum Support


Downward closure: all subsets of a frequent itemset are also frequent itemsets.
In the multiple-support model, the threshold for a frequent itemset is no longer a single fixed value but multiple values set according to the characteristics of each item. In generating frequent itemsets, this targeted treatment of each data item is more accurate and flexible than screening with a fixed value, and invalid rules are avoided. In traditional association rule algorithms, the mined frequent itemsets all satisfy downward closure, but multiple minimum supports do not: a frequent itemset under the multiple-support definition only needs support greater than or equal to the minimum support of its least-constrained item, so the mined itemsets can include infrequent subsets. For example, consider {(a, c, d, f): 3} with minimum supports {a: 6, c: 5, d: 2, f: 3}; MIN(a, c, d, f) = min(d) = 2 and support{(a, c, d, f)} = 3 > MIN(a, c, d, f), so (a, c, d, f) is a frequent itemset, but its subset (a, c) has support{(a, c)} = 3 < MIN(a, c) = 5, so {a, c} is not a frequent itemset.
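
The example can be checked mechanically; the small sketch below (names illustrative) evaluates the multiple-minimum-support condition and exposes the loss of downward closure:

    # Per-item minimum supports and itemset counts from the example above.
    minsup = {"a": 6, "c": 5, "d": 2, "f": 3}
    counts = {frozenset("acdf"): 3, frozenset("ac"): 3}

    def is_frequent(itemset, counts, minsup):
        # Frequent under multiple minimum supports: count >= the smallest
        # minimum support among the itemset's items.
        return counts[frozenset(itemset)] >= min(minsup[i] for i in itemset)

    print(is_frequent("acdf", counts, minsup))  # True:  3 >= min(6,5,2,3) = 2
    print(is_frequent("ac", counts, minsup))    # False: 3 <  min(6,5)     = 5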

3 Optimization and Improvement Based on Can Tree


3.1 Optimization of Traversal
Firstly, the use of the traditional header table is removed, the next item position pointing
to the node with the same name is not recorded, and only the transaction item set table
is used to save the item set information. Because the Can tree scans the database only
once, a data scan into memory is directly merged into the Can Tree after it is arranged
in a fixed order. No data is deleted, so the head table is not needed to screen frequent 1
item set (the support degree is greater than or equal to the minimum support degree) in
the tree construction stage.
Secondly, in the mining stage, Can Tree mining based on a shared stack is proposed: each path is traversed only once. When the conditional pattern base is obtained from bottom to top along a leaf node, the conditional pattern bases of all data items on the path are obtained and saved together. That is, starting from the root node, the tree nodes are traversed depth-first from top to bottom; during this process the nodes are continuously pushed onto the stack. When a leaf node is reached, its conditional pattern base is extracted and the node is popped, and the other children of its parent are then traversed. For example, in the figure below, traversal starts from the root node "Null" and the depth-first path {a, d, b} is pushed onto the stack, giving stack contents <a, d, b>. Node b is a leaf, so b is popped and the remaining stack elements are extracted as b's conditional pattern base, {(a, d): 1}; backtracking to d, which has no untraversed children, d is popped and its conditional pattern base {a: 3} extracted; backtracking to a, which still has unprocessed children, the deep traversal continues (Fig. 3).

Fig. 3. Optimization of traversal using stack
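
A minimal Python sketch of this single-pass, stack-assisted traversal follows; the node structure and names are assumptions, and the prefix is recorded on entering a node, which is equivalent to extracting it when the node is popped:

    class Node:
        def __init__(self, name, count, children=()):
            self.name, self.count, self.children = name, count, list(children)

    def collect_cpbs(root):
        # Depth-first walk with an explicit path stack; each node's conditional
        # pattern base entry is the stack below it, so every path is read once.
        cpbs = {}
        def walk(node, stack):
            prefix = [name for name, _ in stack]
            if prefix:
                cpbs.setdefault(node.name, []).append((prefix, node.count))
            stack.append((node.name, node.count))
            for child in node.children:
                walk(child, stack)
            stack.pop()  # backtrack once all children are traversed
        for child in root.children:
            walk(child, [])
        return cpbs

    # Tree from the example: null -> a(3) -> d(3) -> b(1).
    root = Node(None, 0, [Node("a", 3, [Node("d", 3, [Node("b", 1)])])])
    print(collect_cpbs(root))  # {'d': [(['a'], 3)], 'b': [(['a', 'd'], 1)]}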

3.2 Adding Multiple Minimum Supports to the Can Tree

At present, research on multiple minimum supports mainly focuses on the traditional Apriori and FP-Growth algorithms. The Can Tree is constructed in a user-defined order and contains all data items, which already makes the mining process long, and additional minimum supports would seem to increase the processing time further. However, since the Can Tree must process every node on a path when traversing the tree anyway, treating the multiple minimum supports as a single minimum support with multiple values does not increase the processing time.
To record the minimum support of each item, a column storing the minimum support is added to the transaction item-set table. Although the minimum support is set separately for each item, setting it too high or too low is just as unreasonable as with a single support. To avoid the bias of manually assigned minimum supports, the minimum support of an item is determined by the item's frequency in the database and a manually set maximum minimum support, using the following formula.
$$\frac{maxminsupport}{firstsupport} = \frac{minsupport}{isupport} \qquad (1)$$

Here maxminsupport is the manually set maximum minimum support; firstsupport is the support of the most frequent data item; minsupport is the minimum support to be set for the current data item; and isupport is the support of the current item.
If the left-hand side of the formula is written as α, then each item's minimum support is α times its support, so the items sorted in descending order of support are necessarily also sorted in descending order of minimum support.

In this way, in the tree-building stage, sorting by support order is the same as sorting by minimum support order. For a Can Tree constructed in this order, consider a path from the root node to a leaf node: if the support of the leaf node is greater than its own minimum support, then the path must be frequent, because the minimum support of the leaf node is the minimum of the minimum supports of all nodes on the path, that is, minsupport(leaf) = min{min(I1), min(I2), …, min(In)}, where I1–In are the nodes on the path. Similarly, when extracting the conditional pattern base for a node, it is not necessary to check whether the support of the minimum-support item in the conditional pattern base exceeds its own minimum support; only the support of this node and this path need to be checked. The problem is thus reduced to traditional mining under a multiple-minimum-support model. Although only the overall maximum minimum support is set manually, the minimum supports of the other items are derived from it, which largely avoids the errors caused by manually setting a minimum support for each item; determining the other items' minimum supports in proportion to the most frequent item in the database is more reasonable. In conclusion, the frequent itemsets expanded and mined according to formula (1) can be generated recursively and satisfy downward closure, because all subsets of the frequent candidate itemsets produced recursively after L2 are frequent (Fig. 4).

Fig. 4. Optimization with the addition of multiple minimum support
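
Formula (1) translates directly into code; a sketch under assumed item supports (all values illustrative):

    def item_min_supports(supports, maxminsupport):
        # Formula (1): minsupport_i = support_i * maxminsupport / firstsupport.
        alpha = maxminsupport / max(supports.values())  # alpha = LHS of formula (1)
        return {item: alpha * s for item, s in supports.items()}

    supports = {"a": 100, "c": 80, "d": 40, "f": 20}  # illustrative item supports
    print(item_min_supports(supports, maxminsupport=10))
    # {'a': 10.0, 'c': 8.0, 'd': 4.0, 'f': 2.0}: descending support order is
    # exactly descending minimum-support order, as used when building the tree.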

4 Algorithm Process
According to the above, combined with the characteristics of the Can Tree, the database is scanned twice. During the first scan, all items are sorted in descending order of support to make an item-set order table, and the minimum support of each item is calculated. During the second scan, each transaction is sorted according to the item-set order table, and the sorted data items are inserted directly into the Can Tree with "null" as the root node to build the tree. During tree construction, common items are searched layer by layer from the root node; if a common item exists, the support count of the data item under that path is incremented by one. The pseudocode is as follows.

    if (thisItem.getChildren() != null) {
        if ((tmpRoot = subTreeRoot.findChild(record.peek())) != null) {
            tmpRoot.countIncrement(1);   // shared prefix: increase the count
            subTreeRoot = tmpRoot;
            record.poll();
        } else {
            Node node = new Node();      // no shared prefix: create a new node
            node.count = 1;
            addNode();
        }
    }

After the tree is built, the stack is initialized and the tree is traversed depth-first, pushing items onto the stack. With descending support order (equivalently, descending minimum support order) as the measurement standard, once an item's support is less than its minimum support, that path is not traversed further and the item is not pushed. If untraversed path nodes remain under the node at the top of the stack, the other child paths are traversed. Each time a node is popped, its conditional pattern base is extracted from the stack and saved. The pseudocode is as follows.

    if (nowNode.getName() != null)
        stack.push(nowNode.getName());
    if (nowNode.getChildren() != null) {
        for (TreeNode child : nowNode.getChildren())
            if (child.getCount() > itsMin)   // prune: support below the item's minimum support
                CPBtable = traversal(child, stack, singleFrequentList);
    }
    if (!stack.empty())
        stack.pop();                          // pop and record the conditional pattern base

5 Experiment

The experimental environment is an Intel® Core™ i7-8750H CPU @ 2.20 GHz with 8 GB of memory, running Windows 10; all algorithms are implemented in Eclipse (version 3.7.0, Java 1.8.0). Compared with other algorithms, the Can Tree saves all the original database information, including frequent and infrequent itemsets, so it inevitably occupies more memory. Although the database is traversed only once, the tree is large, and since the minimum support does not participate in tree construction, there is no difference in running time on a fixed data set during the tree-building stage. When the support is large and there are only a few frequent itemsets, Can Tree construction takes much longer than algorithms that scan the database twice, led by the FP algorithm. The simplified Can Tree, with its header table removed and a compact tree shape, saves tree-building time. The figures below compare running times on a fixed data set (Figs. 5 and 6).
Fig. 5. Running time of the simplified Can Tree experiment    Fig. 6. Experimental results of Can Tree mining after traversal optimization

In mining the Can Tree, the advantages and disadvantages are equally obvious. Although the conditional pattern bases can all be obtained by traversing each path only once, with no repeated traversal, an additional conditional-pattern-base structure must be kept in memory for the data items, which occupies more memory space.
It is also necessary to judge whether every node belongs to a frequent 1-itemset, which increases time consumption. On the incremental side, when new data is added the Can Tree does not need to be rebuilt, unlike the FP algorithm, but it still needs to process all nodes when mining. Fixed amounts of data are added repeatedly to simulate an incremental database.

6 Conclusion
Based on the Can Tree, this paper discards the constraints of the original design and uses a simplified Can Tree construction method. Using a stack to assist traversal simplifies the mining process, avoids the time wasted by repeatedly traversing paths, and greatly shortens the running time. Multiple minimum supports are integrated into the Can Tree to remedy the inaccurate rules produced under the single support-confidence framework, and a formula is used to avoid the defects of manually setting the minimum support for each item. The experimental results show that the improved algorithm greatly improves mining efficiency and is very practical.

References
1. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in
large databases. ACM SIGMOD Rec. 22(2), 207–216 (1993)
2. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: International
Conference on Very Large Data Bases, San Francisco, pp. 487–499 (1994)
3. Han, J., Pei, J.: Mining frequent patterns by pattern-growth: methodology and implications.
ACM SIGKDD Explor. Newsl. 2, 14–20 (2000)
4. Zaki, M., Parthasarathy, S., Ogihara, M., et al.: New algorithms for fast discovery of associa-
tion rules. In: 7th International Workshop Research Issues on Data Engineering, pp. 283–286
(1997)
5. Cheung, W.L., Lee, S.D., Kao, B.: A general incremental technique for maintaining discovered association rules. In: International Conference on Database Systems for Advanced Applications, Singapore, pp. 185–194 (1997)

6. Han, J., Pei, J., Yin, Y., Mao, R.: Mining frequent patterns without candidate generation: a
frequent-pattern tree approach. Data Min. Knowl. Disc. 8, 533–587 (2004)
7. Koh, J.-L., Shieh, S.-F.: An efficient approach for maintaining association rules based on
adjusting FP-tree structures. In: Lee, Y., Li, J., Whang, K.Y., Lee, D. (eds.) Database Systems
for Advanced Applications, pp. 417–424. Springer, Heidelberg (2004). https://doi.org/10.
1007/978-3-540-24571-1_38
8. Cheung, W., Zaïane, O.: Incremental mining of frequent patterns without candidate generation or support constraint. In: Proceedings of IDEAS, pp. 111–116 (2003)
9. Leung, C. K.-S., Khan, Q.I., Li, Z., Hoque, T.: CanTree: a canonical-order tree for incremental
frequent-pattern mining. Knowl. Inf. Syst. 11(3), 287–311 (2007)
10. Zhao, C.J., et al.: Algorithm for mining association rules with multiple minimum supports
based on FP-Tree. NZ J. Agric. Res. 50(5), 1375–1381 (2010)
11. Alhusaini, N.Q., Hu, M., Hawbani, A.: Improved P-tree Mine Rare Association Rule with
Multiple Minimum Support, Hong Kong (2015)
12. Liu, Y.C., Cheng, C.P., Tseng, V.S.: Discovering relational-based association rules with mul-
tiple minimum supports on microarray datasets. Bioinformatics (Oxford, England). 27(22),
3142–3148 (2011)
13. Huang, T., Cheng, K.: Discovery of fuzzy quantitative sequential patterns with multiple
minimum supports and adjustable membership functions. Inf. Sci. 222, 126–146 (2012)
14. Cheung, D.W., Han, J., NG, V.T., et al.: Maintenance of discovered association rules in large
databases: an incremental updating technique. In: Twelfth International Conference on Data
Engineering, pp. 106–144 (1996)
15. Zou, L.K., Zhang, Q.S.: Efficient incremental association rules mining algorithm based on
CANTree. Comput. Eng. 34(3), 29–31 (2008)
16. Cheng, G., Yan, Y.Z., Liu, B.Q.: A fast construction algorithm based on CAN-Tree.
Microelectron. Comput. 1, 76–82 (2014)
Mining Association Rules of Breast Cancer
Based on Fuzzy Rough Set

Min Guo1 , Tongtong Han1 , Wenjing Wang1 , and Shiyong Ning1,2(B)


1 Harbin University of Commerce, Harbin 150028, China
101102@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. In view of the increasing number of breast cancer patients, and to enable patients to predict whether they have breast cancer from their own physical examination data, this paper proposes a method for mining breast cancer association rules based on fuzzy rough sets. The proposed method first analyzes the attributes in routine blood data, then applies fuzzy rough set attribute reduction to delete the attributes irrelevant to breast cancer, and uses the Apriori algorithm to obtain the frequent itemsets among the remaining attributes, extracting practical, strong association rules with low support and high confidence. The method is verified with specific examples. The experimental results show that, compared with traditional algorithms, this method can mine more and higher-quality rules, and the extracted rules have a highly effective reference value for diagnosing and preventing breast cancer.

Keywords: Association rules · Apriori algorithm · Fuzzy rough set · Attribute


reduction · Breast cancer

1 Overview
According to statistics, the incidence and mortality of breast cancer are among the highest of all malignant tumors in women, making it one of the leading killers threatening women's life and health. Therefore, breast cancer screening is an essential strategy that enables people to find breast cancer earlier and ensures a greater probability of successful treatment. Our goal is to predict breast cancer robustly from the data collected in routine consultations and blood analyses, so the selection of the critical data is the key to predicting breast cancer.
With the deepening of researchers' work, more and more data mining algorithms are available, and finding a suitable algorithm for the problem under study is the main difficulty. Bourouis et al. [1] used an artificial neural network algorithm to analyze retinal images collected by microscope films and identify retinopathy; research shows that the BP neural network has a slow convergence speed and only average prediction accuracy. Wang Cheng [2] optimized the traditional random forest algorithm and proposed an improved random forest algorithm based on decision tree clustering reduction, which improved the accuracy and efficiency of the conventional random forest algorithm; still, it was difficult for decision trees to find rules based on combinations of multiple variables. El-Alfy [3] studied a parallel attribute reduction method for large-scale rough decision sets based on a genetic algorithm; still, rough sets have difficulty handling continuous attributes directly, so the attributes must be discretized first.
To sum up, the literature shows that any single data mining method has certain limitations, especially in the face of diverse data; it is often necessary to integrate multiple methods to obtain more practical results [4, 5]. This is where fuzzy rough sets come in. Because the breast cancer data are all continuous attribute values, if classical rough set theory were used directly for data mining, the data would need to be discretized, which would destroy some relations among the attributes and might lose some valuable relations [6–9]. Therefore, this paper uses fuzzy rough set theory based on the gain ratio to reduce the data, deleting the conditional attributes that have nothing to do with breast cancer. Finally, the Apriori algorithm is used to obtain rules that can predict breast cancer, and the extracted rules are expressed in a usable form so that people can apply them themselves.

2 Attribute Reduction Algorithm Based on Fuzzy Rough Set


2.1 Rough Set Related Concepts
Definition 1. Let U be a universe and let P, Q be two equivalence relations (i.e., knowledge) on U, with partitions U/P = {X_1, X_2, ..., X_n} and U/Q = {Y_1, Y_2, ..., Y_m}. The probability distributions induced by P and Q on U are

$$[X; p] = \begin{pmatrix} X_1 & X_2 & \cdots & X_n \\ p(X_1) & p(X_2) & \cdots & p(X_n) \end{pmatrix}, \qquad [Y; p] = \begin{pmatrix} Y_1 & Y_2 & \cdots & Y_m \\ p(Y_1) & p(Y_2) & \cdots & p(Y_m) \end{pmatrix}$$

where $p(X_i) = |X_i|/|U|$, $i = 1, 2, \ldots, n$; $p(Y_j) = |Y_j|/|U|$, $j = 1, 2, \ldots, m$; and $|E|$ denotes the cardinality of a set $E$. The Shannon entropy $H(P)$ is defined as

$$H(P) = -\sum_{i=1}^{n} p(X_i)\log_2 p(X_i) = -\sum_{i=1}^{n} \frac{|X_i|}{|U|}\log_2\frac{|X_i|}{|U|} \qquad (1)$$

The conditional entropy $H(Q|P)$ of knowledge $Q$ given knowledge $P$ is defined as

$$H(Q|P) = -\sum_{i=1}^{n} p(X_i)\sum_{j=1}^{m} p(Y_j|X_i)\log_2 p(Y_j|X_i) \qquad (2)$$

Definition 2 [8, 10]. Let DS = ⟨U, C ∪ D, V⟩ be a decision table, where C is the conditional attribute set and D is the decision attribute set, and let B ⊆ C. The mutual information between B and D is defined as

$$I(B; D) = H(D) - H(D|B) \qquad (3)$$

Definition 3 [10]. For any $a \in C - B$, the gain of attribute $a$ is defined as

$$\mathrm{Gain}(a, B, D) = I(B \cup \{a\}; D) - I(B; D) \qquad (4)$$

Definition 4 [11]. For any $a \in C - B$, the mutual-information gain ratio of attribute $a$ is defined as

$$\mathrm{GainRatio}(a, B, D) = \frac{\mathrm{Gain}(a, B, D)}{H(\{a\})} = \frac{I(B \cup \{a\}; D) - I(B; D)}{H(\{a\})} \qquad (5)$$
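
To make Definitions 1–4 concrete, the following minimal Python sketch (an illustration, not part of the original paper) computes the Shannon entropy, conditional entropy, mutual information, and gain ratio for crisp equivalence classes represented as attribute columns; the data representation is an assumption made for the example.

import math
from collections import Counter, defaultdict

def entropy(column):
    # H(P) of the partition induced by one attribute column (formula (1))
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())

def conditional_entropy(cond, dec):
    # H(D|B): entropy of the decision classes within each condition class (formula (2))
    n = len(cond)
    groups = defaultdict(list)
    for b, d in zip(cond, dec):
        groups[b].append(d)
    return sum((len(g) / n) * entropy(g) for g in groups.values())

def gain_ratio(a_col, b_cols, dec):
    # GainRatio(a, B, D) per Definitions 3-4; B is given as a list of columns
    joint_b = list(zip(*b_cols)) if b_cols else [()] * len(dec)
    joint_ba = [t + (x,) for t, x in zip(joint_b, a_col)]
    i_b = entropy(dec) - conditional_entropy(joint_b, dec)    # I(B; D)
    i_ba = entropy(dec) - conditional_entropy(joint_ba, dec)  # I(B ∪ {a}; D)
    return (i_ba - i_b) / entropy(a_col)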

2.2 Fuzzy Rough Set Related Concepts and Algorithms

Definition 5. Given a non-empty finite set $X$ on which $\tilde{R}$ is a fuzzy equivalence relation, the relation matrix $M(\tilde{R})$ is

$$M(\tilde{R}) = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{pmatrix} \qquad (6)$$

The fuzzy equivalence relation can be calculated with either of the following two similarity functions:

$$r_{ij} = \begin{cases} 1 - 4\,\dfrac{|x_i - x_j|}{a_{\max} - a_{\min}}, & \dfrac{|x_i - x_j|}{a_{\max} - a_{\min}} \le 0.25 \\ 0, & \text{otherwise} \end{cases} \qquad (7)$$

$$r_{ij} = \max\left(\min\left(\frac{x_j - (x_i - \sigma_a)}{x_i - (x_i - \sigma_a)}, \frac{(x_i + \sigma_a) - x_j}{(x_i + \sigma_a) - x_i}\right), 0\right) \qquad (8)$$
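
The following short sketch (illustrative; the NumPy representation is an assumption) computes the fuzzy relation matrix of one continuous attribute using formula (7):

import numpy as np

def fuzzy_relation_matrix(values):
    # M(R~) for one continuous attribute over all objects, per formula (7)
    x = np.asarray(values, dtype=float)
    rng = x.max() - x.min()                       # a_max - a_min
    diff = np.abs(x[:, None] - x[None, :]) / rng  # normalized pairwise distances
    return np.where(diff <= 0.25, 1.0 - 4.0 * diff, 0.0)

For six objects this yields a 6 × 6 relation matrix, one matrix per conditional attribute.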

Definition 6 [12, 13]. Let $U$ be the universe. According to the fuzzy equivalence relation $\tilde{R}$, $U$ can be divided into a fuzzy partition, defined as

$$U/\tilde{R} = \{[x_i]_{\tilde{R}}\}_{i=1}^{n} \qquad (9)$$

Definition 7 [12, 13]. The cardinality $|[x_i]_{\tilde{R}}|$ is defined as

$$|[x_i]_{\tilde{R}}| = \sum_{j=1}^{n} r_{ij} \qquad (10)$$

Definition 8 [12, 13]. Let FIS = ⟨U, A, V, f⟩ be a fuzzy information system with attribute set A, and let P and Q be two subsets of A. The joint entropy of P and Q is defined as

$$\tilde{H}(PQ) = H(R_P R_Q) = -\frac{1}{n}\sum_{i=1}^{n}\log\frac{\left|[x_i]_{\tilde{P}} \cap [x_i]_{\tilde{Q}}\right|}{n} \qquad (11)$$

Definition 9 [12, 13]. Let FDS = ⟨U, A, V, f⟩ be a fuzzy decision system. The conditional entropy of D given B is defined as

$$\tilde{H}(D|B) = -\frac{1}{n}\sum_{i=1}^{n}\log\frac{\left|[x_i]_{\tilde{B}} \cap [x_i]_{\tilde{D}}\right|}{\left|[x_i]_{\tilde{B}}\right|} \qquad (12)$$

Definition 10. $\tilde{H}(D|B) = \tilde{H}(BD) - \tilde{H}(B)$.

Definition 11. In an FDS, the mutual information between B and D is defined as

$$\tilde{I}(B; D) = \tilde{H}(D) - \tilde{H}(D|B) \qquad (13)$$

Definition 12. In an FDS, the gain of attribute a is defined as

$$\widetilde{\mathrm{Gain}}(a, B, D) = \tilde{I}(B \cup \{a\}; D) - \tilde{I}(B; D) \qquad (14)$$

Definition 13. In an FDS, the gain ratio of attribute a is defined as

$$\widetilde{\mathrm{Gain}}\_\mathrm{Ratio}(a, B, D) = \frac{\widetilde{\mathrm{Gain}}(a, B, D)}{\tilde{H}(\{a\})} \qquad (15)$$
Attribute selection based on the gain ratio is then carried out by the algorithm GAIN_RATIO_AS_FRS:

Step 1. Initialize B = ∅;
Step 2. Calculate the gain ratio Gain_Ratio(a, B, D) of each remaining conditional attribute;
Step 3. Select the attribute with the maximum gain ratio Gain_Ratio(a, B, D) and record it as a;
Step 4. If Gain_Ratio(a, B, D) > 0, let B ← B ∪ {a} and go to Step 2; otherwise go to Step 5;
Step 5. The attributes collected in B are the final result after reduction.
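
A minimal sketch of GAIN_RATIO_AS_FRS under the fuzzy entropy definitions above (illustrative; it reuses the fuzzy_relation_matrix helper from the earlier sketch, and taking the intersection of fuzzy relations as the elementwise minimum is an assumption consistent with Definitions 8–9):

import numpy as np

def fuzzy_entropy(rel):
    # H~ of the fuzzy partition induced by relation matrix rel (Definitions 7-8)
    n = rel.shape[0]
    return -np.mean(np.log(rel.sum(axis=1) / n))

def mutual_info(rel_b, rel_d):
    # I~(B; D) = H~(D) - H~(D|B), with H~(D|B) = H~(BD) - H~(B) (Definitions 10-11)
    rel_bd = np.minimum(rel_b, rel_d)  # intersection as elementwise minimum
    return fuzzy_entropy(rel_d) - (fuzzy_entropy(rel_bd) - fuzzy_entropy(rel_b))

def gain_ratio_reduct(cond_rels, rel_d):
    # GAIN_RATIO_AS_FRS: cond_rels maps attribute name -> relation matrix
    n = rel_d.shape[0]
    b_rel = np.ones((n, n))  # relation induced by B = empty set
    selected = []
    while True:
        best, best_gr = None, 0.0
        for name, rel in cond_rels.items():
            if name in selected:
                continue
            gain = mutual_info(np.minimum(b_rel, rel), rel_d) - mutual_info(b_rel, rel_d)
            gr = gain / fuzzy_entropy(rel)  # assumes H~({a}) > 0
            if gr > best_gr:
                best, best_gr = name, gr
        if best is None:  # no attribute has a positive gain ratio: stop (Step 4)
            return selected
        selected.append(best)
        b_rel = np.minimum(b_rel, cond_rels[best])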

2.3 Related Concepts of Association Rules


Definition 14. A transaction database D consists of transactions with unique TIDs, where each transaction T is a subset of the item set I, i.e., T ⊆ I. An association rule is an implication of the form X ⇒ Y.

Definition 15. The support s of an association rule is the percentage of transactions in the database D that contain X ∪ Y.

Definition 16. The confidence c of an association rule is the ratio of the number of transactions containing X ∪ Y to the number of transactions containing X:

$$\mathrm{confidence}(X \Rightarrow Y) = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)} \qquad (16)$$
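
Definitions 15–16 translate directly into code; the following small sketch (illustrative, with transactions assumed to be iterables of items) computes both measures:

def support(itemset, transactions):
    # fraction of transactions containing every item of itemset (Definition 15)
    s = set(itemset)
    return sum(s <= set(t) for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    # confidence(X => Y) = support(X ∪ Y) / support(X), per formula (16)
    return support(set(x) | set(y), transactions) / support(x, transactions)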

3 Algorithm of Breast Cancer Association Rule Mining Based on Fuzzy Rough Set

The algorithm in this paper first uses fuzzy rough set attribute reduction to extract the features affecting breast cancer from routine blood data. It then mines the rules related to breast cancer with association rule mining, using the Apriori algorithm to extract the rules. The specific process of breast cancer association rule extraction based on fuzzy rough sets is as follows:

3.1 Data Preprocessing


The primary purpose of data preprocessing is to remove noisy data. The main operations are deleting records with obvious errors and filling in missing values.

3.2 Attribute Reduction


First, use the similarity function in formula (7) to calculate the relation matrix of each attribute. Let B = ∅, then calculate the gain and gain ratio of each conditional attribute, select the attribute with the maximum gain ratio, record it as a, and let B ← B ∪ {a}. If Gain_Ratio(a, B, D) > 0, continue to search for other attributes, calculating the gain ratios of the remaining attributes until they are all equal to 0; the result is the attribute set selected by the gain ratio criterion.

3.3 Data Discretization


The Apriori association rule algorithm can only handle categorical data, so the data after attribute reduction must be discretized according to data transformation rules, with different symbols representing different ranges of each index.

3.4 Generating Frequent Itemsets L1


In rule extraction, the itemset I represents the set of attribute values of all routine blood attributes that influence breast cancer; each attribute value is one item. Calculate the support of each itemset; the itemsets whose support is no less than the support threshold constitute the frequent itemset L1.

3.5 Join Itemsets to Generate Frequent Itemsets Lk


Pairs of itemsets in Lk−1 that share their first (k − 2) items and differ in the last item are joined to generate the candidate itemset Ck. Calculate the support of each itemset in Ck; itemsets whose support is below the threshold are discarded without further computation to avoid unnecessary work. The itemsets that meet the support threshold constitute the frequent itemset Lk. Repeat the above steps until no further frequent itemsets can be generated.
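
The generation of L1 and Lk in Sects. 3.4–3.5 can be sketched as follows (illustrative; it reuses the support function from the earlier sketch and represents itemsets as frozensets):

from itertools import chain

def apriori(transactions, minsup):
    # level-wise frequent itemset mining per Sections 3.4-3.5
    items = {frozenset([i]) for i in chain.from_iterable(transactions)}
    level = {c for c in items if support(c, transactions) >= minsup}  # L1
    frequent, k = set(level), 2
    while level:
        # join step: unions of itemsets from L(k-1) that form k-itemsets
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {c for c in candidates if support(c, transactions) >= minsup}  # Lk
        frequent |= level
        k += 1
    return frequent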

3.6 Generating Association Rules

Usually, the breast cancer prediction rules are multidimensional association rules whose antecedents contain at least two items, i.e., frequent itemsets with three or more items. Find the itemsets Lk (k ≥ 3) containing breast cancer types among the frequent itemsets, take the breast cancer type as the consequent of the rule and the other attributes as the antecedents, calculate the confidence of each rule Y ⇒ Z, and retain the rules that meet the confidence threshold.

4 Case Analysis
For the convenience of analysis, the breast cancer data in the UCI data set is simplified. It is assumed that only the following five attributes are included: Age, Glucose, HOMA, Resistin, and Classification. An example of part of the original data follows (Table 1).

Table 1. Part of raw data

Age Glucose HOMA Resistin Classification


X1 24 88 1.33 6.85 1
X2 63 118 1.88320133 5.1042 1
X3 59 74 1.658774133 8.2049 2
X4 34 95 1.232827667 30.73606 2
X5 85 201 20.6307338 24.3701 2
X6 64 98 1.37788 13.91245 2

4.1 Attribute Reduction

We use fuzzy rough set theory to remove the features unrelated to breast cancer from the dataset. Table 1 can be regarded as an information system. The universe U includes six objects, U = {x1, x2, x3, x4, x5, x6}. The conditional attribute set includes four attributes, Age, Glucose, HOMA, and Resistin, marked as C1, C2, C3, and C4 respectively, i.e., C = {C1, C2, C3, C4}; the decision attribute set D has only one attribute, the disease type. First, we use formula (7) to calculate the relation matrix of each attribute.

Next, we calculate the gain Gain(a, B, D) and the gain ratio Gain_Ratio(a, B, D) of each conditional attribute, first setting B = ∅:

Gain(C1, ∅, D) = Ĩ({C1}; D) = 0.34507503
Gain(C2, ∅, D) = Ĩ({C2}; D) = 0.5162319
Gain(C3, ∅, D) = Ĩ({C3}; D) = 0.1187528
Gain(C4, ∅, D) = Ĩ({C4}; D) = 0.5362271


Next, we calculate the gain ratio Gain_Ratio(a, B, D) of each attribute:

Gain_Ratio(C1, ∅, D) = Gain(C1, ∅, D) / H̃({C1}) = 0.19556396
Gain_Ratio(C2, ∅, D) = Gain(C2, ∅, D) / H̃({C2}) = 0.31164397
Gain_Ratio(C3, ∅, D) = Gain(C3, ∅, D) / H̃({C3}) = 0.16571818


Gain_Ratio(C4, ∅, D) = Gain(C4, ∅, D) / H̃({C4}) = 0.32315529

C4 has the maximum gain ratio, so we choose C4, let B = {C4}, and continue to select other attributes.

Gain_Ratio(C1, {C4}, D) = Gain(C1, {C4}, D) / H̃({C1}) = 0.0833209
Gain_Ratio(C2, {C4}, D) = Gain(C2, {C4}, D) / H̃({C2}) = 0.0936883
Gain_Ratio(C3, {C4}, D) = Gain(C3, {C4}, D) / H̃({C3}) = 0.0

C2 now has the maximum gain ratio, so we choose C2, let B = {C4, C2}, and continue to select other attributes.

Gain_Ratio(C1, {C4, C2}, D) = Gain(C1, {C4, C2}, D) / H̃({C1}) = 0.1285770
Gain_Ratio(C3, {C4, C2}, D) = Gain(C3, {C4, C2}, D) / H̃({C3}) = 0.0

C1 now has the maximum gain ratio, so we choose C1, let B = {C4, C2, C1}, and continue to select other attributes.

Gain_Ratio(C3, {C4, C2, C1}, D) = Gain(C3, {C4, C2, C1}, D) / H̃({C3}) = 0.0

Therefore, after the fuzzy rough set attribute reduction, the attributes that influence breast cancer are {C4, C2, C1}, that is, {Resistin, Glucose, Age}.

4.2 Data Discretization

In this paper, according to the routine blood information table used in physical examinations, referring to related papers [4] and consulting doctors and experts, the cut-off values for discretizing the attribute ranges are determined: Age (A1: 20–40, A2: 41–65, A3: ≥66), Glucose (B1: <70, B2: 70–108, B3: >108), and Resistin (C1: <18.57, C2: 18.75–28.07, C3: >28.07) (Table 2).
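
As an illustration of this transformation rule, the following sketch maps raw values to the symbols above (the handling of exact boundary values is an assumption, since the printed cut-offs leave it open):

def discretize(age, glucose, resistin):
    # map raw physical-examination values to the symbolic ranges of Sect. 4.2
    a = 'A1' if age <= 40 else ('A2' if age <= 65 else 'A3')
    b = 'B1' if glucose < 70 else ('B2' if glucose <= 108 else 'B3')
    c = 'C1' if resistin < 18.57 else ('C2' if resistin <= 28.07 else 'C3')
    return a, b, c

For example, discretize(24, 88, 6.85) returns ('A1', 'B2', 'C1').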

4.3 Rule Extraction


Assume a minimum support minsup = 20% and a minimum confidence minconf = 60%. Scanning the transaction database yields the candidate itemsets C1 = {A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2}; filtering out those that do not meet the minimum support gives the frequent itemsets L1 = {A1, A2, B2, B3, C1, C3, D1, D2}. L1 self-joins to give the candidate itemsets C2, which are filtered to give the frequent itemsets L2 = {{A1, C1}, {D1, C1}, {A2, C1}, {D2, C3}, {B2, C3}, {D2, A2}, {A2, B2}, {D2, B2}}; iterating gives L3 = {{D2, B2, C3}, {D2, A2, B2}}, after which no further candidate sets can be generated. Rules could be extracted from the frequent itemsets L2, but with only one conditional attribute in the antecedent, such rules are of little significance. Extracting

Table 2. Data table after discretization

Age Glucose Resistin Classification


1 A1 B1 C1 D1
2 A2 B3 C1 D1
3 A2 B2 C1 D2
4 A1 B2 C3 D2
5 A3 B3 C2 D2
6 A2 B2 C3 D2

rules from the frequent itemsets L3 instead, nine rules can be obtained. As shown below, the association rules with the patient class as the consequent are all strong association rules with 100% confidence.

{'Resistin-C3', 'Glucose-B2'} --> frozenset({2})
{'Age-A2', 'Glucose-B2'} --> frozenset({2})

5 Experimental Results and Analysis


The experimental data is the breast cancer data in the UCI data set, with 116 records and ten attributes. Using the attribute reduction method of this paper, the reduced attribute set contains eight attributes: {Glucose, Resistin, Age, BMI, Adiponectin, MCP-1, Leptin, Insulin}. After consulting relevant papers and doctors to discretize these attributes, the minimum support is set to 20% and the minimum confidence to 60% for rule extraction. Besides the rules obtained in the above example, the following association rules with the patient class as the consequent were obtained:

{'MCP-1-H3', 'Glucose-B2', 'A2'} --> frozenset({2})
{'Insulin-D2', 'Glucose-B2', 'A2'} --> frozenset({2})
{'Insulin-D2', 'MCP-1-H3', 'A2'} --> frozenset({2})
{'Adiponectin-F2', 'Glucose-B2', 'A2'} --> frozenset({2})
{'MCP-1-H3', 'A2', 'Adiponectin-F2'} --> frozenset({2})
{'Insulin-D2', 'Adiponectin-F2', 'A2'} --> frozenset({2})
{'Resistin-C1', 'MCP-1-H3', 'A2'} --> frozenset({2})
{'Insulin-D2', 'MCP-1-H3', 'Glucose-B2', 'A2'} --> frozenset({2})
{'MCP-1-H3', 'Adiponectin-F2', 'Glucose-B2', 'A2'} --> frozenset({2})

According to the above rules, age and the blood values of MCP-1, glucose, resistin, and adiponectin all have significant effects on breast cancer. Middle-aged women aged 41–65 are more likely to suffer from breast cancer than women of other ages. When blood glucose is within the normal range but resistin is on the high side, the risk of breast cancer is high; the same holds when adiponectin and glucose are normal for middle-aged women but MCP-1 in the blood is high. Therefore, more attention should be paid to women aged 41–65: when MCP-1 and resistin are on the high side in physical examination results, a doctor should be consulted in time, so that early detection and treatment are possible.

6 Conclusion

In this paper, fuzzy rough set attribute reduction deletes the attributes that have no influence on breast cancer, which improves the accuracy of the subsequent rules, reduces the computation of rule extraction, and improves the efficiency of the experiments. Using the Apriori algorithm to extract association rules between routine blood data and breast cancer allows prevention to be strengthened according to physical examination results, which is very practical.

Acknowledgment. The authors sincerely thank the research group members for their efforts on this paper, and thank the editors and reviewers for their valuable comments.

References
1. Bourouis, A., Feham, M., Hossain, M.A.: An intelligent mobile based decision support system
for retinal disease diagnosis. Decis. Support Syst. 59, 341–350 (2014)
2. Wang, R., Xuan, Y.: A research on the improvement of dual optimization on BP neural
network. IEEE Computer Society (2017)
3. El-alfy, E.M., Alshammari, M.A.: Towards scalable rough set based attribute subset selection
for intrusion detection using parallel genetic algorithm in MapReduce. Simul. Model. Pract.
Theory 64, 18–29 (2016)
4. Sun, M., Zhao, S., Duan, Y.: GLUT1 participates in breast cancer cells through autophagy
regulation. Naunyn Schmiedebergs Arch. Pharmacol. 394, 205–216 (2021)
5. Hu, Q., Yu, D., Xie, Z.: Information-preserving hybrid data reduction based on fuzzy-rough
techniques. Pattern Recogn. Lett. 27, 414–423 (2006)
6. Dai, J., Xu, Q.: Attribute selection based on information gain ratio in fuzzy rough set theory
with application to tumor classification. Appl. Soft Comput. 13, 211–221 (2013)
7. Wang, W., Xu, L., Shen, C.Y.: Effects of traditional Chinese medicine in treatment of breast
cancer patients after mastectomy: a meta-analysis. Cell Biochem. Biophys. 71, 1299–1306
(2015)
8. Song, J., Lyu, Y., Wang, M.: Treatment of human urinary kallidinogenase combined with
maixuekang capsule promotes good functional outcome in ischemic stroke. Front. Physiol.
9, 84 (2018)
9. Lee, T.T.: An information-theoretic analysis of relational databases-part I: data dependencies
and information metric. IEEE Trans. Softw. Eng. 13, 1049–1061 (1987)
10. Miao, D., Hu, G.: A heuristic algorithm for knowledge reduction. Comput. Res. Dev. 36,
681–684 (1999)

11. Jia, P., Dai, J., Pan, Y., Zhu, M.: Novel algorithm for attribute reduction based on mutual-
information gain ratio. J. Zhejiang Univ. 40, 1041–1044 (2005)
12. Hu, Q., Yu, D., Xie, Z.: Information-preserving hybrid data reduction based on fuzzy-rough
techniques. Pattern Recogn. Lett. 27, 414–423 (2006)
13. Hu, Q., Yu, D., Xie, Z.: Fuzzy probabilistic approximation spaces and their information
measures. IEEE Trans. Fuzzy Syst. 14, 191–201 (2006)
Multi-task Model for Named Entity Recognition

Dequan Zheng2(B) , Baishuo Yong1 , and Jing Yang1


1 Harbin University of Commerce, Harbin 150028, China
2 Institute of System Engineering, Harbin University of Commerce, Harbin 150028, China

dqzheng@hrbcu.edu.cn

Abstract. Although models based on deep learning have significantly improved Natural Language Processing (NLP), compared with real conversation scenes, Natural Language Understanding (NLU) tasks are still limited by training data and training tasks. In this paper, we propose a Bi-directional Long Short-Term Memory network model with multi-task learning (Multi-Bi-LSTM + CRF) for classification and named entity recognition, which combines multiple training tasks for NLU. Multi-Bi-LSTM + CRF uses a large amount of cross-task data and benefits from the regularization effect, so that word vectors obtain a more general representation and NLU tasks are understood more deeply. Based on the Bidirectional LSTM + CRF model (Bi-LSTM + CRF), we propose a new neural network model that simultaneously carries out multi-task learning of intention recognition and named entity recognition, focusing on named entity recognition training. In addition, we use different optimizers to control and update the task parameters to adapt to the losses. Compared with the baseline models, our proposed model achieves better results.

Keywords: Bi-LSTM · Multi-task learning · Conditional random field

1 Introduction
If computers are to communicate with humans without obstacles, they need to understand the structural characteristics of human language. Natural language understanding (NLU) aims to analyze and process human language, parsing a sentence or paragraph with certain rules, and is the foundation for other tasks such as relationship extraction, machine translation, and question answering systems. Named entity recognition (NER) is a basic task in natural language processing (NLP) [2]. The basic task of NER is to intercept one or more short sub-segments in a given information segment (audio or text), where the intersection between sub-segments is empty. Suppose we regard text as the carrier of information, representing the speaker's intention and attitude. Then the types of entities vary with the speaker and intention; for example, the speaker's entities usually change with location, time, emotion, etc. Human beings can naturally judge which parts of a sentence are entities and know why, but it is difficult for computers to use speaker information beyond the information fragments themselves. For example, in the texts "Wang Wu goes to see the sunrise at five o'clock today" and "Wang Wu says the sunrise will be at five o'clock today", if no other processing is done, the two texts may yield the same entities, namely [0, 'Wang Wu', 1, 'sunrise']; similarly, in "welcome to Nanjing Yangtze River Bridge", it is impossible to identify whether it refers to the "Yangtze River Bridge" of the "mayor of Nanjing" or the "Nanjing Yangtze River Bridge". Without understanding the speaker's intention or background, a correct judgment is difficult. To solve these practical problems, we next introduce a sequence-based neural network model for named entity recognition.
This paper proposes a joint multi-task training approach to construct a named entity recognition system that takes the speaker's intention into account. The proposed multi-task learning model does not aim to improve the accuracy of intention recognition. Different from the work of Liu et al. [3], we build on it by designating a main task and an auxiliary task, expecting the model to learn complementary information shared between similar tasks in service of the main task. Considering that the proportion of intention recognition in multi-task learning would otherwise be too large, and in contrast to Tatsuya et al. [4], who control the proportion of tasks with weights, we explore improving the balance by updating the parameters of each task with different optimizers and learning rates. In conclusion, the main contribution of this paper is to show how our neural network model deals with multiple factors affecting entity recognition.

2 Related Work

In this part, in order to review and summarize more comprehensively, we introduce the work on intention recognition, named entity recognition, and multi-task learning in natural language processing models.

Intention Recognition: Intention recognition aims to judge the speaker's intention from a sentence or a document: given a sentence or document, it is marked with a pre-defined label. In this paper, the data set is oriented to question answering about the Winter Olympic Games. Early methods were mainly based on rules [5] and statistics [6], but these methods required extensive human and material resources and could not capture deeper language information. With the rise of deep learning, many researchers turned to neural networks and word embeddings [7]. Kim et al. [8] constructed a simple Convolutional Neural Network (CNN) [9] on top of pre-trained word vectors; still, the fully connected structure of the CNN model is redundant and inefficient, and it ignores understanding at the semantic level. Wang et al. [10] proposed Convolutional Recurrent Neural Networks for text classification, combining CNN to extract local features with LSTM [11] to extract global features, giving a deeper understanding of sentence information; however, the feature extraction is too uniform to achieve good results at the global level. Later, some scholars tried Gated Recurrent Unit (GRU) [12] networks for text classification. In particular, the development of pre-training models such as Transformers [13] and BERT [14] in the last two years has greatly improved model performance and refreshed scores on many NLP tasks, but the massive model parameters and ever-deepening network layers have blurred the interpretability of the models. In short, all of these pave the way for future development.

Named Entity Recognition: Named entity recognition aims to identify subjects with specific meaning in an input text, such as person names, place names, organization names, etc. Entity recognition is divided into character level, phrase level, and text level, depending on the task requirements. It is usually used as the first step of natural language understanding for many subsequent NLP tasks, such as entity linking, relationship extraction, and question answering, so its performance directly affects those tasks. Huang et al. [1] proposed using bidirectional long short-term memory networks for named entity recognition in 2015, adding conditional random fields to model the label dependencies. Subsequently, Chiu et al. [15] added a CNN to learn the character-level features of words, significantly improving effectiveness on person names and place names, though with no significant effect on organization names or nested entities spanning multiple words. At the same time, Yin et al. [16] used word-character mixing to learn more semantic features but ignored global semantic information.

Multi-task Learning: For a given text or sentence, the two most effective methods for learning its vector space representation are multi-task learning and language pre-training models. Some of the knowledge learned from an auxiliary task helps the main task. For example, a person who studies English can learn German at the same time; some of the knowledge of German can be applied to English learning and helps in mastering English. Similarly, joint learning of multiple tasks is useful: for example, Liu et al. [3] used the multi-task deep neural network (MT-DNN) model to learn multiple tasks and achieved excellent results on 8 of 10 NLU tasks. Multi-task learning can also reduce the over-fitting caused by over-adapting to a specific task, acting as a form of regularization; thus, what multi-task learning learns is common among tasks.

3 Model
3.1 Overview
Our model treats named entity recognition and intention classification as classification tasks implemented by two classifiers, respectively; through multi-task learning, named entity recognition and intention classification are learned at the same time. Multi-task learning usually involves several similar tasks because they can share information to improve each other's performance. However, our multi-task method aims to improve named entity recognition while leaving the performance of intention classification unaffected, which differs from general multi-task learning.

Our model is based on the Bi-LSTM + CRF model [1]. It includes a Bi-LSTM, a CRF, and a linear layer to solve named entity recognition and intention classification. In our experiments, the CRF layer is used for named entity recognition and the linear layer for intention classification. The architecture is shown in Fig. 1: the lower layers are shared by all tasks, while the upper layers produce task-specific outputs. The input $X = \{t_0, t_1, \ldots, t_{n-1}, t_n\}$ is a sequence of tokens of length n. First, word embedding vectors are obtained from the word embedding matrix, giving $X^{emb} = \{t^{emb}_0, t^{emb}_1, \ldots, t^{emb}_{n-1}, t^{emb}_n\}$; these then enter the Bi-LSTM encoder, which captures the contextual representation of each word and yields the contextual word embeddings $X^{lstm} = \{t^{lstm}_0, t^{lstm}_1, \ldots, t^{lstm}_{n-1}, t^{lstm}_n\}$. This representation is shared across the multi-task training targets. In the following, we elaborate on the model.

Fig. 1. The Bi-directional Long Short-Term Memory with multi-task learning model

Bi-LSTM Encoder: We use a bidirectional LSTM network to map the input representation vectors into a sequence of context embedding vectors $X^{emb} \in \mathbb{R}^{n \times m}$. This part of the representation is shared across the different tasks. The bidirectional LSTM uses both the preceding and the following hidden states when computing the hidden layer during forward and backward propagation:

$$h_t = \left[\overrightarrow{h_t} ; \overleftarrow{h_t}\right] = \left[\mathrm{LSTM}\!\left(x_t, \overrightarrow{h}_{t-1}\right) ; \mathrm{LSTM}\!\left(x_t, \overleftarrow{h}_{t+1}\right)\right] \qquad (1)$$

where $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$ are the forward and backward hidden states and $x_t$ is the vector of one unit.

Intention Classification Output: We take the last vector $t^{lstm}_n$ of $X^{lstm}$ for intention classification; the probability that $t^{lstm}_n$ is labeled as class $c$ after passing through a linear layer is

$$P\!\left(c \mid t^{lstm}_n\right) = \mathrm{softmax}\!\left(W^{T}_{intent} \cdot t^{lstm}_n\right) \qquad (2)$$

where $W^{T}_{intent}$ is the task-specific classifier parameter matrix of dimension $W^{T}_{intent} \in \mathbb{R}^{2024 \times 13}$.

Named Entity Recognition Output: After the final vector representation $X^{lstm}$ is obtained from the Bi-LSTM, it enters the CRF layer. The last layer is a linear-chain conditional random field (CRF), parameterized as shown in the following formulas; the CRF adopts a linear-chain structure and uses the label information before and after a position to predict the current label:

$$P(y|x) = \frac{1}{Z(x)}\exp\left(\sum_{i,k}\lambda_k t_k(y_{i-1}, y_i, x, i) + \sum_{i,l}\mu_l s_l(y_i, x, i)\right) \qquad (3)$$

$$Z(x) = \sum_{y}\exp\left(\sum_{i,k}\lambda_k t_k(y_{i-1}, y_i, x, i) + \sum_{i,l}\mu_l s_l(y_i, x, i)\right) \qquad (4)$$

where $s_l$ is a state feature function related only to the current node, extracting the features of the current node; $t_k$ is a context-dependent transition feature function extracting information about neighboring nodes; and $\lambda_k$ and $\mu_l$ are their weight coefficients. The CRF layer outputs the label sequence with the greatest probability.
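
As an illustration, the following is a minimal PyTorch sketch of the shared-encoder architecture described above, using the pytorch-crf package for the CRF layer; the layer sizes, tag count, and class name are assumptions made for the example, not the authors' released code.

import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf

class MultiBiLSTMCRF(nn.Module):
    # shared Bi-LSTM encoder with a CRF head for NER and a linear head for intent
    def __init__(self, vocab_size, emb_dim=300, hidden=1024, n_tags=10, n_intents=13):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(0.5)
        self.tag_proj = nn.Linear(2 * hidden, n_tags)       # emissions for the CRF
        self.intent_proj = nn.Linear(2 * hidden, n_intents)
        self.crf = CRF(n_tags, batch_first=True)

    def forward(self, tokens, tags=None, intents=None):
        h, _ = self.lstm(self.emb(tokens))                  # shared contextual encoding
        h = self.dropout(h)
        losses = {}
        if tags is not None:   # NER loss: negative CRF log-likelihood
            losses['ner'] = -self.crf(self.tag_proj(h), tags)
        if intents is not None:  # intent loss: cross entropy on the last hidden state
            logits = self.intent_proj(h[:, -1, :])
            losses['intent'] = nn.functional.cross_entropy(logits, intents)
        return losses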

3.2 Training

In the multi-task learning stage, we use mini-batch stochastic gradient descent (SGD) to learn all shared layers, the CRF layer, and the entity recognition layer, and use the Adam algorithm to learn all shared layers and the intention classification layer, as shown in Algorithm 1. In each epoch, a mini-batch $b_t$ is selected, and the loss of the corresponding task is computed to update the model. Let $X = \{T_0, T_1, \ldots, T_{n-1}, T_n\}$ be a given input sequence and $\theta$ the model parameters; the parameters $\theta$ are trained based on the loss of each task, e.g., (5).

Intention Classification: For the classification task, the cross-entropy loss is used as the objective. If the correct label is $y$ and the linear-layer classification output of the model is $\hat{y}$, the loss function of intention recognition is

$$L_{intent} = -\left[y\log\hat{y} + (1 - y)\log(1 - \hat{y})\right] \qquad (5)$$

Entity Recognition: In CRF training, the loss function needs only two scores: the score of the real path and the total score of all possible paths. Among the scores of all possible paths, the proportion of the real path score gradually increases during training. Our goal is to maximize the proportion of the real path, defining

$$\mathrm{LossFunction} = \frac{P_{RealPath}}{P_1 + P_2 + \cdots + P_N} \qquad (6)$$

Since the total path score remains unchanged while the real path score is optimized, taking the negative logarithm gives the final loss function

$$L_{ner} = \log\!\left(e^{S_1} + e^{S_2} + \cdots + e^{S_N}\right) - S_{RealPath} \qquad (7)$$



Algorithm 1: Training a Multi-Bi-LSTM + CRF model.

Initialize the model parameters θ randomly, including the embedding encoder and the LSTM encoder.
Set the maximum number of epochs: epoch_max.
Prepare the data for the T tasks.
for epoch in 1, 2, ..., epoch_max do
  1. Shuffle D
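
A minimal sketch of the per-step updates with the two optimizers of Sects. 3.2 and 4.2 (illustrative, not the authors' code; it assumes the MultiBiLSTMCRF sketch shown earlier, with SGD covering the shared and NER/CRF layers and Adam covering the shared and intent layers):

import torch

model = MultiBiLSTMCRF(vocab_size=20000)  # hypothetical vocabulary size
ner_params = (list(model.emb.parameters()) + list(model.lstm.parameters())
              + list(model.tag_proj.parameters()) + list(model.crf.parameters()))
intent_params = (list(model.emb.parameters()) + list(model.lstm.parameters())
                 + list(model.intent_proj.parameters()))
opt_ner = torch.optim.SGD(ner_params, lr=0.015)           # optimizer-2 (SGD)
opt_intent = torch.optim.Adam(intent_params, lr=0.0001)   # optimizer-1 (Adam)

def train_step(tokens, tags, intents):
    # update the NER path with SGD
    opt_ner.zero_grad()
    model(tokens, tags=tags)['ner'].backward()
    torch.nn.utils.clip_grad_norm_(ner_params, 5.0)       # gradients clipped to 5
    opt_ner.step()
    # update the intent path with Adam
    opt_intent.zero_grad()
    model(tokens, intents=intents)['intent'].backward()
    torch.nn.utils.clip_grad_norm_(intent_params, 5.0)
    opt_intent.step()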

4 Experiments

4.1 Datasets

This section briefly describes the data sets used in this experiment. The data set was constructed by Harbin University of Technology and annotated for the Winter Olympics. There are 19,891 named entity recognition examples and 17,010 intention recognition examples. We selected the 17,010 overlapping examples to form a new data set, in which each example consists of three parts: the input sentence, the entities, and the intention. The new data set is divided into 11,010 training examples, 3,000 development examples, and 3,000 test examples.
There are 13 kinds of intentions, namely “ask the mascot”, “ask the participating
countries” and “ask the competition venues of the Olympic Games”, “Ask about the
events of the Olympic Games”, “ask about the sub events of a certain category”, “ask
about the first place of a sub event”, “ask about the second place of a sub event”, “ask
about the third place of a sub event”, “ask about the achievements of an athlete”, “ask
what competitions an athlete has participated in”, “ask about which Winter Olympics
an athlete has participated in”, “Ask about the competitions held in the venues”, “ask
about the number of Winter Olympic Games held in the venues”.

4.2 Implementation Details

Our model code is implemented with PyTorch 1.5. Adam and SGD are optimizer-1 and optimizer-2, with learning rates lr1 = 0.0001 and lr2 = 0.015, respectively. The dimension of the LSTM hidden state is dimension_size = 1024, the word vector dimension is embedded_size = 300, the keep probability of the dropout layer is d = 0.5, appropriate learning rates are adopted for the different optimizers, the batch size is batch_size = 8, and the maximum number of training epochs is 10. To prevent gradient explosion, the gradient norm is clipped to 5.

4.3 Results

We evaluate the quality of named entity recognition with the recall rate (the proportion of correctly labeled Winter Olympic entities among the labeled entities), the precision rate (the proportion of correctly labeled Winter Olympic entities among the predicted entities), and their harmonic mean. The experimental results are shown in Table 1, where p is the precision, indicating how many of the samples predicted positive are truly positive, r is the recall, indicating how many of the positive examples in the sample are found, and F1 is their harmonic mean.

Comparing the experimental results of the named entity recognition models based on Bi-LSTM and Bi-LSTM + CRF, we find that the effect of named entity recognition is greatly improved after adding the CRF layer, indicating that the CRF can indeed learn the dependencies between tags from the training data and improve the accuracy of label prediction.
Comparing the Bi-LSTM + CRF results with and without character-word mixing, we find that the model improves markedly after incorporating word information, indicating that integrating character information into the word vectors can effectively alleviate the data sparsity problem in the training data and enrich the meaning of the word vectors, improving the effect of named entity recognition.

Comparing the word-mixing model with the model that additionally integrates CNN-extracted character information, the experimental results show that adding a Convolutional Neural Network gives a slight further improvement, showing that the window-sized context features extracted by the CNN are beneficial to the named entity recognition model.
Analyzing the experimental results of the multi-task learning named entity recognition model against the other four existing models, we find that after adding multi-task hybrid training, the named entity recognition effect is effectively improved: the model obtains the best score on Precision and is close to the best scores on Recall and F1, which shows that multi-task learning can indeed learn complementary information from the data and improve the accuracy of label prediction.

Table 1. Experimental results with different models

Model p r F1
Bi-LSTM 80.25 81.06 80.65
Bi-LSTM+CRF 93.99 91.78 92.87
Bi-LSTM+CRF+char 95.47 95.65 95.56
Bi-LSTM+CRF+char+cnn 95.98 95.39 95.68
Multi-Bi-LSTM+CRF 96.45 94.57 95.50

5 Conclusion
This paper proposes a Multi-Bi-LSTM + CRF model that integrates multi-task learning with named entity recognition to better understand language features. Multi-Bi-LSTM + CRF achieves excellent results and demonstrates strong generalization ability against four baselines: Bi-LSTM, Bi-LSTM + CRF, Bi-LSTM + CRF + char, and Bi-LSTM + CRF + char + CNN. Many areas remain to be explored to improve the Multi-Bi-LSTM + CRF model, including a deeper understanding of the shared structure and the loss-weight relationship between tasks, training more correlated tasks, or using fine-tuning and pre-training models. Finally, we also want to verify whether the Multi-Bi-LSTM + CRF model is resilient against adversarial attacks.

References
1. Huang, Z., Xu, W., Yu, K.: Bidirectional LSTM-CRF models for sequence tagging. arxiv:
1508.01991 (2015)
2. Wen, Y., Fan, C., Chen, G., Chen, X., Chen, M.: A survey on named entity recognition. In:
Liang, Q., Wang, W., Liu, X., Na, Z., Jia, M., Zhang, B. (eds.) CSPS 2019. LNEE, vol. 571,
pp. 1803–1810. Springer, Singapore (2020). https://doi.org/10.1007/978-981-13-9409-6_218
3. Liu, X., He, P., Chen, W.: Multi-task deep neural networks for natural language understanding.
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
(2019)
4. Ide, T., Kawahara, D.: Multi-task learning of generation and classification for emotion-aware
dialogue response generation. In: Proceedings of the 2021 Conference of the North American
Chapter of the Association for Computational Linguistics: Student Research Workshop (2021)
5. Ramanand, J., Bhavsar, K., Pedanekar, N.: Finding suggestions and ‘buy’ wishes from product
reviews. In: Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches
to Analysis and Generation of Emotion in Text, pp. 54–61. Association for Computational
Linguistics, Stroudsburg, PA (2010)
6. Li, X., Dan, R.: Learning question classifiers: the role of semantic information. Nat. Lang.
Eng. 12(3), 229–249 (2015)
7. Bhatta, J., Shrestha, D., Nepal, S.: Efficient estimation of Nepali word representations in
vector space. J. Innov. Eng. Educ. 3, 71–77 (2020)
8. Kim, Y.: Convolutional neural networks for sentence classification. arXiv https://arxiv.org/
abs/1408.5882 (2014)
9. Technicolor, T., Related, S.: ImageNet classification with deep convolutional neural networks.
Commun. ACM 60(6), 84–90 (2017). https://doi.org/10.1145/3065386
10. Wang, R., Li, Z., Cao, J.: Convolutional recurrent neural networks for text classification. In:
International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, July 14–19
2019. https://doi.org/10.1109/IJCNN.2019.8852406
11. Zhou, C., Sun, C., Liu, Z.: A C-LSTM neural network for text classification. Comput. Sci.
1(4), 39–44 (2015)
12. Chung, J., Gulcehre, C., Chok, H.: Empirical evaluation of gated recurrent neural networks
on sequence modeling. https://arxiv.org/abs/1412.3555 (2014)
13. Vaswani, A., Shazeer, N., Parmar, N., et al.: Attention is all you need. In: 31st Conference on
Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA (2017)
14. Devlin, J., Chang, M., Lee, K.: BERT: pre-training of deep bidirectional transformers for
language understanding https://arxiv.org/abs/1810.04805 (2018)

15. Chiu, J., Nichols, E.: Sequential Labeling with Bidirectional LSTM-CNNs, pp. 937–940.
https://anlp.jp/proceedings/annual_meeting/2016/pdf_dir/D6-2.pdf
16. Yin, Z., L., X.Z., Huang, D.: Chinese named entity recognition ensembled with character. J.
Chin. Inf. Process. (2019)
Research on Non-pooling Convolutional Text
Classification Technology Combined
with Attention Mechanism

Hui Li1,2 , Zeming Li1 , Wei Zhao1 , and Xue Tan1(B)


1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. Text classification is a relatively basic and important application technology in natural language processing. Although classification methods based on traditional machine learning algorithms and manually annotated feature engineering have achieved certain results, their efficiency is not ideal. In recent years, the application of deep learning to natural language processing tasks has received extensive attention. In this paper, the structure of the traditional convolutional neural network model is modified and an attention layer is introduced to complete the text classification task. The feasibility of the model is analyzed, providing a new research direction for the application of neural networks in the field of natural language processing.

Keywords: Deep learning · Text classification · Convolutional Neural Network ·


Attention mechanism

1 Introduction
As an effective means of information management and application, text classification assigns texts to different categories according to their content and topics. In recent years, realizing text classification through deep learning has become the mainstream approach: deep learning combines low-level features into abstract high-level features for the text classification task [1]. Text classification is used on many occasions, such as automatically classifying work orders, providing personalized news [2], and analyzing customers' intentions and moods [3].
With scholars’ continuous in-depth research on text classification technology at home
and abroad, many classification algorithms have emerged since artificial intelligence
has become a hot topic. The current more popular text classification methods can be
roughly divided into traditional machine learning-based text classification methods such
as Naive Bayes [4], random forest [5] model, support vector machine (SVM) [6], and
so on. The other is a text classification method based on deep learning. Most of the deep
learning models commonly used in text classification tasks are based on neural networks,

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


A. E. Hassanien et al. (Eds.): BIIT 2021, LNDECT 107, pp. 234–244, 2022.
https://doi.org/10.1007/978-3-030-92632-8_23
Research on Non-pooling Convolutional Text Classification Technology 235

including Convolutional Neural Networks (CNN), Recurrent Convolutional Neural Net-


works (RNN), and Long and Short-term Memory Networks (Long Short-Term Memory,
LSTM). Among them, the convolutional neural network is the earliest because it can
effectively learn the corresponding features from many samples to simplify the feature
extraction operation. At the same time, due to its characteristics, it greatly reduces the
complexity of the model. The Text CNN text classification model [7] proves that CNN
can be used for text classification tasks and obtain classification results that are not
inferior to other machine learning and deep learning. After the convolutional neural
network structure was proposed, researchers discussed whether the structure was fixed.
In response to this problem, Ruderman et al. first suggested that the pooling layer is not
necessary [8], and Tu et al. [9] also created a PCNN (Pure CNN) model for convolutional
neural networks without a pooling layer. The study of the word segmentation method
proved that the model has good stability. In the comparison experiment between the
pooling and non-pooling layers, the training result of the convolutional neural network
with the pooling layer is not as good as the convolutional neural network without the
pooling layer. Therefore, it is also proved that the convolutional neural network without
a pooling layer is better than the traditional convolutional neural network with a pool-
ing layer in training time and accuracy. The attention mechanism has been favored by
researchers for its excellent performance in acquiring features, and this technology has
been introduced to tasks related to the field of natural language processing [10]. What
made the attention mechanism shine in natural language processing was the transformer
model [11] proposed by the Google team in 2017. This model is the first completely
attention-based model of sequence transformation.
Given the above research, this paper proposes a variant of the convolutional neural network, namely a convolutional neural network without a pooling layer combined with an attention mechanism, called PA-CNN (Pure and Attention CNN), and applies it to the Chinese text classification task. The second chapter introduces the data preprocessing for Chinese text classification tasks. The third chapter describes in detail the structure of the non-pooling convolutional neural network model combined with the attention mechanism. The fourth chapter verifies the effectiveness of the model by experiments. Finally, the experimental results are summarized and a conclusion is drawn.

2 Data Processing for Chinese Text Classification Tasks


Text classification is a very classical problem in the field of natural language processing. In the early days, it was handled by expert systems; later, methods based on machine learning gradually emerged. The advantage of these methods is that they can solve the classification problem quickly, but they also bring new problems, such as low efficiency and limited coverage and accuracy. This stage is characterized by artificial feature engineering. The training process of a text classifier is shown in Fig. 1, where the feature engineering task consisting of text preprocessing, feature extraction, and text representation is the most important and time-consuming part of the entire text classification task.
The traditional text classification method mainly includes the two stages of text segmentation and stop word removal in Chinese text preprocessing.

Fig. 1. Classifier training process

The commonly used text representation method converts the text into one-hot encoding and uses TF-IDF and its extensions for feature extraction. This approach is suitable only for datasets containing a small amount of data, because under normal circumstances the corpus contains over a million items, and one-hot encoding would make the vector matrix excessively high-dimensional and sparse, leading to the curse of dimensionality. Meanwhile, the relationship between contexts is ignored, because one-hot encoding takes only the two values 0 and 1. Therefore, in recent years, word vector technology has gradually replaced the traditional text representation. In 2013, Mikolov et al. proposed word2vec [12], based on prediction: the core idea of the CBOW method is to use the context of the current word to predict the target word, while the Skip-gram method is the opposite, using the current word to predict the words in its context.
Following the literature [7], the fastText model for fast text classification [13] appeared. This model is based on the CBOW model; its biggest features are that the model is simple, training is fast, and it can handle tasks with many samples and many category labels.
This article uses Microsoft’s open-source word2vec tool for word vector training.
The Skip-gram method in word2vec is used for word vector pre-training after removing
stop words from the Chinese text dataset by the Jieba package. The Skip-gram model
trains semantic embeddings by predicting target words in context and also captures
semantic relationships between words. Assuming that the length of a sentence S is N,
the entire text can be vectorized into the following formula:

S = x1 , x2 , x3 , · · · · · · , xN (1)
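
A minimal sketch of this preprocessing and pre-training pipeline, using jieba for segmentation and gensim's Word2Vec for Skip-gram training (the concrete corpus, stop-word list, and hyperparameters here are assumptions for illustration):

import jieba
from gensim.models import Word2Vec

docs = ["今天天气很好", "明天去看比赛"]   # hypothetical raw documents
stopwords = {"很"}                        # hypothetical stop-word list

# segment each document and drop stop words
sentences = [[w for w in jieba.lcut(d) if w not in stopwords] for d in docs]

# Skip-gram (sg=1) pre-training of 300-dimensional word vectors
model = Word2Vec(sentences, vector_size=300, sg=1, window=5, min_count=1)
vector = model.wv[sentences[0][0]]        # embedding of the first token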

The text classification model uses a non-pooling convolutional neural network combined with the attention mechanism. Zhu previously proposed a convolutional neural network classification algorithm based on the attention model [14] and verified the combination of attention and CNN; however, that work still used the traditional convolutional neural network structure, holding that the pooling layer indirectly reduces the number of parameters in the next layer, increases computation speed, and reduces overfitting. Based on reference [15], this paper discards the pooling layer of the existing CNN structure and uses the attention mechanism in its place to complete the feature extraction operation.

3 Model Structure
In 2014, Kim proposed the TextCNN model, the first application of CNN in natural language processing. After that, text classification models with CNN as the main structure emerged one after another, but because CNN could not capture the correlation of positional information, it did not receive much attention in natural language processing. To improve the accuracy and efficiency of Chinese text classification, this paper proposes a non-pooling convolutional neural network combined with an attention mechanism, called PA-CNN (Pure and Attention CNN). In the attention layer, the important features of each sentence are obtained by extracting the maximum weight of each dimension of the sentence features to complete the classification task. The model is mainly composed of an input layer, a convolution layer, an attention layer, and a fully connected layer. The model structure is shown in Fig. 2, and the components of the model are introduced below.

Fig. 2. PA-CNN model structure diagram

3.1 Input Layer


The input layer can accept Chinese news text from the training set, and the text informa-
tion is converted to word vector form for computer recognition. The length of the news
text is inconsistent. In order to facilitate operation, this paper adopts fixed length. If the
text is too long, the interception operation is used. If the text is too short, the filling is
performed.
In addition to the necessary text segmentation, stop-word removal, and text-length fixing, a more important operation is converting the text data into vector form. The traditional vectorization method is one-hot encoding, a technique for vectorizing natural language through discrete feature values: N-bit registers correspond to N states, and exactly one bit is 1, so each word has a single valid code [15]. However, this encoding is arbitrary, captures no relationship between words, and its vector dimension is determined by the number of words in the training set. If the dataset is large, the vector matrix becomes very sparse and the curse of dimensionality occurs. To solve this problem, this paper adopts the word2vec method, whose main idea is to express each word as a trained vector of a chosen dimension. In this way, a vector representation of the data input to the convolution layer is obtained.
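A minimal sketch of the fixed-length operation described above, assuming token ids and a pad id of 0 (the function name and usage example are illustrative; seq_len=600 matches the sequence length in Table 2):

def to_fixed_length(token_ids, seq_len=600, pad_id=0):
    if len(token_ids) >= seq_len:
        return token_ids[:seq_len]  # truncation for texts that are too long
    return token_ids + [pad_id] * (seq_len - len(token_ids))  # padding for short texts

print(to_fixed_length([5, 12, 7], seq_len=8))  # [5, 12, 7, 0, 0, 0, 0, 0]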

3.2 Convolutional Layer


The convolution layer is mainly responsible for feature extraction and feature dimension-
ality reduction. After the text data passes through the convolution layer, the output result
is several different feature maps. In order to complete the extraction of local features,
each neuron in the convolutional layer adopts a local connection method to facilitate the
feature extraction of different convolution kernels. In terms of weight, the convolutional
layer assigns the same weight to the same neuron of the word vector to ensure that the
same word vector will not be affected by the spatial position.
Suppose the input matrix is M × N dimensional and is convolved by a W × W convolution kernel with stride D; the input data is zero-padded by P when the remaining features are not enough for the convolution operation. The dimension of the feature map obtained by the convolution operation is:

\frac{M - W + 2P}{D} \ast \frac{M - W + 2P}{D}   (2)
It has long been considered that a convolutional neural network cannot capture contextual connections, yet the context of news text is highly relevant. To improve the accuracy of feature extraction, Zhu [14] used convolution kernels of different sizes to obtain three feature maps of different sizes. Since one-dimensional convolutional neural networks are better suited to processing text sequences, this article adopts this structure in the experiments.
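The following numpy sketch illustrates one-dimensional convolution over a word-vector matrix, using "valid" convolution without padding, for which the standard feature-map length is (M − W)/D + 1 (sizes follow Table 2; the helper is illustrative, not the authors' implementation):

import numpy as np

def conv1d(x, kernel, stride=1):
    # x: (seq_len, dim) word-vector matrix; kernel: (w, dim) filter.
    seq_len, dim = x.shape
    w = kernel.shape[0]
    out_len = (seq_len - w) // stride + 1
    return np.array([np.sum(x[i * stride:i * stride + w] * kernel)
                     for i in range(out_len)])

x = np.random.randn(600, 64)     # one padded sentence
kernel = np.random.randn(5, 64)  # one of the 256 convolution kernels
print(conv1d(x, kernel).shape)   # (596,): one feature map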

3.3 Attention Layer


The attention mechanism has attracted extensive attention since it was put forward.
Especially after 2017, the number of papers on attention mechanisms has increased
rapidly. The application field has gradually expanded from the initial computer vision
direction to other artificial intelligence fields such as natural language processing, and
good results have been achieved. For example, Wu et al. applied the attention mechanism
to complex ocean scenes combined with natural language processing related technolo-
gies to realize the automatic translation of complex ocean scene detection images [16].
Another example is the intracranial hemorrhage image segmentation technique proposed by Zhang et al. [17], which combines dense connections and attention mechanisms. These studies fully prove that the attention mechanism is highly adaptable across various fields of artificial intelligence and has a wide range of applications. At present, the attention mechanism is a commonly used mechanism for modeling long-term memory
in natural language processing. The principle of the attention mechanism is based on the "encoder-decoder" structure (Fig. 3). Let the input data be X = (x_1, x_2, ..., x_n), the output data be Y = (y_1, y_2, ..., y_n), and let S_1, ..., S_n denote the decoder's hidden states. After encoding, the input data X becomes the intermediate semantics Mid, so the encoding process can be expressed as Mid = En(x_1, ..., x_n). The decoding process decodes the intermediate semantics Mid and can be expressed as y_n = De(y_1, ..., y_{n-1}, Mid). It follows that the intermediate semantics Mid and the first n − 1 output values are needed when solving the output y_n, and the intermediate

semantics used when calculating the output of each item remains unchanged. Therefore,
this traditional coding structure has two disadvantages: first, if there are noise words
in the input sequence that are weak in distinguishing ability or not helpful for text
classification, it will affect the intermediate semantics and then affect the classification
ability. Secondly, because the value of y1 will influence the subsequent output solution,
the order of the word input sequence is very important.

Fig. 3. Encoder-decoder structure

To make up for the shortcomings of the encoder-decoder framework, Bahdanau et al. [10] proposed a model based on the attention mechanism. As shown in Fig. 4, unlike a plain RNN, the output of the attention mechanism does not depend only on the hidden state of the last input: each input also affects the output with a different weight α_{i,j} [18]. Features that impact the text classification are assigned larger weights, while weights are reduced for features with less distinguishing significance. The attention mechanism adjusts

Fig. 4. Encoder-decoder structure with the attention mechanism



the weight by continuously training the input data, and finally achieves the purpose of
improving the classification effect. This operation can further refine the feature words
after the convolutional layer operation to simplify the model parameters.
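A minimal sketch of attention replacing pooling as the feature extractor, assuming a simple dot-product scoring against a trainable context vector (an illustrative parameterization, not necessarily the authors' exact one):

import numpy as np

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

def attention_pool(features, query):
    # features: (steps, dim) convolution outputs; query: (dim,) context vector.
    scores = features @ query   # relevance score of each position
    weights = softmax(scores)   # alpha-style attention weights
    return weights @ features   # weighted sum instead of max/average pooling

feats = np.random.randn(596, 256)   # feature maps from the convolutional layer
query = np.random.randn(256)        # trainable context vector
print(attention_pool(feats, query).shape)  # (256,): sentence representation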

3.4 Full Connection Layer

In this model, the output of the attention layer is transmitted to the fully connected layer as input. To improve the performance of the convolutional neural network, the fully connected layer generally uses the ReLU activation function, while the softmax function is generally used for multi-classification problems. The softmax function, also known as the normalized exponential function, generalizes the two-class sigmoid function to multi-classification. Its role is to output the multi-classification results as probability values; its key property is that each predicted probability is non-negative and the probabilities sum to 1. The calculation formula is:

P(y|x) = \frac{\exp(W_y \cdot x)}{\sum_{c=1}^{C} \exp(W_c \cdot x)}   (3)
c=1

Here W_y \cdot x takes the y-th row of W and multiplies that row with x:

W_y \cdot x = \sum_{i=1}^{d} W_{yi} x_i   (4)

All scores f_c = W_c \cdot x are computed for c = 1, ..., C, and the softmax function is then applied to obtain the normalized probability.
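Formulas (3) and (4) transcribe directly into numpy; the max-subtraction below is a standard numerical-stability step not stated in the text:

import numpy as np

def class_probabilities(W, x):
    # W: (C, d) weight matrix; x: (d,) feature vector from the previous layer.
    scores = W @ x                  # f_c = W_c . x for c = 1..C (formula 4)
    scores = scores - scores.max()  # numerical stability
    p = np.exp(scores)
    return p / p.sum()              # formula (3)

W = np.random.randn(10, 256)        # 10 news categories
x = np.random.randn(256)
print(class_probabilities(W, x).sum())  # 1.0: probabilities sum to one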

4 Analysis of Results
4.1 Experiment Preparation

The dataset used in this article was generated by Tsinghua University from historical data of the Sina News RSS subscription channels from 2005 to 2011. It contains a training set of 50,000 samples, a validation set of 5,000 samples, a test set of 10,000 samples, and a vocabulary of 5,000 words. The texts cover ten categories: sports, finance, real estate, home furnishing, education, technology, fashion, current affairs, games, and entertainment.
The experimental configuration is shown in Table 1.

Table 1. Experimental device

Lab environment Information


OS Windows 10
CPU Intel i5-7300HQ 2.50 GHz
GPU NVIDIA GeForce GTX 1050
Development language Python

The experimental model parameters are shown in Table 2.

Table 2. Model parameter

Word vector dimension 64


Sequence length 600
Number of categories 10
Number of convolution kernels 256
Size of convolution kernels 5
Vocabulary size 5000
Dropout percentage 0.75
Learning rate 1e−3
Batch size 64
Epoch 100

4.2 Experimental Results


Accuracy, precision, recall, and training duration are used as evaluation indicators, explained as follows:

(1) Accuracy: the ratio of correct predictions to all predictions, i.e.:

accuracy = \frac{TP + TN}{TP + TN + FP + FN}   (5)

The training time and accuracy of this model are shown in Fig. 5.
(2) Precision: correctly predicted positive samples divided by all samples predicted as positive, i.e.:

precision = \frac{TP}{TP + FP}   (6)

Fig. 5. Training accuracy and time

(3) Recall: correctly predicted positive samples divided by all actual positive samples, i.e.:

recall = \frac{TP}{TP + FN}   (7)

(4) F1 value (F1-score): the harmonic mean of precision and recall, i.e.:

F1 = \frac{2 \times recall \times precision}{recall + precision}   (8)
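Formulas (5)-(8) can be computed directly from the confusion counts of one class; the counts below are illustrative:

def metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # formula (5)
    precision = tp / (tp + fp)                          # formula (6)
    recall = tp / (tp + fn)                             # formula (7)
    f1 = 2 * recall * precision / (recall + precision)  # formula (8)
    return accuracy, precision, recall, f1

print(metrics(tp=970, fp=20, tn=8980, fn=30))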

For each category, 1,000 samples are taken from the test data, giving 10,000 samples in the test set in total. The experimental results are shown in Table 3.

Table 3. Experimental result

Precision Recall F1-score Accuracy


Sports 0.99 0.99 0.99 0.97
Finance 0.97 0.98 0.97 0.96
Real estate 1.00 1.00 1.00 1.00
Home furnishing 0.94 0.83 0.88 0.88
Education 0.93 0.87 0.88 0.89
Technology 0.88 0.93 0.90 0.90
Fashion 0.89 0.95 0.92 0.88
Current affairs 0.88 0.94 0.91 0.92
Games 0.97 0.94 0.95 0.94
Entertainment 0.97 0.97 0.97 0.96

The experimental results show that the model in this paper achieves the highest accuracy on the text classification task, up to 94.68%; compared with the CNN in Zhu's paper [14], the accuracy of the attention model has increased by 2.7%. At the same time, the training time of the latter under the same word vector dimension is 1 h, while the training time of the model in this paper is 38 min, nearly doubling the speed. Combined with the analysis of the experimental results in Tu's paper [9], it can be concluded that the pooling layer in the convolutional neural network structure has a limited effect when processing text classification problems: the training time of the model with the pooling layer is four times that of the model without it. Furthermore, because pooling reduces dimensionality to extract features, its role is similar to that of the attention layer; the coexistence of both may cause some relatively important content to be discarded as noise, which harms classification accuracy. Therefore, when dealing with text classification problems, the convolutional neural network without the pooling layer can effectively improve training efficiency while ensuring classification accuracy.

5 Conclusion

Based on the traditional convolutional neural network model and previous research
results, this paper proposes a reconstruction of the convolutional neural network structure
by deleting the pooling layer and introducing the attention layer after the convolutional
layer. Experiments on the Chinese news dataset show that the model proposed in this paper can improve the accuracy of text classification tasks. It
can be seen that although the convolutional neural network has always been considered
unable to capture the contextual and sequential characteristics of text information, the
attention layer can make up for this shortcoming and achieve similar effects as RNN and
the LSTM. At the same time, it is shown that the pooling layer is unnecessary in the convolutional neural network model structure: removing it has almost no effect on the accuracy of the experimental results, and the model without a pooling layer even improves training efficiency to some degree.
neural network is not fixed. It can be combined with other technologies to change its
internal structure to obtain better performance, providing new directions and ideas for the
further research of neural networks. Through experiments, it can be concluded that the
attention mechanism as a feature extractor can extract features more efficiently than the
pooling operation and can be used in the feature selection stage of the deep learning task.
Therefore, in fields such as image recognition and natural language processing, attention mechanisms combined with relevant models can be considered to obtain the required results more efficiently and accurately.

Acknowledgments. This work is supported by the Natural Science Foundation of Heilongjiang Province of China (No. YQ2020G002) and the University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (No. UNPYSCT-2020212).

References
1. Wan, J., Wu, Y.: Review of text classification research based on deep learning. J. TianJin
Univ. Technol. 37(02), 41–47 (2021)

2. Antonellis, I., Bouras, C., Poulopoulos, V.: Personalized news categorization through scalable
text classification. In: Zhou, X., Li, J., Shen, H.T., Kitsuregawa, M., Zhang, Y. (eds.) APWeb
2006. LNCS, vol. 3841, pp. 391–401. Springer, Heidelberg (2006). https://doi.org/10.1007/
11610113_35
3. Cai, X., Lou, J.: Sentiment analysis of telecom official micro-blog users based on LSTM deep
learning model. Telecommun. Sci. 33(12), 136–141 (2017)
4. Li, D.: The study of Chinese text categorization based on Naive Bayes. Hebei University
(2011)
5. Sun, Y., Li, Y., Bian, Y.: Application of random forest algorithm for book subject classification.
Comput. Technol. Dev. 30(06), 65–70 (2020)
6. Yan, P., Tang, W.: Chinese text classification based on improved BERT. 33(07), 108–110+112
(2020)
7. Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the
2014 Conference on Empirical Methods in Natural Language Processing, EMNLP, Doha,
Qatar, pp. 1746–1751 (2014)
8. Ruderman, A., Rabinowitz, N.C., Morcos, A.S., et al.: Pooling is neither necessary nor
sufficient for appropriate deformation stability in CNNs. arXiv (2018)
9. Tu, W., Yuan, Z., Yu, K.: Convolutional neural networks without pooling layer for Chinese
word segmentation. Comput. Eng. Appl. 56(02), 120–126 (2020)
10. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align
and translate. Computer Science (2014)
11. Vaswani, A., Shazeer, N., Parmar, N., et al.: Attention is all you need. In: Advances in Neural
Information Processing Systems, pp. 5998–6008 (2017)
12. Mikolov, T., Corrado, G., Chen, K., et al.: Efficient estimation of word representations in
vector space. In: Proceedings of the International Conference on Learning Representations,
ICLR, Scottsdale, Arizona, USA, pp. 1–12 (2013)
13. Joulin, A., Grave, E., Bojanowski, P., et al.: Bag of tricks for efficient text classification.
In: Proceedings of the 15th Conference of the European Chapter of the Association for
Computational Linguistics: Volume 2, Short Papers (2017)
14. Zhu, M.: Research and implementation of Chinese text classification algorithm based on
machine learning. Beijing University of Posts and Telecommunications (2019)
15. He, K.: Research and application of text classification based on natural language processing.
Nanjing University of Posts and Telecommunications (2020)
16. Wu, M., Wen, L., Sun, M.: Attention mechanism image understanding algorithm of ocean scene. Comput. Eng. Appl. 1–11. http://kns.cnki.net/kcms/detail/11.2127.TP.20210317.1008.002.html. Accessed 11 July 2021
17. Zhang, P., Xu, Z., Hu, P.: Segmentation of intracranial hemorrhage fusing dense connection
and attention mechanism. J. Chin. Mini-Micro Comput. Syst. 42(07), 1458–1463 (2021)
18. Liu, T., Zhu, W., Liu, G.: Advances in deep learning based text classification. Electr. Power
ICT 16(03), 1–7 (2018)
Network Security and Blockchain
A New Credit Data Evaluation Scheme Based
on Blockchain and Smart Contract

Yu Wang1,2 , Zheqing Tang3 , Peng Hong4 , Tianshi Wei4 , Shiyu Wang6 ,


and Yong Du5,6(B)
1 Harbin University of Commerce, Harbin 150001, People’s Republic of China
2 Dongbei University of Finance and Economics, Dalian 116025, People’s Republic of China
3 Heilongjiang Polytechnic, Harbin 150001, People’s Republic of China
4 Lanzhou Jiaotong University, Lanzhou 730000, People’s Republic of China
5 Tianjin University, Tianjin 300072, People’s Republic of China
6 Northeast Agricultural University, Harbin 150001, People’s Republic of China

Abstract. The traditional model of storing academic information data in colleges and universities is centralized, which can cause data loss, forged resumes, and false academic records. At the same time, rational use of the decentralized characteristics of blockchain can effectively control the robustness of credit data from a global perspective. This paper proposes a partially decentralized credit data management scheme based on blockchain technology to promote its practical implementation in the education field. The stability of the blockchain data storage structure, the identity transparency of the proof-of-authority mechanism, and the authenticity of event-driven trust data provided by smart contracts can effectively ensure that the entire life cycle of the information data is recorded publicly and securely. Experiments show that this scheme can effectively solve the problems caused by the traditional storage model.

Keywords: Blockchain · Smart contract · Decentralization · Education record · Certificate validation

1 Introduction

The credit data records students’ primary data during college, including course tran-
scripts, exceptional certificates, graduation certificates, rewards, and punishment infor-
mation. The credit data can help students continue their education or seek jobs. However, in the traditional data storage model, the data is often centrally processed and stored on the internal platform of educational institutions. The credit data will be lost to varying degrees when the central node fails. In addition, attackers can easily steal data by attacking the central node.
According to a worldwide report, falsification of academic data in the job market is rampant: millions of forged academic documents and certificates are produced every year [1]. Enterprises consequently suffer many losses from unqualified employees. A survey by CareerBuilder showed that 33% of respondents had forged their certificates [2]. With


the increasing number of advanced talents flooding into society and the limitation of the
verification ability of enterprises or colleges, it will be more difficult for the centralized
management model to deliver accurate credit data. The database is only for internal
staff of educational institutions to view and access, and there is no interoperability for
students [3]. In this paper, combined with these problems, we confirmed that the main
reason is the centralized storage of credit data.
The emerging blockchain technology proposed by Satoshi Nakamoto can effectively
fill this gap [4]. Blockchain technology can provide a trusted platform for exchanging
value with the chained storage structure, the mathematical encryption algorithm, and
the consensus mechanism. The chain structure and encryption algorithms ensure user
privacy and data security, and the consensus mechanism provides a foundation for trust
in a decentralized environment. Blockchain technology has also been applied to varying
degrees in the field of education.
The University of Glasgow has built an intelligent campus system based on the Ethereum blockchain platform to store students' academic information such as grades and degrees [5]. The University of Maribor has built EduCTX, a higher education credit and grading platform based on the open-source Ark blockchain platform, which supports recording and managing students' scores [6]. The University of Texas at Dallas used blockchain
technology to manage students’ academic records, including transcripts, digital badges,
certificates, recommendation letters, award records, and other information [7]. The Bit-
Degree platform provided in-service employees with learning opportunities, records
and verifies information such as test scores, homework, and problem-solving skills, and
supports the payment of course fees and application for scholarships with Bitcoin [8].
Despite the above successful cases, the following points are still not considered in the existing studies: i) the selection of the consensus mechanism, and ii) whether a fully decentralized model is appropriate. Take the PoW (Proof-of-Work) consensus mechanism adopted by most existing studies as an example: if all student users on the chain agree on a branch chain of forged data, the data becomes easier to forge than in the traditional model. In PoW, when 51% of the nodes synchronize to the ledger that maximizes their profit, the entire network updates to that ledger. Since student nodes make up far more than 51% of the network, they could choose a branch chain that embellishes their credit data, and the universities would be unable to repair the blockchain. Moreover, because of the complexity of the data, PoW requires much more computing power than the traditional model.
To solve these problems, this paper proposes a credit data storage scheme based
on blockchain, PoA (Proof-of-Authority) consensus mechanism, and smart contract by
combining centralization and partial decentralization ideas. This scheme aims to provide a reference for the application of blockchain in various sub-fields of education and promote the actual implementation of such projects.
The remaining sections are arranged as follows: Sect. 2 introduces the theoretical
knowledge used in this paper. Section 3 describes the specific process of the whole
scheme. Section 4 is the experimental analysis. Section 5 is the summary of this paper.

2 Technical Background
The blockchain is a new distributed database that integrates multiple technologies, where
the core characteristics are distributed data storage, data tamper-proofing, and data trace-
ability. "Distributed data storage" means that the data is not subject to centralized control and does not rely on third-party intermediaries. "Data tamper-proofing" means that each block records a group of transaction states organized in a tree built with a hash algorithm; the generated hash value is the unique identifier of the block, and the transaction data of the block can be stored permanently. "Data traceability" means that historical block information can be traced from the current height through the chained storage structure [9].

2.1 Chain Storage Structure


With its chained data storage structure, the blockchain architecture is tamper-proof and forgery-proof compared with traditional databases. This structure binds the blocks to each other and ensures a binding relationship between the data. Each block is composed of a block header and a block body [10]. The block body contains a set of transactions of a specified number from the blockchain network, and all transactions in the block are encapsulated into a root hash string through a Merkle tree and a hash algorithm. The Merkle tree is essential for verifying whether the original data has been changed in a distributed network; it is responsible for recording, updating, and querying the state of the event data.
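As a minimal Python sketch of this encapsulation, the following code builds a Merkle root from a list of transactions with SHA-256; duplicating the last node at odd levels is one common convention, not necessarily the one used by any specific chain:

import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(transactions):
    # transactions: list of raw transaction bytes; returns the root hash.
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node if the level is odd
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

txs = [b"course:math:92", b"award:school-level", b"certificate:CET-6"]
print(merkle_root(txs).hex())  # unique identifier of this transaction set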

2.2 Proof-of-Authority
The consensus mechanism drives peer nodes to maintain the distributed database by
establishing a unified system and rewarding them. The PoA (Proof-of-Authority) consensus mechanism is a permission-based consensus mechanism in which validator identities must be publicly verified. Ordinary nodes are only responsible for initiating transactions, not verifying them. In PoA, nodes obtain bookkeeping rights in an orderly manner instead of relying on computing power to compete for them, which effectively reduces energy consumption and increases the speed of block generation.
The working steps of the PoA consensus mechanism are as follows: i) the founder
of the blockchain sets rules and specifies a set of authorized nodes in the genesis block
according to these rules, ii) when the transaction set reaches a specific number, the set of
authorized nodes digitally signs the transaction set and then broadcasts the transaction
set to the blockchain network. The network randomly marks one of the authorized nodes in each block as "in turn" and the rest as "no turn". The mining difficulty of the authorized nodes with the status of "in turn" is lower than that of other nodes. Their packaged
blocks have priority access to the blockchain, and iii) according to the root hash value,
the remaining nodes should calculate whether the address of the authorization node
responsible for the block packing is a member of the authorization set in this round of
mining. If the above steps are correct, then the block is legal, and the full network node
updates the distributed ledger [11].
In addition, nodes applying to join the authorization set publish broadcast messages in the blockchain network and are voted on by all authorized nodes, as sketched below. If the number of authorized nodes that agree exceeds 50%, the node is added to the authorization set, and the set is synchronized and updated across the network. The same applies to deleting authorization nodes.
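The following Python sketch illustrates this majority vote for admitting a node to the authorization set (all names are illustrative):

def vote_on_candidate(authorized: set, candidate: str, approvals: set) -> bool:
    # approvals: the authorized nodes that voted in favor of the candidate.
    valid = approvals & authorized        # only authorized nodes may vote
    if len(valid) > len(authorized) / 2:  # more than 50% must agree
        authorized.add(candidate)
        return True
    return False

committee = {"admin1", "teacher1", "teacher2"}
print(vote_on_candidate(committee, "teacher3", {"admin1", "teacher2"}))  # True
print(sorted(committee))  # teacher3 is now in the authorization set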

2.3 Smart Contract


A blockchain-based smart contract is a state-based, event-driven digital contract designed to replace traditional trusted third-party intermediaries. Once deployed, it cannot be modified and is automatically triggered by events as long as the contract participants meet the preset conditions specified in the contract.
In the contract generation stage, the contract participants make smart contracts based
on paper contracts with the assistance of experts in related fields. After the contract is
standardized, experts perform formal verification to ensure the accuracy of the contract
terms. Then, the smart contract is published to the blockchain network after being tested
on the virtual machine (or container) without error. The data in the triggered event
is packaged into the block after it is verified to be correct, and the unfinished event
continues to wait to be triggered. The smart contract will be destroyed after all the
provisions specified in the smart contract are triggered [12].

3 Credit Data Evaluation Program


This section elaborates the credit data evaluation scheme based on the basic theory in Sect. 2. The unreasonable credit data management model is the root cause of resume falsification, certificate falsification, and other phenomena. To this end, our overall plan takes the proper storage of credit data as its starting point to prevent these phenomena.

3.1 The Process of Evaluation Program


As shown in Fig. 1, we control the cost overhead and ensure the controllability of the credit data in two ways: i) the PoA consensus mechanism is adopted as the maintenance criterion of the distributed network, and ii) a specific teacher node acts as the central node for issuing smart contracts. The specific teacher node is equivalent to a teaching administrator who has senior control over credit data in traditional data management. Smart contracts issued by any node other than the specific node are not recognized, which further ensures the authenticity of data storage. It should be noted that the database in this paper is still maintained in a decentralized mode; centralization is only a means to ensure authenticity.
The specific implementation process is as follows:

1) First, identify the member nodes of the initial PoA consensus committee, whose authorized members are the educational administration nodes with high-level authority and the general teacher nodes. The committee members vote on student nodes that meet the preset conditions in the follow-up process.
2) According to the categories of credit data, member nodes with corresponding permissions are responsible for the corresponding smart contracts. For example, school-level awards can only be published and recorded by the more senior academic administration node.
3) As credit events are continuously triggered, the PoA committee nodes maintain the credit data.
4) The high-level authority node grants identification certificates to students who meet the graduation conditions.
5) After obtaining permission from the permission node, an enterprise can access the full life cycle of a student's credit data in school.

Fig. 1. Overall credit data evaluation program.

3.2 Smart Contract Algorithm

In the section, we will describe the main smart contract algorithms involved in storing
credit records, certificate management, and external access to certificates.
The address of the advanced permission node is seNode_address, the address list of the teacher nodes to be authorized is teaListNode_address, the function triggered by the event is authorize_tea, and permission is the permission type of the authorization. As shown in Algorithm 1, permission can be granted to low-privilege nodes through high-privilege nodes to further ensure the authenticity of the trust data; any smart contract issued by unauthorized nodes is regarded as invalid. While the initiator msg.sender of the current contract is seNode_address and not all addresses have been granted permission (k != n), the event emit(authorize_tea(teaListNode_address)) is triggered for each authorization.

1 Initialize teaListNode_address and the number of authorized nodes n.


2 Create a mapping rule authorize(address=>address) between the permissions of the
advanced permission node and the node to be authorized.
3 Initialize the number of authorized nodes k=0.
4 While (k!=n && msg.sender==seNode_address) do
5 authorize[teaListNode_address].permission=true;
6 k=k+1;
7 emit(authorize_tea(teaListNode_address));
8 End While
Algorithm 1. A smart contract of authorization.

As shown in Algorithm 2, the credit data entry contract is deployed only once for the current grade and remains in effect for the duration of that grade. The contract is not destroyed until all credit data recording events of the current grade have been triggered and completed. This avoids the repeated deployment of smart contracts with similar functionality, which further reduces the Gas cost. The address list of the nodes that have permission is allRight_address, where allRight_address = teaListNode_address ∪ seNode_address. The data is defined according to the type of the credit data.

1 Initialization the deployment time of contract y, and the deployment duration k=0.
2 Initialize the total number of credit events v_m that the current node is responsible
for entering, and the number of credit events that have been entered n=0.
3 Initialization the list of the credit data types responsible for the corresponding author-
ized node tealistnode_address.data_name.
4 Establish the mapping rule data_variety(address=>uint) between the permission
node and the credit data categories.
5 Establish the mapping rule data_property(uint=>type(data))between the types of
credit data and their attributes.
6 If (k!=y) Then
7 While (msg.sender==seNode_address) do
8 If (n!=v_m) Then
9 For (i=1 to M)
10 data_variety[seNode_address].data_property[i].data=input_creditData[i];
11 emit(luru(input_creditData[i]));
12 n=n+1;
13 End for
14 End if
15 End while
16 k=k+1;
17 End if
18 If (k==y) Then
19 selfdestruct(seNode_address) // Destruction of the smart contract
20 End if
Algorithm 2. Credit data entry contract.

As shown in Algorithm 3, if a student meets the preset conditions for each year throughout the school period, the student is deemed to meet the graduation requirements, and the corresponding high-level authority node grants a certificate signed with a unique identifier. When the conditions are met, the smart contract is triggered automatically and the signed certificate is issued. For records with modification history, the advanced permission node reviews them and decides whether to sign. The contract is deployed only once for students of the same grade.

1 Initialize the number of qualified student credit data entry contracts n=0.
2 Initialize the modification times m of students’ academic information.
3 Initialize the total number of credit data entry contracts to be completed by students k.
4 Initialize the attributes of graduation conditions stu[address[num]].graduation.
5 Initialize the number of all students in the current grade M.
6 For (i=1 to M)
7 If (k==n&&m==0&&msg.sender==seNode_address) Then
8 stu[address[i]].graduation = true;
9 emit(sign(msg.sender));
10 End if
11 If (k==n&&m!=0&&msg.sender==seNode_address) Then
12 stu[address[i]].graduation = input[i];
13 emit(sign(msg.sender));
14 End if
15 End for
Algorithm 3. Graduation signature contract.

4 Experimental Analysis
Based on the theoretical framework in the above sections, we developed a data management system and tested its performance to compare the proposed scheme with the traditional centralized scheme. The experimental environment is Windows 10, 16 GB of memory, and a Core i7 CPU. The entire project is accessible at https://github.com/LitSpirits/credit-data.

Fig. 2. The set of authorized nodes maintains the credit data.



Figure 2 shows the process by which the PoA authorized node set maintains the learning information data. When students or teachers apply to log in, the event is triggered and the transaction request is sent to the blockchain network until the data maintenance is completed and the user enters the system. This means that users must use their own address registered in the client to perform related operations and trigger events. As Fig. 2 also shows, thanks to the characteristics of blockchain, untrusted data can be traced through login records.
As shown in Fig. 3, unauthorized nodes cannot control the credit data in contracts beyond their authority, and the idea of partial decentralization can ensure the security of data in a fully decentralized environment. Compared with the traditional model, which cannot provide effective proof of data, the scheme proposed in this paper helps enterprises easily obtain students' accurate and effective credit data and credit records. Moreover, we show how the management of credential data can provide a reference for other research fields such as certificate anti-counterfeiting verification.

Fig. 3. The permission ID of the unauthorized node does not match the stored data in the blockchain, and the contract cannot be executed.

In summary, centralized data storage means that the data is changed and read through a centralized client; if a hacker attack goes unnoticed, the altered data state is permanently recorded. With the scheme proposed in this paper, as long as a node obtains permission, it can retrieve the required data and write data within its permission. Students only need to obtain permission from others to access their valuable experience and records. It should not be ignored that not all students are willing to disclose their proprietary data fully, and providing such a platform is difficult to achieve in a centralized storage system. A comparison with traditional schemes is shown in Table 1.
We find that the consensus mechanism needs to be selected or improved jointly according to credibility, transaction throughput, and maintenance cost by comparing different consensus mechanisms.

Table 1. The comparison between the proposed scheme and the centralization scheme.

The traditional centralized model   The storage mode mentioned in this article
Anti-attack No Yes
Tamper-proofing No Yes
Data verifiable No Yes
Integrality Yes Yes
E-Maintenance No Yes
Interoperability No Yes

Credibility is the first criterion for assessing consensus mech-
anisms, but its maintenance costs should also be considered. The data is so complex
that the PoW consensus mechanism leads to an exponential increase in costs, consuming far more resources than traditional schemes. In the PoS (Proof-of-Stake) consensus mechanism, restricting maintenance authority by currency holdings makes it difficult for students to obtain profits. In the PoA consensus mechanism, the maintenance cost is lower: the teacher nodes and student nodes jointly maintain the network, which relieves the pressure on teachers when handling the credit data and motivates students to obtain benefits.
The cost of our scheme is mainly related to the updatable state, the amount of data, the complexity of the functions, and the state of the network. The smart contracts inherit from each other, so the Gas cost is higher than deploying a single contract. Nevertheless, this method keeps the Gas cost below the GasLimit stipulated by Ethereum (12 million), which is conducive to further development of the system at a later stage. Deploying the project consumed 7,104,672 Gas, and the actual cost was much lower than the cost of using and maintaining a traditional database; the cost would be further reduced if educational institutions set their own trading rules. We assess transaction throughput by having a transaction generator randomly generate transactions at varying frequencies. Each time, 200–1000 transactions are sent to the network, and the transactions recorded on the blockchain are counted. The stable network transaction throughput, obtained by averaging, ranges between 130–190 TPS for our scheme. The transaction throughput still needs to be improved for complex credit data.

5 Conclusion
Based on a decentralized blockchain system and a partially decentralized smart contract permission control mechanism, the scheme in this paper ensures the authenticity and traceability of credit data and effectively avoids problems such as data loss in the traditional credit data storage model. To promote the practical application of blockchain in education, we have developed a credit data management system that can provide a reference for other sub-fields of education. It can provide credible academic records for the

business. Therefore, the new data evaluation scheme proposed in this paper has certain
practicability. The performance of the consensus mechanism will determine the cost and
the degree of utilization of decentralization. The formal verification of a smart contract
ensures the anti-attack ability of the contract, which protects the rights and interests of
users. We will study the formal verification of smart contracts and the improvement of
consensus mechanisms in the future.

References
1. Turkanović, M., Hölbl, M., Košič, K., et al.: EduCTX: a blockchain-based higher education
credit platform. IEEE access 6, 5112–5127 (2018)
2. CareerBuilder. https://www.careerbuilder.com/share/aboutus/pressreleasesdetail.aspx?sd=8/7/2014&id=pr837&ed=12/31/2014. Accessed 15 July 2021
3. Rasool, S., Saleem, A., Iqbal, M., Dagiuklas, T., et al.: Docschain: blockchain-based IoT
solution for verification of degree documents. IEEE Trans. Comput. Soc. Syst. 7(3), 827–837
(2020)
4. Wang, S., Zhang, Y., Zhang, Y.: A blockchain-based framework for data sharing with fine-
grained access control in decentralized storage systems. IEEE Access 6, 38437–38450 (2018)
5. Rooksby, J., Dimitrov, K.: Trustless education? A blockchain system for university grades 1.
Ubiquity J. Pervasive Media 6(1), 83–88 (2019)
6. Xie, J., Tang, H., Huang, T., Yu, F.R., et al.: A survey of blockchain technology applied to
smart cities: research issues and challenges. IEEE Commun. Surv. Tutor. 21(3), 2794–2830
(2019)
7. Kamišalić, A., Turkanović, M., Mrdović, S., Heričko, M.: A preliminary review of blockchain-
based solutions in higher education. In: Uden, L., Liberona, D., Sanchez, G., Rodríguez-
González, S. (eds.) LTEC 2019. CCIS, vol. 1011, pp. 114–124. Springer, Cham (2019).
https://doi.org/10.1007/978-3-030-20798-4_11
8. BitDegree. https://ipfs.io/ipfs/QmdEnhQcWHTY4ndotRs1YRpeTdjZGAyisHz8buqfYidkP2. Accessed 15 July 2021
9. Mingxiao, D., Xiaofeng, M., Zhe, Z., et al.: A review on consensus algorithm of blockchain.
In: 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), China,
pp. 2567–2572. IEEE (2018)
10. Leng, K., Bi, Y., Jing, L., et al.: Research on agricultural supply chain system with double
chain architecture based on blockchain technology. Future Gener. Comput. Syst. 86, 641–649
(2018)
11. Salah, K., Rehman, M.H.U., et al.: Blockchain for AI: review and open research challenges.
IEEE Access 7, 10127–10149 (2019)
12. Norta, A.: Creation of smart-contracting collaborations for decentralized autonomous orga-
nizations. In: Matulevičius, R., Dumas, M. (eds.) BIR 2015. LNBIP, vol. 229, pp. 3–17.
Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21915-8_1
An Algorithm of Image Encryption Based
on Bessel Self-feedback Chaotic Neural Network

Yaoqun Xu1,2(B) , Meng Tang1 , and Jingtao Fan1


1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, China
xuyq@hrbcu.edu.cn
2 Institute of System Engineering, Harbin University of Commerce, Harbin 150028, China

Abstract. Image encryption protects image information by scrambling digital image data to make the image unrecognizable. Digital images are characterized by a large amount of data and high redundancy. Therefore, based on linear self-feedback, the Bessel function is introduced as the self-feedback term of a chaotic neural network, and a chaotic neural network model with nonlinear Bessel self-feedback is proposed. The dynamic characteristics of the chaotic neurons are analyzed through inverted bifurcation diagrams and Lyapunov exponent evolution diagrams, and the model is applied to image encryption.

Keywords: Chaos · Bessel self-feedback · Image encryption · Single neuron dynamical system

1 Introduction

With the rapid rise of big data, networking and informatization are the general trend of the development and progress of human society. The amount and speed of image data transmitted through networks are enormous. To prevent image data from being intercepted, stolen, or destroyed during transmission, it is particularly important to propose a secure and efficient image encryption algorithm.
In the literature, Chen and Aihara proved that chaotic neural networks could gradually become stable [1] and introduced the chaotic simulated annealing algorithm by exponentially reducing the self-feedback connection weight. They confirmed that the ergodic chaotic search is related to the self-feedback connection term: self-feedback connection terms with different properties produce different chaotic ergodic searches. In this paper, the self-feedback connection term of the Chen-Aihara network is improved [2], and the Bessel function is introduced on the basis of linear self-feedback [3]. Therefore, a chaotic self-feedback neural network based on the Bessel function is proposed. The inverted bifurcation diagrams and Lyapunov exponent evolution diagrams of the chaotic dynamic system are studied, and the network model is applied to a corresponding image encryption algorithm.


2 Chaotic Self-feedback Neural Network with Bessel Function


2.1 Neural Network Model

The Bessel-function chaotic neural network with self-feedback is described as below:

x_i(t) = \frac{1}{1 + \exp(-y_i(t)/\varepsilon_0)}   (1)

y_i(t+1) = k y_i(t) + \alpha \Big[ \sum_{j=1, j \neq i}^{n} w_{ij} x_j(t) + I_i \Big] - z_i(t)\, g(x_i(t) - I_0)   (2)

g(u) = \lambda J_0(cu)   (3)

J_0(cu) = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!\,\Gamma(k+1)} \Big( \frac{cu}{2} \Big)^{2k}   (4)

z_i(t+1) = (1 - \beta) z_i(t)   (5)

In the formulas, at time t, y_i(t) is the internal state of neuron i and x_i(t) is its output; k, in the range (0, 1), represents the ability of a neuron to retain its internal state; ε_0 is the gradient parameter of the excitation function; g(x_i(t) − I_0) is the network's self-feedback term; z_i(t) is the self-feedback connection weight of neuron i; λ is the combination coefficient; c is the expansion coefficient of the Bessel function; and β is the annealing parameter.
When g(u) = u, the self-feedback term is linear, and the above chaotic neural network model degenerates into Chen's model. For this reason, a self-feedback term based on the nonlinear Bessel function is proposed: when g(u) = λJ_0(cu), this nonlinear term makes the system exhibit new chaotic dynamic behavior (Fig. 1).

Fig. 1. The curves of Linear self - feedback and Bessel self – feedback.

2.2 Neuron Model


Compared with Chen's model, using the Bessel function as the self-feedback function of the chaotic neural network changes the sign pattern of the searched state values and can reach the global optimal solution more quickly and effectively. Using the Bessel function as the self-feedback function in image encryption can improve efficiency and encryption performance. The neuron model with Bessel self-feedback is described as below:

x(t) = \frac{1}{1 + \exp(-y(t)/\varepsilon_0)}   (6)

y(t+1) = k y(t) - z(t)\, g(x(t) - I_0)   (7)

g(u) = \lambda J_0(cu)   (8)

J_0(cu) = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!\,\Gamma(k+1)} \Big( \frac{cu}{2} \Big)^{2k}   (9)

z(t+1) = (1 - \beta) z(t)   (10)
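A minimal Python sketch iterating the single-neuron system (6)-(10), using scipy's Bessel function J0 and the parameter values reported in the following paragraph:

import numpy as np
from scipy.special import j0

eps0, I0, k, beta = 0.004, 0.6, 0.2, 0.0003
lam, c = 1.0 / 6.0, 6.0
y, z = 0.8, -0.8  # y(1), z(1)

xs = []
for t in range(9000):
    x = 1.0 / (1.0 + np.exp(-y / eps0))  # formula (6)
    g = lam * j0(c * (x - I0))           # formulas (8)-(9): Bessel self-feedback
    y = k * y - z * g                    # formula (7)
    z = (1.0 - beta) * z                 # formula (10): annealing
    xs.append(x)

print(xs[:3], xs[-1])  # the trajectory used for the bifurcation analysis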
Dynamic Behavior of a Single Neuron Dynamical System (SNDS). In this paper, the parameters are taken as ε_0 = 0.004, I_0 = 0.6, z(1) = −0.8, y(1) = 0.8, k = 0.2, and β = 0.0003; parameter λ is fixed at 1/6 and c is fixed at 6. The bifurcation diagram and the time evolution diagram of the maximum Lyapunov exponent after 9000 iterations are shown in Fig. 2.

Fig. 2. The diagram of Bifurcation and Lyapunov index when β = 0.0003.

The above single-neuron inverted bifurcation diagram and maximum Lyapunov exponent evolution diagram show that this is a chaotic neural network with transient chaotic dynamics, which reflects the chaotic search ability of the network to some extent. Meanwhile, we can change the four parameters of the single-neuron dynamical system to display the dynamic behavior of the chaotic system [4].
Keeping three of the parameters fixed, we can change one parameter to observe its impact on the SNDS dynamic behavior and draw the corresponding inverted bifurcation diagram and Lyapunov exponent evolution diagram (Figs. 3, 4, 5 and 6).

Fig. 3. Bifurcation and the Lyapunov exponents of maximal index diagram of single parameter
x versus z for I0 = 0.6, ε0 = 0.004, k = 0.2.

Fig. 4. Bifurcation and the Lyapunov exponents of maximal index diagram of single parameter
x versus k for I0 = 0.6, ε0 = 0.004, z(1) = −0.8.

Fig. 5. Bifurcation and the Lyapunov exponents of maximal index diagram of single parameter
x versus ε0 for I0 = 0.6, k = 0.2, z(1) = −0.8.

2.3 Enhanced Single Neuronal Dynamic System


Incorporating the SNDS into the framework proposed in [5], the enhanced single neuronal dynamic system (ESNDS) is obtained. It is described as below:

x_i(t) = \frac{1}{1 + \exp(-y_i(t)/\varepsilon_0)}   (11)

Fig. 6. Bifurcation and the Lyapunov exponents of maximal index diagram of single parameter
x versus I0 for k = 0.2, ε0 = 0.004, z(1) = −0.8.

x'_i(t) = x_i(t) \times 2^n - \lfloor x_i(t) \times 2^n \rfloor   (12)

y_i(t+1) = k y_i(t) - z_i(t)\, g(x'_i(t) - I_0)   (13)

The value x'_i(t) is obtained by applying this intermediate calculation to x_i(t). In this paper, n is set to a fixed value of 18.

NIST SP800-22 Test. To demonstrate the robustness of the Bessel self-feedback and its potential application in the image encryption algorithm, the NIST SP800-22 test standard was used [6]. The NIST suite contains 15 statistical tests [7] designed to evaluate random and pseudorandom number generators implemented in hardware or software for cryptographic applications. The tests focus on various types of non-randomness that may exist in a sequence, and some tests can be divided into sub-tests.

The binary numbers used for the test are generated from the value x'_i(t) in formula (12). For each value of x'_i(t), the first 10 decimal digits are discarded and the result is compared with 0.5, as shown in Eqs. (14) and (15):

m_i = (10^{10} \times x_i) \bmod 1   (14)

n_i = \begin{cases} 0, & 0 \le m_i \le 0.5 \\ 1, & 0.5 < m_i \le 1 \end{cases}   (15)
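A direct Python transcription of formulas (14)-(15); floating-point precision makes this illustrative only:

def chaotic_bit(x: float) -> int:
    m = (1e10 * x) % 1.0         # formula (14): drop the first 10 decimal digits
    return 0 if m <= 0.5 else 1  # formula (15): threshold at 0.5

print([chaotic_bit(x) for x in (0.12345678901234, 0.98765432109876)])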

In the test, 15 subsets are considered, and each subset outputs a p-value. If the p-value is greater than 0.01, the sequence is considered uniformly distributed. We tested 100 sets of 1,000,000-bit binary sequences; in this process, the initial parameters are fixed at ε_0 = 0.004, I_0 = 0.6, z(1) = −0.8, y(1) = 0.8, k = 0.2. Since the p-value is greater than 0.01 for each of the fifteen subsets, the sequence is accepted as random [8] (Table 1).

Table 1. NIST statistical test results.

Test no. Subset p-value Proportion Result


1 Freq. 0.997823 100/100 Random
2 Block Freq. 0.249284 100/100 Random
3 Cumulative Sums 0.534146 100/100 Random
4 Runs 0.334538 99/100 Random
5 Lon. Run 0.699313 100/100 Random
6 Rank 0.455937 99/100 Random
7 FFT 0.616305 99/100 Random
8 Non-Over. Temp. 0.987896 100/100 Random
9 Over. Temp. 0.096578 100/100 Random
10 Universal 0.145326 100/100 Random
11 Appr. Entropy 0.051942 99/100 Random
12 Ran. Exc. 0.999360 100/100 Random
13 Ran. Exc. Var 0.979758 100/100 Random
14 Serial 0.798139 99/100 Random
15 Linear Complexity 0.080519 99/100 Random

3 Application to Image Encryption

3.1 Image Encryption Process

Step 1: The original grayscale image is read as an M × N matrix A(M, N), where each element of A corresponds to the pixel value at that point of the image.
Step 2: A chaotic sequence is obtained by iterating the ESNDS, and the four parameters of the Bessel self-feedback chaotic neuron serve as the security key of the chaotic sequence. The chaotic neural network is iterated (M × N + M + N + H) times, the first H values are discarded, and finally a new sequence of length (M × N + M + N) is obtained.
Step 3: Let the first M elements be sequence a, the next N elements be sequence b, and the rest be sequence c. The two sequences are transformed by formula (16):

a = \lfloor a \times M \rfloor + 1, \quad b = \lfloor b \times N \rfloor + 1   (16)

Step 4: Use the calculated sequences a and b to perform row and column permutation of the original matrix A(M, N); the specific permutation method is given in [4].
Step 5: Convert the permuted matrix into a one-dimensional matrix P, and sort P in the order of sequence c to obtain the matrix P'.

Step 6: By Eqs. (17) and (18), the diffused matrix Q is obtained from the matrix P' and the sequence c. Finally, Q is converted into an encrypted image of size M × N.

c' = \lfloor c \times 10^{10} \rfloor \bmod 256   (17)

Q = P' \oplus c'   (18)

Decryption is the reverse of encryption: reversing the encryption process with the correct initial key recovers the original image.
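A compact numpy sketch of Steps 3-6, using argsort-based permutations as one simple way to realize the row/column and c-ordered rearrangements (the paper's exact permutation method is in [4]); the random key stream below stands in for the ESNDS output:

import numpy as np

def encrypt(A, a, b, c):
    M, N = A.shape
    rows = (np.floor(a * M)).astype(int) % M               # from formula (16)
    cols = (np.floor(b * N)).astype(int) % N
    B = A[np.argsort(rows), :][:, np.argsort(cols)]        # row/column permutation
    P = B.flatten()
    P_prime = P[np.argsort(c)]                             # sort P in the order of c
    c_prime = (np.floor(c * 1e10) % 256).astype(np.uint8)  # formula (17)
    return (P_prime ^ c_prime).reshape(M, N)               # formula (18): diffusion

A = np.random.randint(0, 256, (4, 4), dtype=np.uint8)      # toy 4 x 4 image
key = np.random.rand(4 + 4 + 16)                           # M + N + M*N chaotic values
print(encrypt(A, key[:4], key[4:8], key[8:]))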

3.2 Simulation Result


Two different grayscale images are used to simulate the ESNDS and verify the encryption
effect of the encryption method during the simulation. In the experiment, take the initial
value fixed as the ESNDS, and set the annealing parameter β to 0.00004. The original
and encryption images are shown in Fig. 7.

Fig. 7. Result of Simulation. (a) (b) the original images; (c) (d) the encrypted images; (e) (f) the
decrypted images.

3.3 Analysis of Information Entropy


Image information entropy reflects the average amount of information in the selected image as a statistical feature [9]. It is calculated as Eq. (19):

H(x) = -\sum_{i=0}^{n} P(x_i) \log_2 P(x_i)   (19)

P(x_i) represents the proportion of occurrences of gray level x_i, and n represents the number of gray levels of the image. In theory, the information entropy of an encrypted 8-bit image should be close to 8. As Table 2 shows, the encryption changes the information from order to disorder.

Table 2. Information entropy of two different images in original and encrypted.

Image   Original information entropy   Encrypted information entropy
Lena 7.4479 7.9993
Peppers 7.5986 7.9993
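Formula (19) can be computed over the gray-level histogram of an 8-bit image, as in this sketch:

import numpy as np

def image_entropy(img):
    hist = np.bincount(img.flatten(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty gray levels (0 log 0 is taken as 0)
    return float(-(p * np.log2(p)).sum())

img = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
print(image_entropy(img))  # close to 8 for a well-encrypted image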

3.4 Analysis of Histogram

Gray histogram analysis reflects the number of occurrences of each gray value in the image [10], and the distribution reflects the image pixel values. In Fig. 8, the pixel values of the encrypted image are distributed uniformly, which can be used to resist statistical attacks.

Fig. 8. Gray histogram analysis of images in original and encrypted.

3.5 Analysis of Correlation

Correlation analysis eliminates the impact of accidental factors by observing a large amount of digital data. It reflects the degree of correlation between adjacent pixels of an image through the correlation coefficient: the closer the correlation coefficient is to 1, the stronger the correlation between two adjacent pixels. Reducing the degree of correlation reduces the possibility of a statistical image attack [11]. Figure 9 shows the correlation of the original and encrypted images in the horizontal, vertical, and diagonal directions, calculated by the formulas below:

E(x) = \frac{1}{N} \sum_{i=1}^{N} x_i   (20)

D(x) = \frac{1}{N} \sum_{i=1}^{N} (x_i - E(x))^2   (21)

\mathrm{cov}(x, y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - E(x))(y_i - E(y))   (22)

r_{xy} = \frac{\mathrm{cov}(x, y)}{\sqrt{D(x)}\,\sqrt{D(y)}}   (23)
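Formulas (20)-(23) applied to horizontally adjacent pixel pairs (the sampling scheme is illustrative; vertical and diagonal directions work the same way):

import numpy as np

def adjacent_correlation(img):
    x = img[:, :-1].flatten().astype(float)  # each pixel
    y = img[:, 1:].flatten().astype(float)   # its horizontal neighbor
    ex, ey = x.mean(), y.mean()                              # formula (20)
    dx, dy = ((x - ex) ** 2).mean(), ((y - ey) ** 2).mean()  # formula (21)
    cov = ((x - ex) * (y - ey)).mean()                       # formula (22)
    return cov / (np.sqrt(dx) * np.sqrt(dy))                 # formula (23)

img = np.random.randint(0, 256, (128, 128))
print(adjacent_correlation(img))  # near 0 for an encrypted image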

Fig. 9. Correlation analysis images in original and encrypted. (a) (b) (c) the original image; (d)
(e) (f) the encrypted image.

Table 3. Correlation coefficients in the original image and encrypted image.

Horizontal Vertical Diagonal


Original 0.9729 0.9858 0.9595
Encrypted −0.0035 −0.0064 −0.0008

3.6 Analysis of Key Space and Sensitivity

The number of bits usually counts encryption keys. The longer the number of bits,
the bigger the keyspace and the more powerful the resistance to exhaustion, clearly
illustrating the security and efficiency of this encryption algorithm (Table 3).

In this paper, an initial value and four parameters are selected as the key of the encryption algorithm. Other parameters of the Bessel self-feedback chaotic neural network can also be used to expand the key space enough to meet the needs of encryption.
Key sensitivity means that a small change in the initial key causes the key sequence generator or iterative function to generate entirely different keys [12] during the encryption process; this changes the whole encrypted image dramatically and prevents correct decryption, ensuring the security of the algorithm.

4 Conclusion
In this paper, the Bessel function is introduced into the chaotic neural network as the self-feedback term, and the network model exhibits different dynamic behaviors for different parameters. Based on the Bessel self-feedback chaotic neural network, an image encryption algorithm is proposed. Experimental analysis of information entropy, histograms, key space, and correlation verifies the effectiveness and security of the encryption algorithm. It is worth noting that the single-neuron chaotic dynamical system has large development space, and this applicability allows the system to participate in more fields.

Acknowledgement. This work was supported by the Nature Science Foundation of Heilongjiang
Province (LH2021F035).

References
1. Chen, L., Aihara, K.: Chaos and asymptotical stability in discrete-time neural networks.
Neural Netw. 13, 731–744 (2002)
2. Xu, Y., He, S.: Chaotic neural network with self-feedback of trigonometric function and its
application. In: Proceedings of the 2009 China Intelligent Automation Conference (2009)
3. Ye, Y., Xu, Y.: A chaotic neural network with Bessel function. J. Harbin Univ. Commer. (Nat.
Sci.) 29, 561–565 (2013)
4. Xu, X., Chen, S.: Single neuronal dynamical system in self-feedbacked hopfield networks
and its application in image encryption. Entropy 23, 456 (2021)
5. Abd El-Latif, A.A., Niu, X.: A hybrid chaotic system and cyclic elliptic curve for image
encryption. AEU-Int. J. Electron. Commun. 67, 136–143 (2013)
6. Moysis, L., Volos, C.: Modification of the logistic map using fuzzy numbers with application
to pseudorandom number generation and image encryption. Entropy 22, 474 (2020)
7. Meranza-Castillón, M., Murillo-Escobar, M., López-Gutiérrez, R., Cruz-Hernández, C.:
Pseudo random number generator based on enhanced Hénon map and its implementation.
AEU-Int. J. Electron. Commun. 107, 239–251 (2019)
8. Bassham, L.E., III, Rukhin, A.L., Soto, J., Nechvatal, J.R.: Sp 800-22 Rev. 1a. a Statistical Test
Suite for Random and Pseudorandom Number Generators for Cryptographic Applications,
National Institute of Standards & Technology, Gaithersburg, MD, USA (2010)
9. Wu, Y., Zhou, Y., Saveriades, G.: Local shannon entropy measure with statistical tests for
image randomness. Inf. Sci. 222, 323–342 (2013)
10. Jeffrey, J.R., Christopher, C.Y.: High-resolution histogram modification of color images.
Graph. Models Image Process. 57(5), 432 (1995)
11. Behnia, S., Akhshani, A., Mahmodi, H., Akhavan, A.: A novel algorithm for image encryption
based on mixture of chaotic maps. Chaos Solitons Fractals 35, 408–419 (2018)
12. Zhou, Y., Li, S.J.: BP neural network modeling with sensitivity analysis on monotonicity
based spearman coefficient. Chemom. Intell. Lab. Syst. 200, 103977 (2020)
Evaluation of the Effect of Blockchain
Technology Application in the International
Trade

Jinping Zhang(B) and Ying Fan

Harbin University of Commerce, No. 1, Xuehai Street, Songbei District, Harbin, Heilongjiang, China

Abstract. Blockchain technology is widely used in many fields due to its non-
tamper resistance, openness, transparency, and decentralization characteristics.
The application of blockchain technology in international trade mainly includes
five aspects: product traceability, trade payment, customs procedures, trade financ-
ing, and trade supply chain. Evaluation of the effect of blockchain technology
application in the above aspects by a method of analytic hierarchy process (AHP)
shows that the effect on trade payment and trade supply chain is better than others.
At the same time, it has a less obvious effect on product traceability. Currently,
there are still some barriers to applying blockchain technology in the field of
international trade. Blockchain technology still has certain technical risks and
lacks legal supervision. Better application of blockchain technology can be real-
ized by cultivating high-quality blockchain technical talents and improving the
supervision mechanism of blockchain technology.

Keywords: International trade · Blockchain technology · Analytic hierarchy


process

1 Introduction
Blockchain is a distributed ledger with non-tamper resistance, openness, transparency,
and decentralization characteristics, making it widely used in many fields. Blockchain
technology also provides a new impetus for the development of international trade. It
helps to solve some long-standing problems, such as information asymmetry, high
trade communication cost, and high transaction cost [1]. Based on the application of
blockchain technology in international trade, this paper uses the analytic hierarchy pro-
cess (AHP) to evaluate the application effects. Accordingly, it puts forward targeted
policy suggestions to promote the high-quality development of international trade.

2 Blockchain Technology Application in the International Trade


2.1 To Make Product Traceability Identify Authenticity Accurately
Product traceability can realize the tracking and backtracking of commodity information
in a series of supply chain links such as raw material procurement, product production,


storage, sales, and transportation. In case of product quality and safety problems, the
source of the product can be tracked and recalled in time to minimize the loss. Prod-
uct traceability can effectively prevent product fraud, guarantee food safety, improve
product competitiveness in the international market, and promote commodity circula-
tion and export trade [2]. With the improvement of living standards, people have higher
requirements for product quality and pay more attention to environmental protection.
Product traceability can make consumers feel safe to buy products. Therefore, product
traceability plays an important role in international trade, especially as the COVID-19
epidemic situation still exists in most countries in the world. Product traceability can
greatly contribute to the prevention and control of the epidemic and effectively prevent
major public health events [3].
Applying blockchain technology to product traceability in international trade can
solve the problem of product fraud in origin. Traditional anti-counterfeiting technology generally uses two-dimensional codes or bar codes to record information and prevents product fraud by adding coatings or printing inside bottle caps, but it cannot solve the problem of counterfeiting by copying and transferring anti-counterfeiting logos. The
blockchain has the characteristics of decentralization and publicly records all informa-
tion on the account book without management by a third-party organization. All the data
recorded on the blockchain are timestamped and cannot be tampered with. The unique-
ness of a commodity can be ensured after every link from raw material purchase to sales
of a commodity is recorded on the blockchain, and the information of fake commodi-
ties cannot enter the blockchain system [4]. At the same time, blockchain technology
can realize the real-time traceability of commodity information. The information in the
blockchain is formed by packaging the data into blocks, adding timestamps, and finally
forming a chain. Therefore, a commodity’s whole production process can be recorded
in chronological order, and the information on the blockchain will also be arranged in
chronological order. It can be traced in real-time [5]. In product traceability, international
standards such as ISO9000 and ISO14000 can be combined with blockchain technol-
ogy. Blockchain technology records all the information of products on the block. This
information is authentic and cannot be tampered with, promoting export manufacturers
to produce and export products according to international standards and meet the quality
and environmental protection requirements of export products.
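To make the block, timestamp, and chain idea described above concrete, here is a minimal, self-contained Python sketch of a hash-linked traceability ledger; the record fields and the single-node design are illustrative assumptions rather than the architecture of any real trade platform.

```python
import hashlib
import json
import time

def make_block(record, prev_hash):
    """Package a traceability record into a timestamped block linked to its predecessor."""
    block = {"record": record, "timestamp": time.time(), "prev_hash": prev_hash}
    payload = json.dumps(block, sort_keys=True).encode()
    block["hash"] = hashlib.sha256(payload).hexdigest()
    return block

# Record each supply chain link of one commodity batch in chronological order.
chain = [make_block({"event": "raw material purchase", "batch": "A-001"}, "0" * 64)]
for event in ("production", "storage", "transportation", "sale"):
    chain.append(make_block({"event": event, "batch": "A-001"}, chain[-1]["hash"]))

# Tampering with any earlier record breaks every later prev_hash link.
for b in chain:
    print(b["record"]["event"], "->", b["hash"][:12])
```

In a real blockchain, the ledger is additionally replicated across many nodes and updated by consensus, which is what removes the need for a third-party manager.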

2.2 To Increase Efficiency and Lower Risk of Trade Payments

Currently, common payment methods of international trade mainly include remittance, letter of credit, and collection. Remittance has the advantages of simple procedures and low costs, but it is risky: either the buyer or the seller has to bear high risk and an unbalanced capital burden. A letter of credit is a bank credit that is more reliable than commercial credit; however, it has the disadvantages of complicated procedures and high costs and is prone to fraud. Collection can reduce expenses and facilitate financing for the buyer, but the seller has to bear high risk, such as the buyer's dishonor. To sum up, traditional payment methods of international trade have problems with security, efficiency, and cost [6].

Applying blockchain technology to international trade payment can make up for the
shortcomings of traditional international trade payment and promote the high-quality
development of international trade. Firstly, the application of blockchain technology
can reduce the risks of international trade payment. When using blockchain technology
for international trade payment, the ledger is updated through the consensus mechanism,
and the information of each transaction is open and transparent. The transaction data can-
not be tampered with [7]. At the same time, the cryptographic technology of blockchain
can also ensure the security and authenticity of transaction data and reduce the possibility
of illegal operations by both parties. Secondly, the application of blockchain technol-
ogy can improve the efficiency of transactions. Traditional international trade payment
procedures are cumbersome and time-consuming. However, blockchain technology can
realize the information sharing between the buyer and the seller. Their transaction infor-
mation will be published in the payment system, and their information exchange could
be realized on the platform without the participation of a third party [8].
Meanwhile, a trading platform based on blockchain technology can complete direct transactions between the two parties in seconds or hours by simplifying the transaction process. Thirdly, the application of blockchain technology can reduce transaction
costs. Trading platforms based on blockchain technology do not rely on traditional inter-
mediaries to provide credit certificates, which can reduce the related commission fees,
be more convenient to communicate, and reduce the cost of communication.

2.3 To Improve Quality and Lower Cost of Customs Procedures

Customs procedure refers to the procedure for imported goods to pass through the cus-
toms, which usually includes four steps: customs clearance, classification, valuation,
and taxation. In dealing with the documents and records generated in international trade
business, the customs need to invest a lot of money, and the work efficiency is not high
enough. At the same time, because these documents come from different participants,
there is information asymmetry, and fraud may occur.
Using blockchain technology can effectively solve the problems of high cost and low
efficiency in current customs procedures. Firstly, blockchain technology packages the
relevant data involved in the customs procedures into blocks and finally into chains. The
timestamp technology can ensure the authenticity and security of the relevant data in the
transmission and will not be tampered with. It can also help the Customs Department
confirm the goods’ destination and source more accurately and effectively track down
and provide evidence for the goods that may have problems [9]. Secondly, the imple-
mentation of smart contracts can realize the digitalization of trade contracts and bills,
effectively solve file damage and loss problems, reduce labor and information transmis-
sion costs, and simplify the work process [10]. Finally, the decentralized data sharing
model can enhance trust between enterprises and the customs supervision department,
help suppliers realize real-time monitoring of goods, and help the Customs Department complete all work with high quality and efficiency. Applying smart contract technology to
customs tariff management can avoid tax evasion to a certain extent, reduce the burden
of Customs departments, and improve the work efficiency of Customs departments.

2.4 To Enhance Efficiency and Lower Cost of Trade Financing


In international trade, the approval process of trade financing methods such as credit and
letters of credit commonly used by importers and exporters is cumbersome and time-
consuming, delaying payment and prolonging the delivery time. At the same time, fraud
frequently occurs due to information asymmetry in cross-border transactions [11]. The
emergence of blockchain technology provides a solution to the above problems. Firstly,
the application of blockchain technology in trade financing can reduce the burden on
banks. Due to lack of trust, when dealing with businesses with potential risks, banks take
a cautious attitude and arrange a lot of workforce costs to examine and verify them. The
application of blockchain technology can simplify the approval process, reduce labor
costs, and shorten the audit cycle. Secondly, the application of blockchain technology in
trade financing can also reduce capital costs. Enterprises have to pay interest, guarantee
fee, risk margins, and other fees through traditional financing methods. However, through
the cross-border blockchain platform, the financing rate is far lower than the loan interest
rate, and there are no other fees.

2.5 To Improve the Security and Stability of Trade Supply Chain


Applying blockchain technology to the trade supply chain can significantly improve its
security and stability. Firstly, blockchain technology can ensure the privacy of relevant
information more effectively. Many kinds of information are involved in the enterprise production process; for example, production process and transaction information belong to trade secrets and can only be accessed by particular individuals. Blockchain
technology can effectively distinguish information and individuals to safeguard the con-
fidentiality of relevant information [12]. Secondly, blockchain technology can ensure
data transparency. The supply chain involves many enterprises. Poor information trans-
mission between relevant enterprises may lead to negative impacts, and the ambiguity of
relevant product parameters will make consumers doubt the product quality. Blockchain
technology can record the genealogy of information, ensure the credibility of informa-
tion, make information open and transparent, and relevant personnel can check the source
and process of information. Finally, the application of blockchain technology can realize
information sharing among enterprises in the supply chain. There are many enterprises
involved in the supply chain, and there is a large span between upstream and downstream enterprises, which makes information transmission between them difficult. With
blockchain technology, the core enterprises and suppliers in the chain can establish their
blocks, which means each block realizes relative information sharing.
In addition to the application in the above five aspects, based on reliable and tamper-
proof data, blockchain technology is helpful to fulfill international trade contracts auto-
matically and irreversibly. It could execute some pre-defined rules and terms more
automatically than a traditional contract.

3 Evaluation of the Effect of Blockchain Technology Application


in the International Trade
This paper uses the analytic hierarchy process (AHP) to evaluate the application of
blockchain technology in international trade. Its basic working idea is to establish a

hierarchical structure model, construct judgment matrices at different levels by scoring, test the consistency of single ranking and total ranking, and finally calculate the proportion of each model index. As a systematic analysis method, the analytic hierarchy process can simplify multi-objective and complex problems, making its calculation simple and its results clear.

3.1 Build an Index System for Evaluating the Effect of Blockchain Technology
Application in the International Trade
According to the construction requirements of the AHP index system, this paper sets
up five primary indicators, including product traceability, trade payment, customs pro-
cedures, trade financing, and trade supply chain, to evaluate the effect of blockchain
technology in international trade. Product traceability includes three secondary indexes:
anti-counterfeiting ability, real-time traceability, and consumer trust. Trade payment
includes three secondary indexes: payment security, payment efficiency, and cost. The
customs procedures include three secondary indexes: time to pass the customs, probabil-
ity of fraud and handling fee. Trade financing includes three secondary indexes: length
of audit, human capital costs, financing expenses. The trade supply chain includes three
secondary indexes: information privacy, data transparency and information sharing. The
above index system is shown in Fig. 1.

Fig. 1. The evaluation system of the effect of blockchain technology application in international
trade.

3.2 Analysis of the Index System for Evaluating the Effect of Blockchain Technology Application in the International Trade

Judgment matrices are constructed using the AHP evaluation scale in Table 1 to determine the weight of each index. Table 2 shows the weights of the primary indexes, and Table 3 shows the comprehensive weights of the secondary indexes. As shown in Table 4, all indexes pass the consistency test, i.e., CR < 0.1.
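To show the computation behind Tables 2, 3 and 4, here is a minimal NumPy sketch of the principal-eigenvector weighting and the consistency test; the 3 × 3 judgment matrix is a made-up illustration on the 1–9 scale of Table 1, not the matrix actually used in this paper.

```python
import numpy as np

RI = {3: 0.58, 4: 0.90, 5: 1.12}  # random consistency indices for n = 3..5

def ahp_weights(A):
    """Weights from the principal eigenvector of a judgment matrix, plus its CR."""
    vals, vecs = np.linalg.eig(A)
    k = np.argmax(vals.real)
    w = np.abs(vecs[:, k].real)
    w /= w.sum()
    n = A.shape[0]
    CI = (vals[k].real - n) / (n - 1)   # consistency index
    return w, CI / RI[n]                # consistency ratio CR

# Illustrative judgment matrix comparing three indexes (not the paper's data).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
w, CR = ahp_weights(A)
print("weights:", w.round(4), "CR:", round(float(CR), 4))  # CR < 0.1 => consistent
```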

Table 1. AHP evaluation measurement

Paired comparison criteria | Definition | Content
1 | Equal importance | The two elements are of equal importance
3 | Moderately more important | One element is considered slightly more important than the other
5 | Strongly more important | Strongly inclined to one element based on experience and judgment
7 | Very strongly more important | Very much in favor of one element
9 | Extremely important | The evidence establishes that one element is more important than the other when the two are compared
2/4/6/8 | Intermediate values | Used as medians between the above criteria
Reciprocals of the above values | | When element A is compared with element B and given one of the above scale values, the weight of element B compared with element A is the reciprocal of that scale value

Table 2. Primary index weight

Primary index Weight


Product traceability Y1 0.0448
Trade payment Y2 0.3225
Customs procedures Y3 0.1289
Trade financing Y4 0.1903
Trade supply chain Y5 0.3134

Table 3. Comprehensive weight of secondary index

Secondary index Weight


Anti-counterfeiting ability X1 0.0131
Real-time traceability X2 0.0036
Consumer trust X3 0.0281
Payment security X4 0.1998
Payment efficiency X5 0.0917
Cost X6 0.0311
Time to pass the customs X7 0.0298
Probability of fraud X8 0.0858
Handling fee X9 0.0134
Length of audit X10 0.0891
Human capital costs X11 0.0420
Financing expenses X12 0.0592
Information privacy X13 0.1207
Data transparency X14 0.0897
Information sharing X15 0.1030

Table 4. Consistency test

Primary index CR Secondary index CR


Product traceability Y1 0.0323 Anti-counterfeiting ability X1 0.0915
Real-time traceability X2
Consumer trust X3
Trade payment Y2 Payment security X4 0.0834
Payment efficiency X5
Cost X6
Customs procedures Y3 Time to pass the customs X7 0.0836
Probability of fraud X8
Handling fee X9
Trade financing Y4 Length of audit X10 0.0036
Human capital costs X11
Financing expenses X12
Trade supply chain Y5 Information privacy X13 0.0370
Data transparency X14
Information sharing X15

3.3 Conclusion to Evaluation of Effect of Blockchain Technology Application


in the International Trade

According to the proportion of each aspect, trade payment accounts for 32.25%, trade
supply chain 31.34%, trade financing 19.03%, customs procedures 12.89%, and product
traceability 4.48%. This shows that the effects on trade payment and the trade supply chain are better than the others, while the effect on product traceability is less obvious.
Trade payment is an important component of international trade. Based on a char-
acteristic of strong confidentiality, blockchain technology provides a more secure and
efficient environment and makes up for the defects of traditional international trade
payment. Therefore, trade payment is a key field of all applications of blockchain tech-
nology in international trade and should fully combine with blockchain technology to
innovate the mode of payment. There is still more development space for applying
blockchain technology in the trade supply chain, trade financing, and customs proce-
dures. Blockchain technology has the characteristics of openness, transparency, and
decentralization. However, current applications do not take full advantage of these mer-
its yet, and only solve partial problems existing in the trade supply chain, trade financing,
and customs procedures. Therefore, these three aspects should deepen integration with
blockchain technology and make full use of its characteristics. Blockchain technology
has a great space and a certain advantage of application in product traceability. However,
at present, it is not widely used in product traceability. Manufacturers have an insuffi-
cient understanding of the importance of product traceability. They should strengthen
their understanding of product traceability and further promote it.

4 Barriers and Solutions of Blockchain Technology Application


in the International Trade
Currently, there are still some barriers to applying blockchain technology in the field of
international trade. Firstly, there are still some technical risks in the current blockchain
technology, which is why blockchain technology is not widely used. If an agreement or
a system has technical vulnerabilities, once invaded, the whole chain will be affected,
and it isn’t easy to eliminate these impacts due to distributed blockchain. The situa-
tion that the Encryption algorithm is not updated in time will also bring risks [13].
Secondly, blockchain technology lacks legal supervision. As an emerging digital tech-
nology, there is no legal norms and technical standard for blockchain, and criminals may
take advantage of loopholes to make profits.
For the above problems, two measures are proposed. Firstly, more attention should be paid to the cultivation and introduction of blockchain technical talents to improve the ability to innovate and develop blockchain technology, and international exchanges should be actively carried out to promote global cooperation in blockchain technology development. Enterprises should actively participate in application and development and improve their ability to make full use of blockchain technology. Secondly, governments should
formulate relevant laws and regulations, build and improve the supervision mechanism
of blockchain technology, assign tasks and responsibilities to the designated person and
make the blockchain technology achieve faster and better development.

5 Conclusion
As blockchain technology emerges, many industries and fields set off a technological
revolution and bring about technological changes. Due to the limited application of
blockchain technology in international trade, this paper only researches the application of
blockchain technology in trade payment, customs procedures, product traceability, trade
financing, and trade supply chain. Because blockchain technology is gradually improved,
its application in international trade needs to be further explored. At present, there are
still some problems in applying blockchain technology in the field of international trade,
such as technical risks and the legal application of blockchain, which await further study by scholars.

Subject Source
National Social Science Foundation of China (18BJL094).

References
1. Chen, W.G., Yuan, J.: Integration of blockchain technology into global economic governance:
paradigm innovation and regulatory challenges. Tianjin Soc. Sci. 06, 91–99 (2020)
2. Akash, T., Arun, S., Richa, K., Anand, N., Sudeep, T., Neeraj, K.: Blockchain-based efficient
communication for food supply chain industry: transparency and traceability analysis for
sustainable business. Int. J. Commun. Syst. 34(4), e4696.1–e4696.20 (2021)
3. Sheng, S.Y.: Research on construction of supply Chain information resource sharing model
based on blockchain technology. Inf. Sci. 39(07), 162–168 (2021)
4. Yanling, C., Eleftherios, I., Weidong, S.: Blockchain in global supply chains and cross border
trade: a critical synthesis of the state-of-the-art, challenges and opportunities. Int. J. Prod.
Res. 58(7–8), 2082–2099 (2020)
5. Lin, X., Chang, S.C., Chou, T.H., Chen, S.C., Ruangkanjanases, A.: Consumers’ intention to
adopt blockchain food traceability technology towards organic food products. Int. J. Environ.
Res. Public Health 18(3), 912 (2021)
6. Wang, X.D., Sun, Z.Y.: Discussion on using blockchain technology to carry out international
settlement. Foreign Econ. Trade Pract. 07, 63–65 (2017)
7. Liang, X., Zhang, H.: Analysis on the application mode of blockchain technology in
international payment. Friends Account. 02, 155–160 (2021)
8. Zhao, Z.K.: Blockchain creates a new cross-border payment model for international trade.
Enterp. Econ. 36(09), 163–168 (2017)
9. Hong, T., Cheng, L.: Application of blockchain in global trade and finance. Int. Trade 10,
63–67 (2018)
10. A decentralized and secure blockchain platform for open fair data trading. Concurrency Comput. Pract. Exp. 32(7), e5578.1–e5578.11 (2020)
11. Li, J.J., Wang, Z.W.: Application mode, risks and challenges and policy suggestions of supply
chain finance based on blockchain technology. New Financ. 01, 48–55 (2021)
12. Jia, X.L.: Research on supply chain management under the development of blockchain and
big data. Bus. Econ. Res. 10, 49–51 (2020)
13. Zhang, Z.P., Ma, Y.G.: The development of blockchain + supply chain finance in China:
models, challenges and countermeasures. Res. Financ. Dev. 08, 48–54 (2020)
Fraud Network Identification Model
for Insurance Industry

Jiaqiu Wang1(B) , Yining Jin2 , Zaiyu Jiang1 , Xueyong Hu1 , and Peng Wang1
1 Beijing CHINA-POWER INFORMATION TECHNOLOGY Co., Ltd., Haidian District,
Beijing, China
wangjiaqiu@alu.hit.edu.cn
2 Harbin University of Commerce, Songbei District, Harbin, China

Abstract. Detection of fraudulent gangs plays an important role in the healthy


development of vehicle insurance. Existing insurance anti-fraud platforms still
use a lot of manual screening that mainly relies on accumulated experience, so the
accuracy of this insurance fraud screening method is unsatisfactory. Moreover, as
the insurance fraud changes from a single person to a gang, the entity-relationship
in the insurance fraud becomes complicated. Therefore, this paper proposes an
insurance fraud identification model based on a knowledge graph to assist insur-
ance personnel in identifying the emerging gang fraud. This model mainly pro-
cesses the data of the vehicle insurance industry and imports them into the attribute
graph database Neo4j after merging them. Then, the two hundred thousand nodes
are aggregated through graph database query language into four fraud network
models: the abnormal human-vehicle relationship group, vehicle crossover group,
driver ring relationship group, and the target three-vehicle recurrence group. The
model can be used to observe abnormal vehicle insurance behavior effectively and
intuitively.

Keywords: Vehicle insurance · Fraud · Identification model · Knowledge graph

1 Introduction
In recent years, with the rapid development of the economy, diversified development
fields have emerged one after another. The insurance field has been widely recognized and has gained popularity as a way to reduce property losses, covering, for example, medical insurance, vehicle insurance, life insurance, and financial insurance. However, because anti-fraud work in China started relatively late, the anti-fraud system is not yet perfect in all aspects, and insurance fraud occurs from time to time.
To reduce the economic losses caused by insurance fraud, some insurance compa-
nies have developed anti-fraud platforms to detect fraud. For example, China Insurance
Information Technology Management Co., Ltd. developed a vehicle insurance infor-
mation platform to detect insurance fraud. However, the existing insurance anti-fraud
platforms cannot actively provide effective information and still rely on manual checking. This situation leads to low efficiency in anti-fraud work, and practical applications cannot achieve good results. At the same time, as individual fraud is gradually


identified, insurance fraud transforms into gang fraud. This makes it difficult for some
platforms to analyze fraudulent conspiracies that could cause huge losses effectively.
Considering that auto insurance fraud presents the trend of syndication, the detection
and identification of fraud gangs play an important role in the healthy development of
auto insurance.
Therefore, to cope with the gang fraud of vehicle insurance, this paper processes
the data of the vehicle insurance industry, merges them into the attribute graph database
Neo4j, and aggregates two hundred thousand nodes into four fraud network models
through Cypher language to assist insurance personnel in identifying the endless gang
fraud.
This article is organized as follows. The related work is described in Sect. 2. Section 3
introduces the background knowledge of insurance fraud identification. Section 4
describes the proposed model design in detail. Section 5 introduces the case and shows
the corresponding knowledge graph structure. Section 6 summarizes the paper and looks
forward to future work.

2 Related Work
This section introduces the work related to insurance fraud. At the same time, this section
introduces the relevant research of insurance fraud of vehicle insurance.

2.1 Research on Insurance Fraud

With the rapid development of the insurance industry, insurance fraud is a growing problem in different areas, but the technology for detecting insurance fraud is relatively backward. Therefore, there is an urgent need for insurance fraud identification technology to improve the environment of the insurance field. For example, Li et al. [1] analyzed
the three new features of insurance fraud in detail from the three components: the fraud-
ulent behavior of insurance fraud, the subjective fault of the perpetrators, and the causal
relationship between the fraudulent conduct and insurance compensation. Xu et al. [2]
summarized the main risk types of large-sum insurance based on large-sum insurance
fraud. At the same time, based on the Apriori algorithm, a big data intelligent anti-fraud
system model for large insurance is proposed, focusing on the analysis of statistical
identification, insurance fraud law mining, insurance fraud behavior identification.
Graph database technology can also be effective in identifying mob-style insurance
fraud related to critical illness insurance. For example, Zhou and Huang et al. [3] use the
graph database method to analyze the compensation information of a large life insurance
company in the past ten years and explore and verify the feasibility of the graph database
technology in the identification of serious disease insurance gang fraud. Wu et al. [4] use
Logistic regression and K-means clustering analysis to identify claims risk from the two
aspects of category characteristics and individual characteristics and find fraud factors.
Nui et al. [5] constructed a medical insurance anti-fraud warning model of medical credit
rating from the perspective of fraud through the health level, income level, and other
specific data of medical reimbursement subjects.

2.2 Research on Vehicle Insurance Fraud

In the era of big data, the entities involved in insurance fraud are increasingly complicated. For example, car insurance fraud has changed suddenly from accidents involving a single person to organized and planned criminal gangs. Insurance companies often lack evidence; under industry regulatory pressure and the risk of litigation, and for fear of neglecting or mishandling complaints, they pay unreasonable compensation to fraudsters, which brings massive economic losses to insurance companies [6]. In this situation, many scholars began to study vehicle insurance fraud. The Joint Task Force on Vehicle Anti-fraud [7] analyzed the new trends and characteristics of auto insurance fraud and examined the problem from many aspects, including insurance company management, information sharing, the external environment, and laws and regulations. It also put forward targeted suggestions to build a multi-party cooperative prevention and control system with big data as the core.
In addition, Macedo et al. [8] conducted on-site semi-structured interviews with a
convenience sample of 20 auto repair workshops in Portugal to explore the role of vehicle
repair workshops in auto insurance fraud and how they might contribute to the reduction
or increase of such crimes. The research found that auto repair shops, as one of the main actors participating in insurance fraud, have the potential to prevent these crimes. Yang [9]
designs and implements a graph-based visual analysis system for auto insurance fraud,
which mainly analyzes fraudulent behavior from three perspectives: element analysis,
suspicious target, and fraud gang, and provides decision support for professionals. Yu
and Feng et al. [10] study the anti-fraud detection method of auto insurance. They apply
micro-gang modeling to Motor vehicle insurance fraud detection and adopt matrix-based
similarity calculation, rank ranking, and transformation algorithm to identify gangs with
minimal probability but highly suspicious. The experimental results show that the method
is more accurate and efficient than the traditional method.
To sum up, no matter in the field of financial insurance, medical insurance, or vehicle
insurance, insurance fraud is affected by a variety of subjects and factors. In order to
better identify insurance fraud, this paper takes vehicle insurance fraud as the research
object and establishes a network identification model based on vehicle insurance fraud.
The model consists of four parts: abnormal human-vehicle relationship group, vehicle
crossover group, driver ring relationship group, and vehicle recurrence group, which
fully analyzes the different relationships between people and vehicles, and analyzes and
identifies vehicle insurance fraud according to different situations.

3 Background Knowledge of Insurance Fraud Identification

This section describes the background knowledge of insurance fraud identification,


including an overview of insurance fraud and insurance fraud harm.

3.1 Overview of Insurance Fraud

With the development of information dissemination technology, the risk of insurance


fraud becomes increasingly prominent, and it is characterized by specialization and

collectivization. How to construct a scientific and effective anti-fraud system becomes


an urgent problem to be solved. According to the 2019 White Paper on Intelligent Risk
Control of China’s Insurance Industry (referred to as the Insurance Society of China)
[11], insurance fraud is accompanied by insurance since the emergence of insurance. To
protect the diagram to compensate or to protect profit has become a few policy-holder
or the deformed mentality of the insured, its purpose is to obtain additional benefits
through insurance. The manifestations of insurance fraud include the insured not telling
the truth, fabricating or forging the claim amount, deliberately exaggerating the claim
amount, repeating the claim and so on [12]. At the same time, there is also a lot of
intentional fraud within insurance companies and insurance intermediaries. According
to the International Association of Insurance Supervisors (IAIS) estimation, about 20–
30% of global insurance compensation is suspected of fraud every year. Conservative estimates put the fraud leakage of China's auto insurance industry at no less than 20% of claims, corresponding to an annual loss of more than 20 billion yuan.

3.2 Harm of Insurance Fraud

Insurance fraud seriously infringes on the legitimate rights and interests of insurance
consumers, destroys the normal order of the insurance market, and has a negative impact
on many aspects of society. It is mainly introduced from the following three aspects.

(1) Insurance Companies

Insurance fraud results in losses to insurance companies. Fraudulent claims by the insured cause the insurance company to spend unnecessary additional funds, which brings it financial losses. As cases of the insured defrauding insurance companies increase, fraud cases will inevitably bring reputational losses to insurance companies.

(2) The Whole Society

Harm spillover poses a threat to the whole society. The increase of insurance fraud will
corrupt social morals and disturb the stability and development of the national economy.

(3) Other Insured Persons

Other honest policyholders suffered losses. Insurance fraud will lead to an increase in
the insurance rate, the degree of protection of honest policy-holders will be weakened,
and the insurance interests will be damaged.

4 Model Design

This section introduces the proposed fraud network identification model for the insurance
industry and the methods to identify fraud based on the model. This section contains

six parts, which are the introduction of knowledge graph, insurance entity relationship
based on knowledge graph, model design, data introduction, data preprocessing and
node relationship import, data modeling and group generation.

4.1 Introduction to Knowledge Graph

The knowledge graph was proposed by Google on May 17, 2012. Its original intention was to improve the capability of the search engine and improve users' search quality and search experience [13]. Current artificial intelligence technology can be simply divided into
perceptual intelligence (image, video, speech, text recognition, etc.) and cognitive intel-
ligence (knowledge reasoning, causal analysis, etc.). Knowledge mapping technology
is regarded as the core technology in the field of cognitive intelligence and an impor-
tant part of artificial intelligence. Its powerful semantic processing and interconnection
organization ability provide the basis for intelligent information application [14].
As one of the important branches of artificial intelligence technology, knowledge
graph technology can display the relationship between entities and the attribute informa-
tion of entities and associations in a visual way [15]. Moreover, the knowledge graph can
discover new relational information, new unstructured data modes and new knowledge
faster and more accurately, achieving insight customer and reducing business transaction
risks.

4.2 Insurance Entity Relationship Based on Knowledge Graph


The whole life cycle of insurance includes the application, underwriting/correction, and claims settlement processes. The events involved concern multiple entities, and there are relationships between these entities. The entity relationships of insurance based on the knowledge graph are shown in Fig. 1.
In Fig. 1, the relationships between insurance entities are established, covering entities such as the applicant, the target vehicle, the damage assessor, and the repair shop, as well as various relationships such as car-car, person-car, person-person, and damage assessor-repair shop. With the help of the graph database tool used to store the knowledge graph, the relationships between the parties and the related parties in a claim case can be understood in a very intuitive form.
Clustering belongs to unsupervised learning, which classifies data by features rather
than labels of data itself. The fraud network identification in this paper is to aggregate
fraud cases through the node label and the relationship between nodes, which can visually
present the entity-relationship between different insurances. With the visualization of
the graph database, claims adjusters can readily discover the association between entities
hidden in multiple cases. In other words, fraudulent organizations that have formed, or are in the process of forming, will be identified. However, the groups generated by the fraud network model need to be further examined and verified by insurance personnel, so manual inspection cannot be completely avoided.

Fig. 1. Insurance entity relationships based on knowledge graph.

4.3 Model Design

This model is constructed with the graph database query language Cypher. Suspected fraud groups, namely the abnormal person-vehicle relationship group, the vehicle crossover group, the driver ring relationship group, and the target/third-party vehicle recurrence group, are constructed through various relationships, including vehicle-vehicle, person-vehicle, person-person, and person-garage relationships.

Data Introduction. In this model, the insurance policy information, report information,
and repair shop information of a certain insurance company are used as the experimental
data; different persons are merged by certificate number, and different cars are merged by license plate number. After the data are imported into the graph
database, 29 kinds of nodes (people, cars, cases, etc.) and 21 kinds of relationships
(driving, insurance, reporting, etc.) are generated. Each node and relationship has its
attributes. Because natural persons may have multiple labels, we set up a specific entity
classification, as shown in Fig. 2. To protect the privacy of users, the process of data
modeling in this experiment adopts the method of encryption and desensitization.

Fig. 2. Entity classification diagram.

Data Preprocessing and Node Relationship Import. Since the data take the policy and the report as primary keys, it is necessary to merge personnel information by certificate number for drivers (target-vehicle drivers, third-party drivers, the applicant, and the insured), internal personnel (surveyors, loss assessors, and damage verifiers), and other data, and finally import it into the Neo4j graph database.
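A minimal sketch of this preprocessing and import step is shown below, using pandas for the merge and the official Neo4j Python driver for the import; the file names, column names, connection details, and the Person label are assumptions for illustration.

```python
import pandas as pd
from neo4j import GraphDatabase  # official Neo4j Python driver

# Merge person records from the policy and report files by certificate number.
policies = pd.read_csv("policies.csv")  # assumed columns: cert_no, name, ...
reports = pd.read_csv("reports.csv")    # assumed columns: cert_no, name, ...
persons = pd.concat([policies[["cert_no", "name"]], reports[["cert_no", "name"]]])
persons = persons.drop_duplicates("cert_no")  # one record per certificate number

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for _, p in persons.iterrows():
        # MERGE deduplicates in the graph: exactly one node per certificate number.
        session.run("MERGE (:Person {cert_no: $cert, name: $name})",
                    cert=p["cert_no"], name=p["name"])
driver.close()
```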

Data Modeling and Group Generation. In Neo4j, the four fraud network models from the model design stage are constructed with Cypher. The claims adjuster can determine whether a case is fraudulent by using the models and unfolding the node data.

The Implementation Process. With Neo4j and the Cypher language, a program can be designed to automatically detect all potentially abnormal vehicle insurance behaviors, implemented as stored procedures developed in Java on top of Neo4j's APOC plug-in. (1) All nodes of a fraud model are queried through Cypher; the node and relationship IDs in each connected subgraph are aggregated into a new group node through traversal, marked with the model's identifier, and stored in Neo4j. (2) The Cypher of this stored procedure can run automatic detection as a scheduled task. (3) The aggregated node and relationship IDs are stored on the group node, and a model type identifier is set; the model can then be expanded through the group node, or the classified groups can be summarized. Figure 3 shows how the stored procedure queries, aggregates, and stores the fraud model in Neo4j.
A fraud network model is a group designed according to abnormal occurrences of nodes and relationships, which indicates suspected fraud groups. Insurance personnel are required to further verify the group status, which reduces most of the screening workload. In addition, the fraud network model may give false-positive results.
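The grouping logic of Fig. 3 can be paraphrased as a connected-components computation. The paper implements it as a Neo4j stored procedure through the APOC plug-in; the following client-side Python sketch uses a union-find over one assumed relationship type purely to illustrate the idea.

```python
from collections import defaultdict
from neo4j import GraphDatabase

def find_groups(driver):
    """Aggregate nodes linked by a model's relationships into connected subgraphs."""
    parent = {}

    def find(x):                      # union-find with path halving
        while parent.setdefault(x, x) != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    with driver.session() as session:
        # Assumed model pattern: node pairs joined by SameIncident relationships.
        query = "MATCH (a)-[:SameIncident]-(b) RETURN id(a) AS a, id(b) AS b"
        for rec in session.run(query):
            union(rec["a"], rec["b"])

    groups = defaultdict(list)        # root node id -> member node ids
    for node in parent:
        groups[find(node)].append(node)
    return [g for g in groups.values() if len(g) > 1]
```

Each returned list corresponds to one group node of Fig. 3, to which a model-type identifier would then be attached.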

Fig. 3. The stored procedure that queries, aggregates, and stores the fraud model in Neo4j.

5 Cases
This section introduces the case of using the fraud network identification model and the
corresponding knowledge graph, which consists of four parts: abnormal human-vehicle
relationship group, vehicle crossover group, driver ring relationship group, and the target
three-vehicle recurrence group.

5.1 Abnormal Driver-Vehicle Relationship Group


The abnormal human-vehicle relationship group refers to people and cars that frequently appear together in the same case or in multiple cases: one person drives multiple cars, or one car is driven by multiple people.
The abnormal human-vehicle relationship constituted by people and vehicles is shown in Fig. 4. Mr. Li and Mr. Cao are policy-holders; the target car is shown in brown and the third-party cars in pink. The two men act as drivers of the target car, drivers of the third-party cars, and policy-holders, and they drive three separate cars that appear in the same case. Therefore, it can be judged that the two drivers constitute suspected fraud. Starting from these two people, insurance personnel can probe related vehicles, insurance policies, repair shop information, and other nodes, and use this information to determine whether insurance fraud has occurred, further safeguarding the rights and interests of insurance companies and other insured persons.

Fig. 4. An abnormal human-car relationship group formed by people and cars.
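One possible Cypher query for this group, run through the Neo4j Python driver, is sketched below; the Person and Vehicle labels, the Drive relationship, and the threshold of three vehicles are assumptions, since the paper does not list its 29 node types and 21 relationship types in full.

```python
from neo4j import GraphDatabase

# One person driving several different vehicles across cases (assumed schema).
QUERY = """
MATCH (p:Person)-[:Drive]->(v:Vehicle)
WITH p, collect(DISTINCT v.plate_no) AS plates
WHERE size(plates) >= 3
RETURN p.cert_no AS person, plates
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for rec in session.run(QUERY):
        print(rec["person"], rec["plates"])
driver.close()
```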

5.2 Vehicle Crossover Group

In the vehicle crossover group, the target car and the third-party cars appear together in the same case several times. In Fig. 5, there are multiple ()-[:SameIncident]-() relationships between the two third-party cars on the left and the two target cars on the right. Insurance personnel can start from the abnormal vehicles, or analyze the group through the linked cases, to identify fraud.

Fig. 5. A vehicle crossover group formed by cars.
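Reusing the driver and session pattern from the previous sketch, the crossover group can be queried by counting repeated SameIncident links between the same vehicle pair; the labels and the threshold of two cases are illustrative assumptions.

```python
# Vehicle pairs linked by SameIncident relationships in two or more cases.
CROSS_QUERY = """
MATCH (t:TargetVehicle)-[r:SameIncident]-(x:ThirdPartyVehicle)
WITH t, x, count(r) AS n
WHERE n >= 2
RETURN t.plate_no AS target, x.plate_no AS third_party, n
ORDER BY n DESC
"""
```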

5.3 Driver Ring Relationship Group

The driver ring relationship group is an abnormal person-person group: the drivers (target-car drivers and third-party-car drivers) have accidents together, connecting into a ring. For example, Fig. 6 shows possible gang fraud, in which Mr. Tan and Mr. Liu are suspected of gang fraud. Insurance personnel can expand the nodes of the knowledge graph, or examine its associated cases, policies, and related institutions, to determine whether insurance fraud has occurred.

Fig. 6. Driver ring relationship group formed by person-person.



Fig. 7. A car-car group in which the target and third-party vehicles appear repeatedly.

5.4 Target and Third-Party Vehicle Recurrence Group

In the target and third-party vehicle recurrence group, the same target car and the same third-party cars appear together in more than two cases, so this is a car-car model. Figure 7 shows a sample of this model. The blue target car and the yellow third-party cars appear together in multiple cases, and the claims adjuster can further track the two vehicles to protect the rights and interests of the company.
Aiming at fraud network detection in the insurance industry, this paper uses graph database tools together with knowledge fusion and entity merging methods to study the related issues. The detection process can identify group fraud through the relationships between entities. The cases presented in this paper come from the data of an actual insurance company, including its insurance policy information, report information, and repair shop information. Because the data on people, cars, repair shops, and insurance policies are sensitive, the data shown in this paper are encrypted, but this does not affect the identification of actual auto insurance gang fraud.

6 Conclusions and Future Work


This paper proposes an insurance fraud identification model based on knowledge graph,
which identifies fraud behaviors in the insurance industry and aims at protecting the
rights and interests of insurance companies and other policy-holders. This model mainly
processes the data of the vehicle insurance industry and imports them into graph database
Neo4j after merging them. Through the graph database language Cypher, two hundred thousand nodes are aggregated into four fraud network models: the abnormal human-vehicle relationship group, the vehicle crossover group, the driver ring relationship group, and the target/third-party vehicle recurrence group. The models can be used to observe abnormal vehicle
insurance behavior effectively and intuitively.

In future work, in terms of model improvement, the models that have already identified fraud will be refined to reduce the cost and time of anti-fraud work. As for application functions, based on Neo4j's APOC plug-in, the stored procedures and groups found by Cypher will be aggregated into new nodes, and attributes such as the amount involved and the associated cases will be added to each new group node. The development of group nodes can then enable more customized visualization.

Acknowledgment. The authors thank the editor and the anonymous reviewer for their helpful
comments and suggestions, which improved this paper.

References
1. Li, Y.Q., Qiao, S.: New characteristics and risk prevention path of insurance fraud under the
background of big finance. Insur. Res. (04), 121–127 (2021)
2. Xu, Q.M., Zhang, M.R.: Design of intelligent anti-fraud system based on big data of large
insurance. Comput. Age (07), 117–120 (2021)
3. Zhou, X.N., Huang, L., Wang, F.Y., Chu, M., Huang, T.: Research on the application of graph
database in the identification of serious illness insurance gang fraud. Insur. Res. (09), 92–104
(2020)
4. Wu, J.T., Zhang, Y.L.: Research on anti-fraud identification model of insurance. Natl. Circ.
Econ. (26), 152–154 (2020)
5. Nui, X.F.: Research on the design of anti-fraud warning mechanism in medical insurance. J.
Account. (18), 104–108 (2020). (in Chinese)
6. Ji, W., Cheng, G.: Application of traffic accident trace identification in anti-fraud of auto
insurance. Law Soc. (17), 77–78 (2021)
7. Auto Insurance Anti-fraud Joint Research Group: A study on fraud and anti-fraud in auto
insurance. Insur. Res. (06), 3–10 (2021)
8. Macedo, A.M., Viana, C.C., Marques, N.J.S., da Costa Brás da Cunha C.A.: Car insurance
fraud: the role of vehicle repair workshops. Int. J. Law Crime Justice 65, 65–75 (2021)
9. Yang, S.: Design and Implementation of Visual Analysis System of Auto Insurance Fraud
Based on Graph. Beijing University of Posts and Telecommunications (2019)
10. Yu, W., Feng, G.F., Zhang, W.J.: Research on fraud detection system and gang identification
of motor vehicle insurance. Insur. Res. (02), 63–73 (2017)
11. http://xw.sinoins.com/2019-06/25/content_295813.htm
12. Stijn, V., Guido, D.: Insurance fraud: issues and challenges. Geneva Pap. Risk Insur. Issues
Pract. 29(2), 313–333 (2004). https://doi.org/10.1111/j.1468-0440.2004.00290.x
13. Liu, Q., Li, Y., Duan, H., Liu, Y., Qin, Z.G.: A review of knowledge graph construction. J.
Comput. Res. Dev. 53(03), 582–600 (2016)
14. Liu, J., Li, Y., Duan, H., Liu, Y., Qin, Z.G.: Knowledge graph construction: a review. J.
Comput. Res. Dev. 53(03), 582–600 (2016)
15. Du, R.S., Chen, H.Y.: Building knowledge graph to realize knowledge fusion: a review of
knowledge graph. J. Shanxi Univ. Finance Econ. 43(05), 127 (2021)
Global Supply Chain Information
Compensation Model Based on Free Trade Port
Blockchain Information Platform

Shaoqing Tian1(B) , Fan Jiang1 , and Chongli Huang2


1 College of Applied Science and Technology, Hainan University, Haikou 570228, China
2 Library, Hainan University, Haikou 570228, China

Abstract. In the research on the construction of the global supply chain of the
Hainan Free Trade Port (FTP), it is found that sharing supply chain information is
an important issue to improve the performance of the global supply chain built on
the FTP. With the help of blockchain technology, solutions to information sharing
problems can be proposed. During the construction of the FTP, it is necessary
to build a Free Trade Port Blockchain Information Platform (FTPBIP), in which
the information compensation service can improve the effectiveness of supply
chain information sharing. By discussing the operation mechanism of information
compensation in the Free Trade Port Supply Chain (FTPSC) based on blockchain,
a set of information compensation models is constructed to provide a theoretical
basis for further improving the construction of the FTPBIP.

Keywords: Data currency · Free Trade Port Blockchain Information Platform ·


Free Trade Port Supply Chain · Information compensation

1 Introduction
As the process of economic globalization continues to accelerate, more and more com-
panies have begun to rely on FTP to participate in the construction of global supply chain
networks. Many studies have shown that the construction of China's FTP has an important impact on coordinating the global supply chain network. The FTP information service plays a particularly prominent role in the coordination of the global supply chain. However, the information service function of the FTP is still at the exploratory stage. Therefore, this study uses blockchain technology, cutting in from the perspective of information sharing, to try to reveal the FTP's information compensation operation mechanism for the innovation and coordination of the global supply chain.
This research takes the free trade zone with Chinese characteristics, as a node of the supply chain network, as its research object. From the perspective of innovating the information compensation mechanism in China's free trade zones and expanding global supply chain coordination methods under blockchain technology, combined with the investigation and analysis of specific case areas, and with the help of the theories and methods of regional economics, management, and system dynamics, this research analyzes and constructs the FTP information compensation operation mechanism and its realization path, based on an analysis of how FTP information compensation coordinates global supply chain information sharing. The research results have important theoretical and practical significance for the construction and development of the FTP and the optimization of the information compensation mechanism.
practical significance for the construction and development of FTP and the optimization
of the information compensation mechanism.

2 Related Literature Research and Analysis

2.1 Information Sharing is the Focus of Research on Global Supply Chain


Coordination

The origin of research on information sharing can be traced back to Srinivasan [1], who studied the effect of electronic data interchange technology on suppliers' shipment performance in a just-in-time environment. Hau et al. [2] studied the distortion of supply chain information and proposed the bullwhip effect. Domestic research
on supply chain information sharing began in the early 21st century. Wang et al. [3]
discussed the role of information flow in supply chain management for the first time.
Later, due to the country’s strategic considerations, a series of deepening reforms have
been carried out, gradually extending the domestic supply chain network. In recent
years, many scholars have researched supply chain information from the perspective of
different disciplines.
Since the supply chain network involves issues such as the internal connections and
interactions between stakeholders, information sharing research has become the focus of
attention in supply chain coordination research. Domestic and foreign-related research
mainly focuses on supply chain performance, coordination, and mechanism, empirical
research, etc.

2.2 Blockchain Has Gradually Become a Hot Technology in Global Supply Chain
Management Research

The free trade zone's role in coordinating global supply chain information sharing is still at the exploratory stage in related research fields. In recent years, however, blockchain technology, as a decentralized and intelligent management method, has been used by scholars to study related issues of supply chain coordination. Given the application of blockchain technology in supply chain information
sharing and coordination, some scholars at home and abroad have begun to study and
try. For example, Mitsuaki et al. [4] proposed a blockchain solution to solve information
asymmetry and double marginalization in supply chain management. Kuntal et al. [5]
discussed the prerequisites for adopting blockchain in the manufacturing supply chain.
Naif et al. [6] used blockchain technology to build a block supply chain to solve the anti-
counterfeiting problem in the supply chain. Vishal et al. [7] used blockchain and Internet
of Things technology to build an information-sharing supply chain management system
to solve information transmission efficiency. Li et al. [8] used the blockchain governance
mechanism system to develop supply chain intelligent governance mechanisms to solve
opportunistic risks and trust issues. Yang et al. [9] established a supply chain information
platform based on blockchain technology to solve information sharing issues. Zhang [10] used blockchain technology to build application scenarios that optimize supply chain management. He et al. [11] argue that the authenticity of data shared in the supply chain deserves attention and that blockchain technology can achieve the needed coordination. These results can help the free trade zone further explore the theory and practice of global supply chain information sharing and coordination.
In summary, current domestic research pays little attention to the role of free trade zones in coordinating global supply chains, and seldom considers the influence of free trade zones on the structure of global supply chains, interest relationships, and information compensation.

2.3 Research on the Mechanism and Implementation Path of Information Compensation in the FTP Needs Attention and In-Depth Research
Free trade zone information compensation is the application of government information disclosure to global supply chain information sharing. It is also a coordination approach that adopts market-oriented operation methods and new forms of information sharing.
Among foreign research results, only a few scholars have studied information compensation. For example, Hoberg and Thonemann [12] hold that information delay seriously affects supply chain performance and that a simple strategy can compensate for the delay. Some domestic scholars have discussed the specific operation of supply chain information compensation: Nie [13] holds that retailers can be motivated to share information by constructing an information-sharing compensation mechanism, and Shi and Nie [14] hold that constructing a Nash information compensation mechanism, in which information fees are paid, helps realize information sharing between the upstream and downstream of the supply chain. Further research on information compensation comes from the fields of translation studies and computer science: information lost because of cultural differences needs to be compensated [15]; information compensation can balance the load of the original and the translated information [16]; information compensation can solve the problem of information loss in mobile networks [17]; and low-frequency information compensation methods have been proposed [18].
In summary, as a new field in research on the coordinated development of free trade zones and global supply chains, the theory and method system of information compensation research is still at an exploratory stage. Few studies reveal in depth the operating mechanism of information compensation in free trade zones, and insufficient attention has been paid to systematic analysis of the realization path of information compensation.

2.4 Literature Review


Given the abovementioned research, it is necessary to carry out further systematic research on the basic theoretical and practical issues of free trade zone information compensation as a new type of global supply chain information sharing and coordination approach. By selecting typical research objects and combining relevant case practices, an in-depth exploration of the adaptability of free trade zone information compensation to global supply chain information sharing and coordination, of the internal mechanism of the information compensation operation process, and of its reasonable implementation path can expand and deepen research on free trade zone information compensation and the optimized coordination of the global supply chain.

3 The Operating Mechanism of Information Compensation in the FTPSC Based on Blockchain
3.1 Elements of Information Compensation
Realizing information compensation in the free trade port's global supply chain (FTPGSC) under the FTPBIP requires clarifying the elements of information compensation; only then can each element's role and the relationships between the elements be fully understood. Figure 1 shows the element system diagram of information compensation.

Fig. 1. The element system diagram of information compensation.

(1) The main body of information compensation

Information is the main body of information compensation in the FTPGSC. Information flow is an important part of the supply chain. The FTPGSC needs to realize information sharing and, on that basis, missing information should be compensated for to improve overall supply chain performance. Information is therefore crucial to FTPGSC companies and determines a company's strategic layout, planning, and operation. As the number of companies joining the FTP's global supply chain grows and related businesses develop, massive amounts of data and information will continue to be generated and to flow. Effective information sharing helps supply chain companies match demand and supply; on this basis, appropriate information compensation provides basic support for the construction of the FTPGSC network.
(2) Subject of information compensation

The FTPBIP service is the subject of FTPGSC information compensation. The FTPBIP service attracts and incorporates companies into the FTPGSC network through relevant policies. The network members include suppliers, manufacturers, distributors, retailers, and consumers from various countries. These members use the platform to carry out related trade activities, generating an FTP blockchain data warehouse. With the accumulation of data and information and the formation of trust in the FTP's blockchain, the information service can provide the necessary information compensation for members of the FTPGSC network to achieve information sharing. The FTP blockchain information service will therefore play an increasingly important role as the number of participating companies increases.

(3) Object of information compensation

The global supply chain enterprises relying on the FTP are the objects of information compensation. The construction of the Hainan FTP is guided by Xi Jinping's thought on socialism with Chinese characteristics in a new era and is an important practice in China's further integration into global economic cooperation. Against the background of the new round of reform and opening up, enterprises from various countries will use the FTP platform to carry out cross-border trade with China; at the same time, the FTP can serve as an important transit point for trade with Asia. Companies building global supply chains that rely on the FTP will then obtain the FTP's blockchain information services, which help them acquire, through information compensation, the data and information necessary for information sharing, thereby improving the performance of the FTPGSC.

(4) Information compensation technology

Blockchain technology is the core technology for compensating information in the FTPGSC. With the application of big data, the Internet of Things, and other high technologies, enterprises' data acquisition and analysis problems have largely been solved. However, information sharing across the supply chain is still difficult to achieve because of the lack of trust between enterprises. Blockchain technology provides a trust guarantee for information sharing in the supply chain. The FTP's blockchain information service can therefore help the FTPGSC solve the trust problem between enterprises and, through the information compensation service, better solve the information-sharing problem.

(5) Information compensation platform

The FTP Blockchain Information Service Platform is the platform for information compensation in the FTPGSC. An important link in the construction of the FTP is the construction of an information platform. The FTP blockchain information service platform is the basis for global supply chain information compensation and information sharing: it can ensure the credible and safe flow of FTPGSC information and improve the performance of the FTPGSC.
3.2 Influencing Factors of Information Compensation


In order to improve the convenience, credibility, and safety of trade conducted by FTPGSC companies, improve the feasibility of global supply chain information sharing, and improve the FTP's information service system, it is necessary to build a theoretical model of information sharing and information compensation based on the decentralized credit services of blockchain technology. The problems to be solved by FTPGSC information compensation mainly include the ontological source of information compensation, the subject's information management problem, the object's information sharing problem, the technical improvement of information compensation, and the construction of the information compensation platform. Through information compensation, the FTP blockchain information service platform solves the problems of isolation, credibility, and availability of information in the various links between the upstream and downstream of the FTPGSC, and promotes the gradual formation of a market-oriented mechanism in which the use-value of platform information serves as currency. Through the blockchain information platform, a brand effect of corporate information can form, which in turn enhances the reputation and brand value of the supply chain companies on the platform. For downstream consumers, the reliability of information about purchased goods is very important, and the FTP blockchain information platform can enhance consumers' trust in enterprises and the credibility of products. The better the information compensation service provided by the Hainan FTP's blockchain information platform, the easier it is to achieve information sharing in the FTPGSC. The service quality of the blockchain information platform can be evaluated and measured using indicators such as compensation amount, compensation time, information utilization rate, and information supply rate. Based on blockchain technology and the construction of the FTP, this article explores the factors that affect the information compensation effect of the FTPGSC when constructing the FTPGSC information-sharing model. Figure 2 shows the factors that affect the information compensation of the FTPGSC.

Fig. 2. Factors Affecting Information Compensation of FTPGSC.
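To make these indicators concrete, the following minimal Python sketch scores platform service quality from the four indicators named above; the function name, the equal weights, and the inverse treatment of compensation time are illustrative assumptions rather than part of the paper's model.

```python
def service_quality(comp_amount, comp_time, util_rate, supply_rate,
                    weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted service-quality score from the four indicators in Sect. 3.2.

    All inputs are assumed pre-normalized to [0, 1]; compensation time
    enters inversely because faster compensation means better service.
    """
    w1, w2, w3, w4 = weights
    return (w1 * comp_amount + w2 * (1.0 - comp_time)
            + w3 * util_rate + w4 * supply_rate)
```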

(1) The willingness of enterprises to build an FTPGSC based on the FTP

The construction of the FTP marks the beginning of China's further opening up and high-quality economic development, and the FTP will become an important hub connecting mainland China and overseas. With the support of FTP policies, enterprises will gain huge profit margins through the processing, storage, transportation, and sale of zero-tariff imported goods, and gain access to the huge Asian market. Therefore, for a company strategically positioned in the Asian market, relying on the FTP to build a global supply chain is of great significance, and the company needs to make reasonable plans and judgments. Facing mainland China and even the "One Belt One Road" market, companies can consider using the FTP as a supply chain node to reconstruct their global supply chains, increase the level of supply chain information sharing, and improve supply chain performance.

(2) The enterprise’s recognition of Xi Jinping’s socialist thoughts with Chinese


characteristics in the new era

China has entered a new period of historical development. The Communist Party of
China with Chairman Xi Jinping at the core, guided by Xi Jinping’s Thought on Socialism
with Chinese Characteristics for a New Era, has led China out of the path of socialism
with Chinese characteristics for a new era. Under the influence of the new crown virus
in the global economy, China has assumed its due responsibilities to support global
economic development and welcomes all countries to carry out win-win cooperation
with China to tide over the difficulties together. This requires that the enterprises joining
the FTP also have the spirit of responsibility, peace, and sustainable development and
mutually agree with the ideology of socialism with Chinese characteristics in the new
era. Only in this way can the FTP blockchain information service platform provide better
information compensation services.

(3) The consensus of the FTPGSC enterprises on the information on the chain

Since the FTP is China's newest form of opening up, companies from all over the world can participate in it. Because of differences in ideology, politics, religious beliefs, and policies and regulations, there will be differences in awareness of what information is appropriate to put on the chain. After joining the FTP, in order to better play the role of the FTPGSC and achieve corporate strategic goals, the on-chain enterprises will reach a consensus on this basis. A consensus that appropriate information goes onto the blockchain is the guarantee that the FTP's blockchain information platform can steadily deliver information compensation services between enterprises in the FTPGSC. Without such a consensus on on-chain information when joining the FTP, information compensation services will be difficult to develop, which will weaken the role of the FTPGSC.

(4) Improvement of the blockchain information platform of FTP

At present, companies or alliances mostly build their own blockchain platforms, whereas the construction of the FTP is carried out under the overall leadership of the party. There are therefore sufficient funds and technical strength to ensure the construction of the FTP blockchain information platform. At the same time, with the steady progress of the policy of "a million talents entering Hainan," talents and funds from China and the world will be attracted to Hainan. The FTP blockchain information platform will then inevitably surpass enterprises' self-built platforms in completeness, and the services launched will also reflect Chinese characteristics, thereby ensuring the development of the FTPGSC information compensation service and enhancing the brand effect of the FTP blockchain information platform. The completeness of the Hainan FTP's blockchain information platform will therefore directly affect the information compensation services enjoyed by FTPGSC enterprises.

(5) FTP blockchain information compensation method

Relying on the FTP to build global supply chains will form a complex network covering all walks of life, and the information flows formed by different industries will be intertwined. This requires measuring the degree to which supply chain companies demand various types of information. The information on the FTP blockchain information platform originates from the enterprises that have joined the FTP, so an enterprise's degree of contribution to that information becomes its bargaining chip for information compensation. We define an "FTP data currency": the data currency can be used to obtain corresponding information compensation, thereby realizing data exchange among FTPGSC enterprises. The corresponding data currency is granted according to the big data 5V model, which evaluates the company's information supply. The data currency can also be traded, giving full play to the liquidity of data as currency and thus to the value of the data itself.
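As an illustration of this exchange, here is a minimal Python sketch in which 5V-scored contributions accrue "FTP data currency" that is later spent on compensation services; the class, its method names, and the equal weighting of the five V dimensions are hypothetical, not part of the paper's design.

```python
class DataCurrencyAccount:
    """Hypothetical ledger for one FTPGSC enterprise's data currency."""

    def __init__(self):
        self.balance = 0.0

    def contribute(self, volume, velocity, variety, veracity, value):
        """Credit currency for one data contribution, scored on the big-data
        5V dimensions (each score assumed already scaled to [0, 1])."""
        self.balance += (volume + velocity + variety + veracity + value) / 5.0
        return self.balance

    def request_compensation(self, cost):
        """Spend currency on an information compensation service; returns
        True if the balance covered the cost."""
        if self.balance < cost:
            return False
        self.balance -= cost
        return True
```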

3.3 Mechanism of Information Compensation

The information on the FTP's blockchain information platform comes from the global supply chain companies relying on the FTP, which contribute their appropriate data to the chain. With the support of blockchain technology, supply chain information sharing can be better realized. The FTPGSC information compensation service can help companies optimize upstream supply and downstream distribution and improve the overall efficiency of the company and the supply chain. Figure 3 shows a schematic diagram of the FTPGSC information compensation mechanism.
The mechanism first ensures fair transactions between enterprises. The FTPGSC includes suppliers, manufacturers, distributors, and retailers; the information flow formed includes supply information provided by suppliers and manufacturers and demand information provided by distributors and retailers. Because the information platform is built with blockchain technology, the storage and use of this information are anonymous, encrypted, and "available but invisible." The uniqueness of a product can be guaranteed through the sharing of on-chain data. When a global supply chain company built on the FTP joins the blockchain information platform, the companies on the chain can use the platform's certification system to verify the validity of supply chain information such as commodities, funds, and logistics.
Fig. 3. FTPGSC information compensation mechanism.

The global supply chain formed between enterprises relying on the FTP can better share data on the basis of trust and cooperation. To obtain higher benefits for the FTPGSC, the information compensation service provided by the FTP blockchain information platform can be used to compensate for missing information in the upstream and downstream of the FTPGSC. Companies are then required to make corresponding contributions to the platform, that is, to have their data analyzed in order to obtain "FTP data currency." After obtaining a certain amount of data currency, an enterprise can enjoy the data compensation services provided by the platform. The compensation data can be data on the supply chain, industry data, or data on the supply chain network. In this way, the data of the entire platform can flow quickly, forming an FTP data circulation system. This expands the scale of the platform's enterprise users and of their information, realizing the common growth of the platform and its users in a virtuous circle.
When the blockchain information service platform operates normally, it can be
extended to other related industries based on information sharing in the FTPGSC. For
example, the blockchain information service platform can provide information compen-
sation services for supply chain information sharing and provide financial service insti-
tutions with complementary information compensation services to ensure the credibility
and feasibility of corporate financing. For another example, the blockchain informa-
tion service platform can provide necessary information compensation for safe produc-
tion, allowing insurance institutions to obtain necessary safe production data based on
the information compensation service to determine whether the company’s production
behavior is compliant.
Therefore, when the FTP blockchain information platform gathers a sufficient number of companies and their related data, it can enter a self-sustaining mode, form various information compensation paths, and become a powerful assistant in the construction and development of the FTP.
4 Model Construction of FTPSC Information Compensation Based on Blockchain

According to the information compensation service content, this paper designs a three-
layer model of information compensation for the blockchain information service plat-
form. The three-layer structure of the model is the infrastructure layer, the core busi-
ness layer, and the entity application layer. Figure 4 shows the three-layer model of
information compensation.

Fig. 4. Information compensation three-layer model.

4.1 Infrastructure Layer

The infrastructure layer of the information compensation model handles the acquisition and storage of the business flow, capital flow, logistics, and information flow data of each node in the FTPGSC. The data source is mainly the data voluntarily shared by on-chain enterprises after joining the FTPBIP; the data to be shared is put on the chain in minimized form through the FTPBIP. Each enterprise stores the four information flows in a distributed ledger to ensure the authenticity of the data. Because information on the blockchain cannot be tampered with, the FTPBIP can help consumers and enterprises verify products and thus enhance the credibility and brand value of the enterprise.
Through the infrastructure layer, suppliers, manufacturers, distributors, and retailers can be effectively linked to form an FTPGSC and realize information sharing.

4.2 Business Development Layer

The core business layer of the information compensation model integrates data processing, data currency exchange, data transactions, and information compensation. For the FTPSC companies that have joined the FTPBIP, the shared information mainly includes upstream and downstream logistics information, transaction information, production information, and credit information, some of which is private to the company. This information can be effectively shared under the principle that data is appropriately placed on the blockchain without affecting the core competitiveness of the enterprise. The FTPBIP uses data processing technology to render the data anonymous, immutable, and encrypted and then uploads it to the blockchain. An information provider exchanges its on-chain data for the corresponding data currency, and enterprises can use the data currency for information transactions and information compensation services. With the continuous accumulation of supply chain and industry data, the FTPBIP can develop data-derivative services; for example, information compensation services for the financial industry can help companies measure and judge the credibility of partners more accurately.
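The anonymize-then-upload step can be sketched as follows; the salted hashing of identifiers and the block layout are illustrative assumptions, not the FTPBIP's actual design.

```python
import hashlib
import json
import time

def anonymize(record, salt):
    """Replace the enterprise identifier with a salted hash so the shared
    record stays usable while the contributor stays unreadable."""
    rec = dict(record)
    rec["enterprise_id"] = hashlib.sha256(
        (salt + rec["enterprise_id"]).encode()).hexdigest()
    return rec

def append_block(chain, records, salt):
    """Append anonymized records as a hash-linked block; the hash link is
    what makes the stored information tamper-evident."""
    payload = [anonymize(r, salt) for r in records]
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"prev": prev, "time": time.time(), "data": payload},
                      sort_keys=True)
    chain.append({"prev": prev, "data": payload,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})
    return chain
```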

4.3 Physical Application Layer

The entity application layer of the information compensation model integrates the applications of supply chain enterprises and of related industries. The applications of supply chain enterprises mainly improve supply chain performance and provide information compensation services to the upstream and downstream of the supply chain; they include information query, statistical information analysis, data currency exchange, data transactions, and information compensation, and FTPSC companies can use the corresponding apps to carry out specific businesses. The applications of related industries are mainly aimed at the financial industry, the insurance industry, security management, human resource management, and so on; their feature is to provide the corresponding information compensation according to the needs of different industries, on the basis of ensuring sufficient data, and to help those industries better serve the FTPSC. With the help of blockchain technology and information compensation services, information sharing in the FTPSC is likely to be resolved. By breaking through the data barriers between supply chain links and between industries, the FTPBIP can provide FTPSC enterprises with more valuable services.
5 Conclusion
This paper proposes a blockchain-based information compensation model for the FTPGSC under the construction of the FTP. To improve the overall performance of the supply chain, information compensation can be used to compensate supply chain companies and thereby solve the problem of supply chain information sharing. The research identifies the elements and the influencing factors of information compensation in the FTPGSC and, on this basis, discusses the mechanism of information compensation. Through the construction of a three-layer model of information compensation, it systematically describes the information compensation system of the FTPGSC and provides a theoretical basis for the reconstruction of the FTPGSC.
In the construction of the FTP, applied research on blockchain technology is still in its infancy, and related applications are still being improved. The information compensation model constructed here still has shortcomings, and further research is needed; for example, the issuance and use of the data currency have not yet been studied, and interested scholars can pursue in-depth research in this area.

Acknowledgments. This work was supported by the 2021 Philosophy and Social Science Planning Project of Hainan Province of China (HNSK(ZX)21-87, Research on the safe and orderly flow mechanism of Hainan Free Trade Port data under the blockchain). This work was also supported by the Hainan Provincial Natural Science Foundation of China (720RC567, Research on Information Compensation Mechanism of Free Trade Zone to Global Supply Chain Innovation Coordination under BlockChain Technology, and 720RC569, Tourism Value Chain Distribution and Ecological Optimization Mechanism of Hainan International Tourism Consumption Center Based on System Dynamics), and by the Humanities and Social Sciences Research Innovation Team of Hainan University, Hainan Free Trade Port Cross-border E-commerce Service Innovation Research Team (HDSKTD202025).

Authors’ Contributions. Shaoqing Tian was responsible for proposing the overall idea and
framework of the manuscript. Fan Jiang was responsible for the writing of the first draft of the
manuscript and translation. Chongli Huang was responsible for the revision of the manuscript and
proofreading. Shaoqing Tian, Fan Jiang, and Chongli Huang contributed equally to this paper and
are regarded as the first authors.

References
1. Srinivasan, K., Kekre, S., Mukhopadhyay, T.: Impact of electronic data interchange technology
on JIT shipments. Manag. Sci. 40, 1291–1304 (1994)
2. Lee, H.L., Padmanabhan, V., Whang, S.: The bullwhip effect in supply chains. Sloan Manag. Rev. 38(3), 93–102 (1997)
3. Wang, C.G.: Logistics and information flow management in the supply chain. Chin. Manag.
Sci. (04), 17–24 (2000)
4. Nakasumi, M.: Information sharing for supply chain management based on block chain
technology. In: 2017 IEEE 19th Conference on Business Informatics, pp. 140–149 (2017)
5. Bhattacharyya, K., Smith, N.: Antecedents to the success of blockchain technology adoption
in manufacturing supply chains. In: 2018 2nd International Conference on Business and
Information Management, pp. 64–67 (2018)
300 S. Tian et al.

6. Alzahrani, N., Bulusu, N.: Block-supply chain: a new anti-counterfeiting supply chain using
NFC and blockchain. In: CRYBLOCK 2018-Part of MobiSys, pp. 30–35 (2018)
7. Naidu, V., Mudliar, K., Naik, A., Bhavathankar, P.: A fully observable supply chain man-
agement system using blockchain and IOT. In: 2018 3rd International Conference for
Convergence in Technology, pp. 84–89 (2018)
8. Li, X., L, Z.G.: Intelligent governance mechanism of supply chain based on blockchain
technology. China Circ. Econ. 31(11), 34–44 (2017)
9. Yang, H.Q., Sun, L., Z, X.C.: Building mutual trust and win-win supply chain information
platform based on blockchain technology. Sci. Technol. Prog. Policy 35(05), 21–31 (2018)
10. Zhang, X.Q.: Optimization of supply chain management model based on blockchain. China
Circ. Econ. 32(08), 42–50 (2018)
11. He, C., L, Y.F.: Research on a new supply chain model integrating blockchain. Manag.
Modernization 40(01), 84–87 (2020)
12. Hoberg, K., Thonemann, U.W.: Modeling and analyzing information delays in supply chains
using transfer functions. Int. J. Prod. Econ. 156, 132–145 (2014)
13. Nie, J.J.: The impact of retailer information sharing on the recycling mode of closed-loop
supply chain. J. Manag. Sci. 16(05), 69–82 (2013)
14. Shi, C.L., Nie, J.J.: The influence of network externalities on the information sharing of
dual-channel supply chain. Chin. Manag. Sci. 27(08), 142–150 (2019)
15. Zhang, Z.: From “information compensation” to “conforming translation”. Foreign Lang.
Stud. (01), 63–64 (1998)
16. Cao, M.L.: “Translation studies also need translation”——re-discussion on misreading and
mistranslation in the process of introducing western translation theories. Foreign Lang. Stud.
(03), 67–74 (2012)
17. Shang, W.G., Wu, H., Zhao, B.: Mobile user group mining algorithm with information
compensation. Comput. Eng. Appl. 46(11), 226–229 (2010)
18. Jin, Z.Y., Han, L.G., Hu, Y., Ge, Q.X., Zhang, P.: Data-driven Marchenko imaging based on
low-frequency information compensation. Chin. J. Geophys. 60(09), 3601–3615 (2017)
Multi-layer Intrusion Detection Method Using
Graph Neural Network

Ling Ma(B)

Department of Information Engineering, Heilongjiang International University, Harbin 150025, China
lingmaml@sina.com

Abstract. Aiming at the problem that many unknown intrusion behaviors are difficult to detect, which lowers the security of deep belief networks, a multi-layer intrusion detection method based on a graph neural network is proposed. First, an ELM classifier is designed to handle multi-layer intrusion data with inconsistent ranges and uneven distribution; second, a deep belief network intrusion detection model is constructed by combining monitoring, collection, preprocessing, classification, and response modules; finally, multi-dimensional state vectors are set and the model's detection results are partitioned based on the graph neural network, realizing the classification of multi-layer intrusion data and optimizing the results of deep belief network multi-layer intrusion detection. Experimental results show that the proposed method can identify all unknown multi-layer intrusion behaviors with a low false alarm rate and improve network security, and that the detection process is more targeted, optimizing the false alarm rate and accuracy of the detection results.

Keywords: Graph neural network · Multi-layer intrusion behavior · Detection method

1 Introduction

As the network has become a public space, many personal information leakage incidents have occurred in recent years, so deep intrusion detection of the network has become an important premise for the development of information technology at this stage [1, 2]. For the problem of network intrusion, the concept of intrusion detection was first proposed abroad in 1980: intrusion refers to all acts that attempt to destroy the privacy of network resources and the network security mechanism [3, 4]. Between 1980 and 1990, Denning et al. designed the intrusion detection expert system IDES. Domestically, network intrusion detection methods have been established based on machine learning and deep learning algorithms. Since the beginning of the 21st century, network coverage has gradually increased, and network vulnerability data have also increased year by year.
Literature [5] constructs a rotation forest algorithm based on SPCA, which increases ensemble diversity by introducing rotation and remedies the insufficient strength of the base classifier. Literature [6, 7] combines the number of hidden layers of the neural network with the number of neuron nodes in each layer to improve the weak adaptability of the detection algorithm, adopting a greedy layer-by-layer training strategy to establish a feature extraction model for detecting network intrusion data. Literature [8] discusses in detail the main progress and shortcomings of existing network intrusion detection research from three aspects: undirected unit graph recommendation, undirected binary graph recommendation, and undirected multivariate graph recommendation. It also defines the interpretability of GNN recommendation and other future research directions for GNN recommendation.

2 Multi-layer Intrusion Detection Method


2.1 Design ELM Classifier
The ELM classifier is a neural network supervised learning algorithm based on Moore-Penrose generalized inverse matrix theory, known in full as the extreme learning machine (ELM) algorithm. Before network training, the ELM classifier randomly generates the input weights between the input layer and the hidden layer, together with random hidden-layer biases. At the start of training, once the total number of hidden-layer nodes is given, the hidden-layer output matrix can be obtained. The training problem is thus transformed into a least-squares problem, and the output weights are calculated using the generalized inverse matrix [9, 10].
Because the ELM classifier only needs to calculate the output weights to obtain the unique optimal solution, the ELM classifier designed in this paper has good training speed and generalization ability.
Suppose the numbers of nodes in the input layer and output layer are m and n respectively, and the number of nodes in the hidden layer is k. The bias of the i-th hidden-layer neuron is denoted by $u_i$. The sample input X and the sample target output Y are then

$$X = \begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{bmatrix}_{N \times m}, \qquad Y = \begin{bmatrix} y_1^T \\ y_2^T \\ \vdots \\ y_N^T \end{bmatrix}_{N \times n} \tag{1}$$
Assuming that the actual output of the deep belief network is Z, we have

$$z_j = \sum_{i=1}^{k} \alpha_i\, g(\omega_i x_j + u_i) \tag{2}$$

The corresponding hidden-layer output matrix is then

$$A = \begin{bmatrix} g(\omega_1 x_1 + u_1) & g(\omega_2 x_1 + u_2) & \cdots & g(\omega_k x_1 + u_k) \\ g(\omega_1 x_2 + u_1) & g(\omega_2 x_2 + u_2) & \cdots & g(\omega_k x_2 + u_k) \\ \vdots & \vdots & \ddots & \vdots \\ g(\omega_1 x_N + u_1) & g(\omega_2 x_N + u_2) & \cdots & g(\omega_k x_N + u_k) \end{bmatrix}_{N \times k} \tag{3}$$
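As a concrete illustration of Eqs. (1)–(3), the following minimal NumPy sketch trains an ELM; the sigmoid activation and the function names are assumptions of this sketch, not prescribed by the paper.

```python
import numpy as np

def elm_train(X, Y, k, seed=0):
    """Train an ELM: random input weights and biases, least-squares output
    weights via the Moore-Penrose pseudo-inverse.

    X: (N, m) sample inputs; Y: (N, n) sample targets; k: hidden nodes.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    W = rng.standard_normal((m, k))          # input weights, never trained
    u = rng.standard_normal(k)               # random hidden-layer biases
    A = 1.0 / (1.0 + np.exp(-(X @ W + u)))   # hidden output matrix, Eq. (3)
    alpha = np.linalg.pinv(A) @ Y            # unique least-squares solution
    return W, u, alpha

def elm_predict(X, W, u, alpha):
    """Network output Z of Eq. (2) in matrix form."""
    A = 1.0 / (1.0 + np.exp(-(X @ W + u)))
    return A @ alpha
```

Because only alpha is solved for, training reduces to one pseudo-inverse, which is the source of ELM's speed.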
2.2 Building Intrusion-Detection Model

The optimized ELM classifier is used to identify the basic patterns of intrusion behavior, and a deep belief network intrusion detection model is constructed with the goal of synthesizing minority-class samples, as shown in Fig. 1.

Fig. 1. Intrusion detection model

In Fig. 1, the intrusion detection model is composed of a monitoring module, an acquisition module, a preprocessing module, a classification module, and a response module. The model's data input port is the monitoring unit, which captures intrusion behavior information from the deep belief network. The collection unit gathers the characteristics of intrusion behavior data, converts the collected information into a standard format, and transmits the standard data to the other intrusion detection units.
The volume of network behavior information is huge, the data dimension is high, and normal behavior accounts for a large proportion of the overall data. Through the preprocessing unit and with the help of the ELM classifier, behavior data can be identified and divided into five categories: the normal state, the denial-of-service attack state, the remote unauthorized access attack state, the privilege-escalation attack state, and the port-scanning state.
At the same time, the ranges of the data fields are inconsistent and their distribution is uneven, so the different state data are processed numerically: the transmission control protocol (TCP) takes the value 0, the user datagram protocol (UDP) takes the value 1, and the Internet control message protocol (ICMP) takes the value 2.
Numerical characteristic attributes are obtained after this numerical processing, but their value ranges still differ, so the numerical data are normalized to control the value range between 0 and 1. The normalization formula is

$$q' = \frac{q - \min}{\max - \min} \tag{4}$$

where q is the original value and q′ the normalized one.
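A small Python sketch of this preprocessing step, combining the protocol coding above with the normalization of Eq. (4); the column layout of the records is an assumption of the sketch.

```python
import numpy as np

PROTO_CODE = {"tcp": 0, "udp": 1, "icmp": 2}   # numeric protocol mapping

def normalize(features):
    """Min-max normalize each column of an (N, d) feature array to [0, 1],
    implementing Eq. (4) column-wise."""
    x = np.asarray(features, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)     # guard constant columns
    return (x - lo) / span
```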
To address the uneven distribution of the data sets, the ELM classifier is used to synthesize minority-class samples and improve class balance, completing the preprocessing and classification of the intrusion data. The response module gives feedback according to the data classification results, including active responses and passive responses.

2.3 Partition Model Detection Results Based on Graph Neural Network


According to the model response data, each node Pi is represented by its feature Fi and
associated with the label output by the model.

1. Assume that some labeled nodes are represented in the graph I(P, E); the unlabeled nodes are predicted from the labeled ones. Let the v-dimensional state vector of each node b be $H_b$; then

$$H_b = f(F_b, F_a, H_c, F_c) \tag{5}$$

2. Let $f(F_b, F_a, H_c, F_c)$ be the mapping function that maps the input node features into the v-dimensional vector space. According to the above formula, the v-dimensional state vectors of all nodes are obtained, describing the features of the connecting nodes and the interaction information with the other nodes.
3. The output signal of the interaction task Z of a single connection node is calculated as

$$Z = \begin{bmatrix} Z_1 & Z_2 & Z_3 & \cdots & Z_n \end{bmatrix}^T = \begin{bmatrix} Z_{11} & Z_{12} & Z_{13} & \cdots & Z_{1k} \\ Z_{21} & Z_{22} & Z_{23} & \cdots & Z_{2k} \\ Z_{31} & Z_{32} & Z_{33} & \cdots & Z_{3k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ Z_{n1} & Z_{n2} & Z_{n3} & \cdots & Z_{nk} \end{bmatrix} \tag{6}$$

According to the above calculation, the control output signal is related to the output signal of each functional section and to the synchronization difference between adjacent functional sections in the interaction process. Let the synchronization difference between adjacent functional sections be w, and assume

$$W_n = \{W_{n1}, W_{n2}, W_{n3}, \cdots, W_{nk}\} \tag{7}$$

4. Since formula (5) has a unique solution, the node states are obtained by iterating formula (5) until convergence.
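To make the iteration of Eq. (5) concrete, here is a minimal NumPy sketch that propagates v-dimensional node states over a graph; the linear-plus-tanh form of the mapping f and the small weight scale (used to keep the iteration stable) are illustrative assumptions.

```python
import numpy as np

def propagate_states(F, adj, v=16, iters=50, seed=0):
    """Iterate Eq. (5) toward a fixed point of the node states.

    F: (n, d) node feature matrix; adj: (n, n) 0/1 adjacency matrix.
    Returns H: (n, v) state vectors combining each node's own features
    with the states of its neighbors.
    """
    rng = np.random.default_rng(seed)
    n, d = F.shape
    Wf = 0.1 * rng.standard_normal((d, v))    # feature projection
    Wh = 0.1 * rng.standard_normal((v, v))    # neighbor-state mixing
    H = np.zeros((n, v))
    for _ in range(iters):
        H = np.tanh(F @ Wf + adj @ H @ Wh)    # one application of f
    return H
```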

3 Experimental Analysis and Results


3.1 Experimental Data and Environment
1. Data source: the KDD99 data set, compiled from DARPA's intrusion detection project, contains 493,957 records and is used as the basic condition of the experimental test.
2. Experimental environment: a simulation test environment is built. The operating systems are Windows 7 and Linux; the processor is an Intel Core i5-5200U CPU @ 2.2 GHz; the computer has 8 GB of memory and a 500 GB hard disk. The software environment is Python 3.5.3, and the deep learning tool is PyTorch 1.4.0.
3. Experiment standard: the evaluation criteria are set as follows, where T1 and T2 respectively denote the numbers of samples correctly judged as the normal and abnormal categories, and T3 and T4 respectively denote the number of normal samples wrongly judged as abnormal and the number of abnormal samples wrongly judged as normal.

The accuracy rate s1, detection rate s2, and false alarm rate s3 of a detection method are then evaluated as

$$s_1 = \frac{T_1 + T_2}{T_1 + T_2 + T_3 + T_4}, \qquad s_2 = \frac{T_2}{T_2 + T_4}, \qquad s_3 = \frac{T_3}{T_1 + T_3} \tag{8}$$
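The three criteria translate directly into code; a small Python helper (the function name is ours):

```python
def evaluate(t1, t2, t3, t4):
    """Accuracy s1, detection rate s2, and false alarm rate s3 from the
    counts T1..T4 defined above."""
    s1 = (t1 + t2) / (t1 + t2 + t3 + t4)   # correct judgments / all samples
    s2 = t2 / (t2 + t4)                    # detected abnormal / all abnormal
    s3 = t3 / (t1 + t3)                    # false alarms / all normal
    return s1, s2, s3
```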

3.2 Experimental Index

Three traditional detection methods (those of literature [3], literature [4], and literature [5]) were selected as control groups, and the proposed detection method formed the experimental group. The accuracy, detection rate, and false alarm rate of each group of methods were determined by the above formulas, and the differences between the groups were compared.

1. Accuracy rate

According to

$$L = \sum_{i=1}^{p} y_i \log(G(x_i)) \tag{9}$$

when the value of L is less than 0.4, the feedback intrusion data is classified as class A; when L exceeds 0.4, the intrusion data is classified as class B; the two classes correspond to different detection mechanisms. For the comparative test, the ratio of the number of relevant samples in each method's identification or detection results to the total number of samples in the results is obtained.

2. Classification effect

The classification data obtained from the above experimental tests are exported. The L value obtained by the experimental group according to formula (9) is 0.5745, a positive number; therefore, the first-quadrant classification data are taken as the classification-effect data sample, the classification-effect comparison test is carried out, and the comparison results are exported.

3.3 Results and Discussion

Taking the method of literature [3] as group A, the method of literature [4] as group B
and the method of literature [5] as group C, the test results of accuracy rate and iteration
times, classification effect and false positive rate were compared and analyzed.
1. Influence analysis of accuracy and iteration times.

Taking the proposed method as the experimental group, Fig. 2 shows the change of
accuracy rate of each test group under different iteration times.

Fig. 2. Effect of iteration times on accuracy

According to the test results, the accuracy of the experimental group's detection method stabilized at 91.37% from the 92nd iteration; in control group A, the accuracy settled at 91.02% from the 167th iteration; groups B and C failed to bring the accuracy into a reasonable range within 250 iterations, with large fluctuations. To obtain accurate experimental data, the number of iterations was expanded to 500, and the accuracy of group B was found to stabilize at 90.7% at the 307th iteration.
Although the iteration counts of the methods differ, their accuracies are relatively close, so they can be used as the same comparison test group.

2. Comparison of false positive rate test results

Statistical inspection found that the third-round test results of group A were empty, which investigation showed was caused by human operation error. After removing the third-round results of group A, the false alarm rates of the three control groups are all higher than that of the experimental group, which verifies that the method in this paper has high practical applicability. The results are shown in Table 1.
Table 1. Test results of false positive rate/%

Inspection rounds  The paper  Group A  Group B  Group C
Round 1 0.75 1.79 1.94 1.74
Round 2 0.84 1.68 2.53 1.15
Round 3 0.79 0.00 2.45 1.70
Round 4 0.88 1.36 2.48 1.74
Round 5 0.87 1.54 2.54 2.48
Round 6 0.86 1.27 2.50 2.45
Round 7 0.75 1.99 1.89 1.54
Round 8 0.84 1.74 2.32 1.27
Round 9 0.76 1.15 2.16 1.36
Round 10 0.85 1.70 1.97 1.97

4 Conclusion
This paper constructs a deep belief network intrusion detection model, partitions the model's detection results, realizes the classification of multi-layer intrusion data, and further optimizes the model's detection results. The ELM classifier, which randomly generates the input weights and hidden-layer neuron biases, overcomes the problems of easily falling into local optima and slow training speed and improves the overall classification effect of the detection method.
However, when the data volume of the deep belief network expands further, the accuracy decreases. Future work can therefore strengthen the analysis of the correlation between nodes to obtain more accurate classification of intrusion information.

References
1. Tang, L., Li, H.X., Yan, C.Q.: Overview of deep neural network structure search. Chin. J.
Image Graph. 26(2), 245–264 (2021)
2. Liu, F., Yang, A.Z., Wu, Z.W.: Adaptive siamese network based UAV target tracking algorithm.
Acta Aeronautica ET Astronautica Sinica 41(1), 248–260 (2020)
3. Yang, T., Ye, X.: Network intrusion detection algorithm SPCA erof. Comput. Eng. Des. 42(2),
356–362 (2021)
4. Song, Y., Hou, B.N., Cai, Z.P.: Network intrusion detection method based on deep learning
feature extraction. J. Huazhong Univ. Sci. Technol. (Nat. Sci. Ed.) 49(2), 115–120 (2021)
5. Wu, G.D., Cha, Z.K., Tu, L.J.: Research progress of neural network recommendation. J. Intell.
Syst. 15(1), 14–24 (2020)
6. Ren, J.D., Liu, X.Q., Wang, Q.: An multi-level intrusion detection method based on KNN
outlier detection and random forests. J. Comput. Res. Dev. 56(3), 566–575 (2019)
7. Zhong, M., Zhou, Y., Chen, G.: Sequential model based intrusion detection system for IoT
servers using deep learning methods. Sensors 21(4), 1113–1116 (2021)
8. Yan, X.L., Wang, B.B.: Fault diagnosis of analog circuits based on elm optimized by an
adaptive wolf pack algorithm. Comput. Eng. Sci. 41(2), 246–252 (2019)
308 L. Ma

9. Wang, X.L., Xie, H.Y., Wang, J.J.: Prediction of dam deformation based on bootstrap and
ICS-MKELM algorithms. J. Hydroelectric Eng. 39(3), 106–120 (2020)
10. Zhang, X.M., Cao, G.Q., Chen, Z.Q.: Landslide displacement prediction based on AdaBoost-
PSO-ELM algorithm. Appl. Electron. Tech. 56(3), 566–575 (2021)
Research on an Image Hiding Algorithm
with Symmetric Means

Hongxin Wang1,2 , Hui Li1,2(B) , and Xue Tan1


1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin 150028, China

Abstract. After a comprehensive study of existing image hiding algorithms, this paper proposes a new image hiding algorithm, the mean symmetric algorithm, with lower distortion and higher security. The algorithm calculates the mean value and variance of each secret image block, compresses the secret image block, scrambles it by the symmetry method, and hides the secret image in the carrier image by convex combination, turning the carrier into a camouflage image. Analysis of the experimental results shows that the method is simple to implement and greatly improves the security of the image hiding system as well as the authenticity and robustness of the hidden image.

Keywords: Image hiding · Image block · Security · Symmetric means

1 Introduction
With the rapid development of network technology, multimedia technology, and communication technology in the information age, information hiding technologies represented by steganography, digital watermarking, visual cryptography, subliminal channels, and concealment protocols have been widely used in information security fields such as the protection of digital intellectual property rights, multimedia information authentication, data integrity verification, non-repudiation confirmation, and covert transmission. Within information hiding technology, the image as a carrier for the concealed transmission of information has become a popular secure communication technique.
In recent years, various image information hiding technologies and methods have been proposed. To recover the embedded confidential image completely, Ran-Zan Wang et al. proposed an image hiding method based on optimal LSB substitution and a genetic algorithm in 2001 [1]. Y.C. Hu proposed an image hiding scheme based on vector quantization (VQ) in 2003; however, the secret image embedded by this method cannot be completely extracted [2]. To improve the quality of the camouflage image, Suk-Ling et al. [5] proposed a simple LSB substitution method with an optimal pixel adjustment program. Chengyun Yang et al. [4] proposed a lossless information hiding method based on wavelet-domain spread spectrum in 2004; this method embeds data into the high-frequency subband coefficients of the image after an integer wavelet transform according

to the principle of spread spectrum, embedding pseudo-information to solve the problem that, when extracting information, it is impossible to determine which coefficients carry the watermark [3]. Yizhan et al. [4] proposed a large-capacity lossless data image hiding method based on wavelets: exploiting the different sensitivity of human vision to each frequency subband in the wavelet domain, large-capacity data is embedded in the lower bit planes of the wavelet coefficients, which keeps the visual distortion of the image small after embedding while allowing both the embedded data and the carrier image to be restored losslessly [4]. Rongfu Zhang et al. [6] proposed a time-domain error concealment algorithm based on the mean and variance of inter-frame matching: the algorithm first performs inter-frame matching on several rows or columns of pixels adjacent to the error block's upper, lower, left, and right sides, then discriminates the motion status of the image in the error block according to the mean and variance of the matching results, and selects an appropriate concealment method for different motion conditions [6]. Hongmei Wang et al. proposed a triangle algorithm for image hiding in 2010, which cleverly used the relationships of right triangles to hide the secret image and achieved good results [7]. In 2012, Hongxin Wang proposed a complex number algorithm for image hiding, which improved the security of the image hiding system as well as its authenticity and robustness [8].
This paper proposes a new method of image hiding. By exploiting the relationship between the cover-image block Ck and the secret-image block Sk, a high-quality camouflage (stego) image block C′k can be obtained. First, the mean and variance of the secret image block are calculated and the block is compressed; the compressed block is then scrambled by the symmetry method and hidden in the carrier block Ck by convex combination, making it a camouflage block C′k, after which the cover image blocks are rounded and repaired; finally, a simple LSB substitution method embeds the secret information into the cover image. The experimental results show that this scheme effectively improves the quality of the camouflage image, and the secret image can be completely restored.

2 Design of Algorithm
2.1 Overview of Method

The image discussed in this article is an 8-bit standard gray-scale image. Let C be a cover image of Mc × Nc pixels and S a secret image of Ms × Ns pixels. Both the cover image C and the secret image S are divided into blocks of I × J pixels, so that C and S can be expressed as

$$C = \left\{ C_k \,\middle|\, 0 \le k \le \frac{M_c \times N_c}{I \times J} \right\}, \qquad S = \left\{ S_k \,\middle|\, 0 \le k \le \frac{M_s \times N_s}{I \times J} \right\} \tag{1}$$

Let C′k be the camouflage image block after embedding the secret image. The specific process is shown in Fig. 1 and Fig. 2:
Fig. 1. Image embedding    Fig. 2. Image extraction

2.2 Algorithm Introduction

Mean Compression. Let Ck = (Cij)m×n be the cover image block and Sk = (Sij)m×n the secret image block (i = 1, 2, …, m; j = 1, 2, …, n).

Let the mean be $E(S_k) = \frac{1}{N}\sum S_{ij}$, and let $\Delta_{ij} = S_{ij} - E(S_k)$.
If Δij ≥ 0, then S′ij = Sij / Max(Sij);
if Δij < 0, then S′ij = Min(Sij) / Sij.
The secret image block is thus compressed into a new block S′k = (S′ij)m×n (i = 1, 2, …, m; j = 1, 2, …, n).
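A minimal NumPy sketch of this compression rule, assuming strictly positive pixel values; the function name and the returned side information (mean, maximum, minimum) are ours.

```python
import numpy as np

def mean_compress(S):
    """Compress a secret block: values at or above the block mean map to
    S/Max(S); values below map to Min(S)/S, so every entry lands in (0, 1]."""
    S = S.astype(float)
    mu, hi, lo = S.mean(), S.max(), S.min()
    out = np.where(S - mu >= 0, S / hi, lo / S)
    return out, (mu, hi, lo)   # side info needed later for restoration
```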

Symmetry.
(1) x-axis symmetry.
The image block S′k = (S′ij)m×n (i = 1, 2, …, m; j = 1, 2, …, n) is expressed in a rectangular coordinate system as follows:
➀ If m is even, let a = m/2, with subscript i1 in {1, …, a}. The symmetric point i2 of subscript i1 satisfies a − i1 = i2 − (a + 1), giving i2 = 2a − i1 + 1.
The pixels S′i1j1 ↔ S′i2j1 can therefore be swapped (i1 = 1, 2, …, a; i2 = 2a − i1 + 1; j1 = 1, 2, …, n) (Fig. 3).
➁ If m is odd, let a = (m + 1)/2, with subscript i1 in {1, …, a}. The symmetric point i2 of subscript i1 satisfies a − i1 = i2 − a, giving i2 = 2a − i1.
The pixels S′i1j1 ↔ S′i2j1 can therefore be swapped (i1 = 1, 2, …, a; i2 = 2a − i1; j1 = 1, 2, …, n).
➂ The subscripts i2 = m, m − 1, …, a correspond to p = 1, 2, …, a respectively, and the subscripts i1 = a − 1, a − 2, …, 2, 1 correspond to p = a + 1, a + 2, …, m respectively. Then:
S′pj1 = S′i2j1 (p = 1, 2, …, a; i2 = m, m − 1, …, a; j1 = 1, 2, …, n);

Fig. 3. Symmetry points of image element subscript (i, j)

S′pj1 = S′i1j1 (p = a + 1, a + 2, …, m; i1 = a − 1, a − 2, …, 2, 1; j1 = 1, 2, …, n).
This yields the pixel block (S′pj1) (p = 1, 2, …, m; j1 = 1, 2, …, n).
(2) y-axis symmetry.
➀ If n is even, let b = n/2, with subscript j1 in {1, …, b}. The symmetric point j2 of subscript j1 satisfies b − j1 = j2 − (b + 1), giving j2 = 2b − j1 + 1.
The pixels S′pj1 ↔ S′pj2 can therefore be swapped (p = 1, 2, …, m; j1 = 1, 2, …, b; j2 = 2b − j1 + 1).
➁ If n is odd, let b = (n + 1)/2, with subscript j1 in {1, …, b}. The symmetric point j2 of subscript j1 satisfies b − j1 = j2 − b, giving j2 = 2b − j1.
The pixels S′pj1 ↔ S′pj2 can therefore be swapped (p = 1, 2, …, m; j1 = 1, 2, …, b; j2 = 2b − j1).
➂ The subscripts j2 = n, n − 1, …, b correspond to q = 1, 2, …, b respectively, and the subscripts j1 = b − 1, b − 2, …, 2, 1 correspond to q = b + 1, b + 2, …, n respectively.
This yields the pixel block (S′pq) = (S′pj2) (p = 1, 2, …, m; q = 1, 2, …, n).
Let SS′k = (S′pq) (p = 1, 2, …, m; q = 1, 2, …, n). A new scrambled secret image block SS′k is thus obtained by the symmetry method.
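Reading the two symmetric replacements as full row and column reversals, they reduce to array flips; a one-function NumPy sketch under that reading:

```python
import numpy as np

def symmetric_scramble(block):
    """Apply the x-axis and y-axis symmetric replacements as row and column
    reversals; the combined flip is a 180-degree rotation and is its own
    inverse, so the same call also unscrambles."""
    return np.flipud(np.fliplr(block))
```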

Convex Combination. A convex combination is used to construct the hidden image block C′k = α·Ck + β·SS′k (where α + β = 1).

Mean Symmetric Algorithm. The steps of the algorithm are as follows:

(1) Divide the cover image C and the secret image S into blocks C1, C2, …, CN and S1, S2, …, SN.
(2) Calculate the mean value E(Sk) of each secret image block.
(3) Compress the secret image block Sk into a new secret image block S′k, namely:

if Δij ≥ 0, then S′ij = Sij / Max(Sij);
if Δij < 0, then S′ij = Min(Sij) / Sij.

A compressed secret image block S′k = (S′ij) is obtained (i = 1, 2, …, m; j = 1, 2, …, n).

(4) Symmetry: swap the pixels S′i1j1 ↔ S′i2j1 and then S′pj1 ↔ S′pj2. The new pixels are denoted SSij (i = 1, 2, …, m; j = 1, 2, …, n), giving the scrambled secret image block SS′k = (SSij)m×n.
(5) Let

C′k = α·Ck + β·SS′k,

where α + β = 1 (one can choose α = 0.999, β = 0.001). Then C′k is a new hidden image block carrying the secret image information.
(6) Combine the image blocks to obtain the hidden image C′ = ∪ C′k.
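Putting the steps together per block, a sketch that reuses mean_compress and symmetric_scramble from the sketches above; transmitting the side information to the extractor alongside the stego image is an assumption of this sketch.

```python
def embed_block(C_blk, S_blk, alpha=0.999):
    """Hide one secret block in one cover block (steps (2)-(5) above)."""
    S_comp, side = mean_compress(S_blk)        # mean compression
    SS = symmetric_scramble(S_comp)            # symmetric scrambling
    beta = 1.0 - alpha
    return alpha * C_blk + beta * SS, side     # convex combination
```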

Mean Symmetric Inverse Algorithm

(1) From C′k = α·Ck + β·SS′k, solve SS′k = (C′k − α·Ck)/β.
(2) Recover S′k from SS′k by the inverse symmetric swaps.
(3) When Δij ≥ 0, Sij = S′ij · Max(Sij); when Δij < 0, Sij = Min(Sij)/S′ij. This gives Sk = (Sij).
(4) Restore the image S = ∪ Sk.
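Correspondingly, a per-block extraction sketch; the branch test that decides which compression rule produced each value is our reconstruction of step (3).

```python
import numpy as np

def extract_block(C_stego, C_blk, side, alpha=0.999):
    """Invert embed_block given the original cover block and side info."""
    mu, hi, lo = side
    beta = 1.0 - alpha
    SS = (C_stego - alpha * C_blk) / beta      # step (1)
    S_comp = symmetric_scramble(SS)            # step (2): flips self-invert
    # step (3): values from the >= mean branch satisfy S_comp * hi >= mu
    return np.where(S_comp * hi >= mu, S_comp * hi, lo / S_comp)
```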

3 Numerical Examples

An example illustrates the process of image hiding and restoration. The 256 × 256-pixel Airplane image is the cover image and the 128 × 128-pixel Lena image is the secret image; both are divided into 8 × 8-pixel blocks. The cover image is thus divided into 256 × 256/(8 × 8) = 1024 blocks and the secret image into 128 × 128/(8 × 8) = 256 blocks.

3.1 Instance Data

An 8 × 8 block of the cover image and the secret image is shown, respectively:
$$C_k = \begin{pmatrix}
127 & 176 & 176 & 176 & 177 & 191 & 177 & 191\\
127 & 176 & 176 & 176 & 176 & 177 & 176 & 176\\
127 & 176 & 177 & 177 & 176 & 176 & 177 & 177\\
127 & 159 & 158 & 159 & 159 & 158 & 159 & 159\\
132 & 159 & 159 & 151 & 152 & 152 & 149 & 152\\
127 & 176 & 158 & 158 & 159 & 151 & 152 & 151\\
121 & 177 & 176 & 159 & 176 & 177 & 159 & 159\\
118 & 158 & 159 & 177 & 177 & 176 & 176 & 176
\end{pmatrix}$$

$$S_k = \begin{pmatrix}
133 & 133 & 133 & 132 & 132 & 133 & 130 & 132\\
133 & 133 & 131 & 133 & 133 & 133 & 133 & 133\\
133 & 133 & 133 & 131 & 133 & 133 & 133 & 133\\
133 & 133 & 133 & 132 & 133 & 133 & 133 & 133\\
133 & 131 & 133 & 133 & 133 & 133 & 133 & 133\\
131 & 133 & 133 & 133 & 133 & 133 & 133 & 133\\
133 & 133 & 133 & 133 & 131 & 133 & 133 & 133\\
133 & 133 & 133 & 133 & 133 & 132 & 133 & 133
\end{pmatrix}$$

3.2 Mean and Deviation Calculation



The mean value $E(S_k) = \frac{1}{N}\sum S_{ij} = 8495/64 = 132.734$, the maximum value Max(Sij) = 133, and the minimum value Min(Sij) = 130.
Note: $\Delta_k = (\Delta_{ij})$, where $\Delta_{ij} = S_{ij} - E(S_k)$; the deviation block Δk is as follows:
$$\Delta_k = \begin{pmatrix}
0.266 & 0.266 & 0.266 & -0.734 & 0.266 & 0.266 & -2.734 & -0.734\\
0.266 & 0.266 & -1.734 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266\\
0.266 & 0.266 & 0.266 & -1.734 & 0.266 & 0.266 & 0.266 & 0.266\\
0.266 & 0.266 & 0.266 & -0.734 & 0.266 & 0.266 & 0.266 & 0.266\\
0.266 & -1.734 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266\\
-1.734 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266\\
0.266 & 0.266 & 0.266 & 0.266 & -1.734 & 0.266 & 0.266 & 0.266\\
0.266 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266 & 0.266
\end{pmatrix}$$

3.3 Mean Compression

According to the sign of $\Delta_{ij} = S_{ij} - E(S_k)$ in Δk:

When Δij ≥ 0, $S'_{ij} = S_{ij}/\mathrm{Max}(S_{ij}) = 1$ (Sij = 133, Max(Sij) = 133).
When Δij < 0:
$S'_{ij} = \mathrm{Min}(S_{ij})/S_{ij} = 1$ (Sij = 130, Min(Sij) = 130);
$S'_{ij} = \mathrm{Min}(S_{ij})/S_{ij} = 0.992$ (Sij = 131, Min(Sij) = 130);
$S'_{ij} = \mathrm{Min}(S_{ij})/S_{ij} = 0.985$ (Sij = 132, Min(Sij) = 130).

The new secret image block S′k is obtained as follows:


$$S'_k = \begin{pmatrix}
1 & 1 & 1 & 0.985 & 1 & 1 & 1 & 0.985\\
1 & 1 & 0.992 & 1 & 1 & 1 & 1 & 1\\
1 & 1 & 1 & 0.985 & 1 & 1 & 1 & 1\\
1 & 1 & 1 & 0.985 & 1 & 1 & 1 & 1\\
1 & 0.992 & 1 & 1 & 1 & 1 & 1 & 1\\
0.992 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 0.992 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 1 & 0.985 & 1 & 1
\end{pmatrix}$$

3.4 Symmetrical Replacement


The secret image SSk obtained by symmetrical replacement is as follows:
(1) The image block obtained after horizontal symmetrical replacement is as follows:
$$\begin{pmatrix}
0.985 & 1 & 1 & 1 & 0.985 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 1 & 0.992 & 1 & 1\\
1 & 1 & 1 & 1 & 0.985 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 0.985 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 1 & 1 & 0.992 & 1\\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 0.992\\
1 & 1 & 1 & 0.992 & 1 & 1 & 1 & 1\\
1 & 1 & 0.985 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}$$
(2) The image block obtained after vertical symmetrical replacement is as follows:
$$SS_k = \begin{pmatrix}
1 & 1 & 0.985 & 1 & 1 & 1 & 1 & 1\\
1 & 1 & 1 & 0.992 & 1 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 0.992\\
1 & 1 & 1 & 1 & 1 & 1 & 0.992 & 1\\
1 & 1 & 1 & 1 & 0.985 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 0.985 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 1 & 0.992 & 1 & 1\\
0.985 & 1 & 1 & 1 & 0.985 & 1 & 1 & 1
\end{pmatrix}$$

3.5 Convex Combination


Construct the hidden image $C'_k = \alpha \cdot C_k + \beta \cdot SS_k$ (where α + β = 1). Choose α = 0.999 and β = 0.001. Then the hidden image C′k is as follows:
$$C'_k = \begin{pmatrix}
126.874 & 175.825 & 175.824985 & 175.825 & 176.824 & 190.810 & 176.824 & 190.810\\
126.874 & 175.825 & 175.825 & 175.824992 & 175.825 & 176.824 & 175.825 & 175.825\\
126.874 & 175.825 & 176.824 & 176.824 & 175.825 & 175.825 & 176.824 & 176.823992\\
126.874 & 158.842 & 157.843 & 158.842 & 158.842 & 157.843 & 158.841992 & 158.842\\
131.869 & 158.842 & 158.842 & 150.850 & 151.848985 & 151.849 & 148.852 & 151.849\\
126.874 & 175.825 & 175.825 & 157.843 & 158.841985 & 150.850 & 151.849 & 150.850\\
126.874 & 176.824 & 175.825 & 158.842 & 175.825 & 176.823992 & 158.842 & 158.842\\
117.882985 & 157.843 & 158.842 & 176.824 & 176.823985 & 175.825 & 175.825 & 175.825
\end{pmatrix}$$

3.6 Hypothesis Test


We make a hypothesis test on the new hidden image Ck. The new hidden image Ck
obeys the normal distribution N( μ,σ 2 ).
The mean value of the new hidden image Ck is μ = E( Ck ) = 162.3191, and the
estimated mean square error is S = 17.2613. The mean value of the hidden image Ck is
μ0 = E(Ck ) = 162.7656.

Let H0: μ = μ0 (accepted); H1: μ ≠ μ0 (rejected). Set the significance level α = 0.005.
The statistic $t = \frac{\mu - \mu_0}{S/\sqrt{n}}$ obeys the t distribution, and the rejection domain of the test problem is $|t| = \left|\frac{\mu - \mu_0}{S/\sqrt{n}}\right| \geq t_\alpha(n-1)$, with

$$t = \frac{\mu - \mu_0}{S/\sqrt{n}} = \frac{162.3191 - 162.7656}{17.2613/\sqrt{64}} = -3.568.$$

For the significance level α = 0.005 and sample size n = 64, the look-up table gives $t_\alpha(n-1) = t_{0.005}(64-1) = t_{0.005}(63) = 2.656$. The result is shown in Fig. 4. Because |t| = 3.568 > 2.656 = t0.005(63), H0 is rejected and the alternative hypothesis H1: μ ≠ μ0 holds.

Fig. 4. Rejection domain of hypothesis testing

3.7 Image Comparison

The mean value of the cover image block Ck is μ0 = E(Ck) = 162.7656, the variance σ² = 300.849, and the mean square error σ = 17.345. The mean value of the new hidden image block C′k is μ = E(C′k) = 162.3191, the variance S² = 297.953, and the mean square error S = 17.2613. After calculation:

The mean error Δμ = μ0 − μ = 162.7656 − 162.3191 = 0.4465.
The relative error of the mean is Δμ/μ = 0.4465/162.3191 = 0.00275.
The mean square error difference Δσ = σ − S = 17.345 − 17.2613 = 0.0837.
The relative error of the mean square error is Δσ/σ = 0.0837/17.345 = 0.0048.

From the above analysis it can be seen that:

(1) The difference between the new hidden image block C′k and the cover block Ck is less than 1 pixel value.
(2) From Δμ/μ = 0.00275 = 0.275%, the new hidden image block C′k contains 99.725% of the information of the cover image block Ck and 0.275% of the information of the secret image block Sk. Therefore, the new hidden image block C′k has a good hiding effect.

Fig. 5. Cover image with 256 × 256 pixels

Fig. 6. Secret image: 128 × 128 pixels Lena

Fig. 7. 256 × 256 pixel cover image after hiding the secret image

4 Experimental Results
The quality of the camouflage image is evaluated by the PSNR value defined in Eq. (1). The experimental results are obtained with the 256 × 256 pixel Airplane as the cover image and the 128 × 128 pixel Lena as the secret image (Figs. 5, 6 and 7).

$$PSNR = 10 \times \log_{10}\frac{MAX_I^2}{MSE} \tag{1}$$

where $MSE = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}(I(i,j) - K(i,j))^2$, MAXI represents the maximum grayscale value of the image, I(i, j) represents the original pixel value of the image, and K(i, j) represents the pixel value after embedding the information.
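As a quick reference, Eq. (1) can be computed in a few lines of NumPy; the helper name psnr is ours:

```python
import numpy as np

def psnr(original, embedded, max_i=255.0):
    """PSNR of Eq. (1) between a cover image and its stego version."""
    mse = np.mean((original.astype(np.float64) - embedded.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_i ** 2 / mse)
```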

The result of our calculation is PSNR = 45.01, which is much higher than the PSNR values in Li et al. [5], Zhang et al. [6], and Wang et al. [7]; the quality of the camouflage image is thus improved, and the difference between the cover image and the camouflage image cannot be perceived visually.

5 Conclusions
The algorithm presented in this paper uses the image as the carrier and creates a new image information hiding algorithm (the mean symmetric algorithm) to realize information hiding. Less information is embedded in the image, which effectively saves information space and improves transmission efficiency. The impact on the secret image is reduced as much as possible, and the quality of the disguised image is improved. The difficulty of cracking the algorithm is increased, and the security of information hiding is improved.

Acknowledgments. This work is supported by the Natural Science Foundation of China's Heilongjiang Province (No. YQ2020G002) and the University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (No. UNPYSCT-2020212).

References
1. Wang, R.Z., Lin, C.F., Lin, J.C.: Image hiding by optimal LSB substitution and genetic
algorithm. Pattern Recogn. 34(3), 671–683 (2001)
2. Hu, Y.C.: Grey-level image hiding scheme based on vector quantization. IEEE Electron. Lett.
39(2), 202–203 (2003)
3. Yang, C.-Y., Guo, E.X., Zheng, Y.: Reversible data hiding based on wavelet spread-spectrum.
Comput. Eng. Appl. 40(14) (2004)
4. Zheng, Y., Yang, C., Xuan, G.: Wavelet-based high-capacity lossless image hiding and its
application. Comput. Eng. 31(1), 56–59 (2005)
5. Li, S.-L., Leung, K.C., Cheng, L.M., Chan, C.-K.: A novel image-hiding scheme based on
block difference. Pattern Recogn. 39(6), 1168–1176 (2006)
6. Zhang, R., Ma, L., Zhang, J.: Images concealment based on interframe matching mean and
variance. J. Image Graph. 15(11), 1578–1582 (2010)
7. Wang, H., Qu, L.: A triangular algorithm of image-hiding. In: The 8th World Congress on Intelligent Control and Automation, Jinan, Shandong, China (2010)
8. Wang, H.: A complex number algorithm of image-hiding. In: The 10th World Congress on
Intelligent Control and Automation, Beijing, China (2012)
9. Zhang, X.: Reversible data hiding in encrypted images. IEEE Signal Process. Lett. 18(4),
255–258 (2011)
10. Ali, B.M., Babu, T.R.: A Novel image hiding algorithm using optimal band selection. Int. J.
Innov. Technol. Explor. Eng. 8(5), 1155–1159 (2019)
11. Li, X., Kim, S.-T., Lee, I.-K.: Robustness enhancement for image hiding algorithm in cellular
automata domain. Optics Commun. 356, 186–194 (2015)
12. Pang, H.W.: The technology of digital image hiding based on discrete wavelet transform.
Appl. Mech. Mater. 3822, 718–721 (2015)
13. Yan, Y.: Image information hiding algorithm research of network sensor based on visual
characteristic. Sensors Transd. 160(12), 315–322 (2013)
14. Feng, D., Chen, L.: An improved image hiding algorithm of singular value parity interval
quantization. Sci. Technol. Eng. 19(26), 283–287 (2019)
Image Analysis and Processing
A Detection Network United Local Feature
Points and Components for Fine-Grained Image
Classification

Yong Du1,2 , Bin Yu3 , Peng Hong4 , Wei Pan5 , Yang Wang2 , and Yu Wang6,7(B)
1 Tianjin University, Tianjin 300072, People’s Republic of China
2 Northeast Agricultural University, Harbin 150001, People’s Republic of China
3 Harbin Institute of Technology, Harbin 150001, People’s Republic of China
4 Lanzhou Jiaotong University, Lanzhou 730000, People’s Republic of China
5 Harbin Huade University, Harbin 150025, People’s Republic of China
6 Tianjin University, Tianjin 300072, People’s Republic of China
7 Northeast Agricultural University, Harbin 150001, People’s Republic of China

Abstract. In recent years, fine-grained image classification has become a new research field in computer vision due to the significant intra-class differences and minor inter-class differences characteristic of fine-grained classification tasks. Traditional image classification algorithms still struggle to obtain good classification results despite relying on manual annotation. Since subcategories can only be distinguished by minor local differences, accurate detection of local details is the key to improving fine-grained classification accuracy. Therefore, this paper proposes a joint detection network model of local feature points and components for fine-grained image classification to effectively predict and extract local feature positions. Experiments verify the effectiveness of the proposed method for fine-grained classification tasks on the public dataset CalTech-UCSD Birds (CUB-200-2011).

Keywords: Landmark and parts detection · Spatial transformer networks · Joint


detection model · Base model modification · Image fine-grained classification

1 Introduction
Image classification is a fundamental research problem in computer vision, which has attracted much attention in recent years. With the rapid development of deep learning technology, this fundamental problem has largely been solved. Nevertheless, with the increasing demand for retrieval and classification, fine-grained image classification is gradually replacing general image classification, with broad application prospects such as the classification of dishes, clothing, and animal subcategories [1, 2, 3]. Fine-grained image classification is a subdivision into subcategories on top of the classification of basic categories, which is more complex due to the natural similarity of subcategories. The simplest solution is to train a general image classification model, but its classification performance is relatively poor and cannot handle practical situations. The disadvantages are as follows:


1) The feature descriptions are still too weak to be discriminative.
2) Traditional fine-grained classification algorithms do not pay enough attention to local information, which is the key to fine-grained classification performance. Therefore, designing a model that accurately locates or aligns local regions can significantly improve classification accuracy.
3) Many algorithms rely heavily on manual annotation information for local localization, which prevents them from generalizing to practical applications.

Algorithms based on deep learning technology have gradually shown great potential in fine-grained classification [4–6]. Fine-grained image classification is very flexible in manual annotation, component information, component location prediction, and second-order information, and experts and scholars have proposed various solutions. They can generally be divided into three categories: i) integrated network methods, ii) methods based on the visual attention mechanism, and iii) feature representation methods based on high-order information.
The integrated network method uses one or more neural networks to learn different feature representations of images to achieve better classification performance [7]. The method based on the visual attention mechanism has been widely used in fine-grained image classification; it does not process the image information directly but completes the classification by simulating the human ability to focus on and discriminate local areas [8, 9]. The feature representation method based on high-order information differs from the low-order description of standard manually constructed features by constructing high-order details to promote fine-grained classification [10, 11]. In fine-grained classification, the differences between categories are mainly reflected in minor local differences; therefore, improving part detectors with fine-grained recognition ability is essential.
From traditional manual feature extraction to automatic deep feature extraction [12, 13], and then to integration [14], attention mechanisms, second-order information statistics [15], and other methods [16], the accuracy of fine-grained image classification keeps increasing. However, the localization of fine local feature points and local parts is still important and worth studying. Therefore, this paper proposes a joint detection network model of local feature points and components from this point of view. Experiments on the CUB-200-2011 dataset verify the local feature detection performance of the improved model, in which the different feature points and components achieve optimal detection accuracy.
The remaining sections are arranged as follows. Section 2 describes the database
and pre-processing. Section 3 describes the network structure proposed in this paper.
Section 4 is the experiment and analysis. Finally, conclusions and future work are given
in Sect. 5.

2 Database and Feature Engineering


By applying feature engineering technology and modifying the structure of the gen-
eral image classification model, the accuracy of large-scale general image classification

is improved and even exceeds the accuracy of human recognition. This section will
introduce a fine-grained image database and the pre-processing of the data.

2.1 Public Fine-Grained Classification Data Set


The experimental dataset Caltech-UCSD Birds (CUB-200-2011) contains 11,788 images from 200 different species of birds and is the most widely used database for fine-grained image classification tasks. Each image contains a subcategory tag, a bounding box for the main object, 15 body-part locations, 312 binary attributes, and detailed manual annotations. 5994 and 5794 images were selected as training and test sets, respectively; for each subcategory, about 30 images are used for training and the rest for testing. All attributes relate to the color, pattern, or shape of a particular part. In addition to subcategory tags, the proposed algorithm uses, in the training phase, five local feature points and two body parts formed from the 15 body-structure positions. The names of the 15 bird body parts are shown in Table 1.

Table 1. Names of 15 manually annotated parts in the CUB-200-2011 database.

1 Back | 2 Beak | 3 Belly | 4 Breast | 5 Crown | 6 Forehead | 7 Left eye | 8 Left leg
9 Left wing | 10 Nape | 11 Right eye | 12 Right leg | 13 Right wing | 14 Tail | 15 Throat

As shown in Table 1, some body parts are close to each other, such as the left and right eyes, so detecting all 15 feature points is unnecessary. At the same time, a feature point may not exist in every image because of varying bird poses and occlusion in some images. If a feature point that does not exist in an image needs to be detected, it is replaced with an adjacent feature point. In this paper, the 15 body parts are grouped into five local feature points according to their relative positions, namely head, wing, chest, leg, and tail. The specific correspondence is shown in Table 2.

Table 2. Five local feature points correspond to bird body parts.

Region | Part numbers
Head | 2, 5, 6, 7, 10, 11, 15
Wing | 1, 9, 13
Chest | 3, 4, 10, 15
Leg | 8, 12
Tail | 14

In this paper, the bird’s body is divided into the head and the torso. For all points
on the head and torso area, take the smallest rectangle that surrounds these points. In

this way, each image can be divided into three parts: head, torso, and background. The correspondence between the two categories of body parts and the 15 bird body parts is shown in Table 3.

Table 3. The serial numbers of the bird body parts corresponding to the two types of parts.

Component region | Part numbers
Head | 2, 5, 6, 7, 10, 11, 15
Body | 1, 3, 4, 8, 9, 10, 12, 13, 14, 15

2.2 Data Preprocessing


This paper uses the same test set as CUB-200-2011 to compare with other fine-grained image classification algorithms. To ensure the effect of network training and prevent overfitting, all pictures in the training set are flipped left and right to augment the data. After data augmentation, the amount of training data doubles to 11,988 images. The input image size for training and testing is the same, either 448 × 448 or 224 × 224. At the same time, zero-mean and variance normalization is applied to the input images so that pixel values center near 0, which makes the network easier to train.
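A minimal torchvision sketch of this preprocessing (not the authors' code); the paper flips the whole training set offline, for which a random flip is a common on-the-fly equivalent, and the normalization statistics below are illustrative placeholders:

```python
import torchvision.transforms as T

# Horizontal flipping augments the training data; zero-mean / unit-variance
# normalization centers pixel values near 0, as described above.
train_transform = T.Compose([
    T.Resize((448, 448)),                 # or (224, 224)
    T.RandomHorizontalFlip(p=0.5),
    T.ToTensor(),
    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),  # illustrative stats
])
```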

3 Joint Detection Network of Feature Points and Components


3.1 Basic Structure of Local Feature Points Detection Branch
It is well known that the various VGGNet structures end with three fully connected layers, of which the last is mainly used for classification, and the number of parameters in the first two fully connected layers far exceeds the sum of all other layers [17]. Therefore, we choose VGG-16 as the base model and simplify its fully connected structure to reduce network complexity and improve operating efficiency. All layers before the fully connected layers are kept unchanged, and the fully connected layers are rebuilt. The structural changes of the fully connected layers are shown in Fig. 1.

Fig. 1. Schematic diagram of VGG-16 full connection layer structure change.



3.2 Basic Structure of Part Detection Branch


In the local feature point prediction branch, the input is the image and the output is the coordinates (x, y) of the feature points of the five bird body parts predicted for each image, so the number of output parameters is 10.
To realize component detection, this paper draws on the idea of Spatial Transformer Networks (STN), first proposed by Jaderberg [8]. As shown in Fig. 2, the STN can easily be used as a branch of the main network. The whole network consists of three parts: a localization network, a grid generator, and a sampling network.

Fig. 2. Network structure of spatial transformation network STN.

(1) Localization network

The localization network generates the transformation parameter θ, namely a 2 × 3 transformation matrix, through a sub-network composed mainly of fully connected or convolutional layers.
(2) Parameterized sampling grid

The sampling grid maps the output coordinates to the input coordinates. Assuming the coordinates of each pixel of the input U are $(x_i^s, y_i^s)$, the coordinates of each pixel of the output V are $(x_i^t, y_i^t)$, and the spatial transformation is an affine transformation, the correspondence between $(x_i^s, y_i^s)$ and $(x_i^t, y_i^t)$ can be written as:

$$\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix} = \tau_\theta(G_i) = A_\theta \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix} = \begin{pmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \end{pmatrix} \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix} \tag{1}$$

(3) Differentiable image sampling

Finally, different interpolation methods can be used to map the input image to the output image. In the following equation, k represents the sampling kernel function:

$$V_i^c = \sum_n^H \sum_m^W U_{nm}^c\, k(x_i^s - m; \Phi_x)\, k(y_i^s - n; \Phi_y) \quad \forall i \in [1 \ldots H'W'],\ \forall c \in [1 \ldots C] \tag{2}$$
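For intuition, the transform of Eq. (1) and the differentiable sampling of Eq. (2) correspond closely to PyTorch's built-in affine_grid/grid_sample pair; a minimal sketch (not the authors' code) follows:

```python
import torch.nn.functional as F

def stn_sample(U, theta, out_h, out_w):
    """Affine spatial transform: theta is a (B, 2, 3) matrix as in Eq. (1)."""
    B, C, _, _ = U.shape
    # Eq. (1): map output-grid coordinates back to input coordinates.
    grid = F.affine_grid(theta, size=(B, C, out_h, out_w), align_corners=False)
    # Eq. (2): differentiable bilinear sampling of U at the transformed grid.
    return F.grid_sample(U, grid, mode='bilinear', align_corners=False)
```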

3.3 Training Process of STN


The component detection network includes two processes: training and testing. In the training process, the specific network structure of the component detection branch, built on the STN sub-network structure, is shown in Fig. 3.

Fig. 3. Training process diagram of component detection network based on STN substructure.

Firstly, the original image is pre-processed to a size of 224 × 224 or 448 × 448 and input into the VGG16 network, and the features of the conv5_3 layer, namely the last layer of the fifth convolution module, are extracted. Then, the conv5_3 features are input into the localization network and grid generator of the STN to obtain the extracted regional position coordinates (X, Y). After that, the labeled rectangular box is pre-processed and input into the grid generator to obtain the ground-truth region's position coordinates (Xg, Yg). Finally, the L2 loss is used to regress the extracted position coordinates onto the ground-truth coordinates, as shown in Eq. (3). Some of the annotation information here is used only in the training phase, not in the testing phase.

$$loss = 0.5 \sum_{i=1}^{448} \sum_{j=1}^{448} \left[ (x_{groundtruth} - x_{i,j})^2 + (y_{groundtruth} - y_{i,j})^2 \right] \tag{3}$$

The specific structure of the localization network is shown in Fig. 4. To obtain the six parameters of the affine transformation, the localization network uses a fully connected design in which the number of channels is gradually reduced to 6.
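As an illustration of such a head, a fully connected stack tapering to the six affine parameters might look like the following; the intermediate layer sizes are not given in the text, so those below are assumptions:

```python
import torch.nn as nn

# Hypothetical localization head: channel counts are illustrative assumptions;
# only the final output of 6 affine parameters is fixed by the text.
localization_head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(512 * 14 * 14, 256),  # conv5_3 features of a 224 x 224 input (assumed)
    nn.ReLU(inplace=True),
    nn.Linear(256, 32),
    nn.ReLU(inplace=True),
    nn.Linear(32, 6),               # theta: the 2 x 3 affine matrix of Eq. (1)
)
```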

Fig. 4. Detailed construction of the grid generator.

The process of using position coordinates and the grid generator to generate position regions is as follows: i) take the points at the upper-left and lower-right

corners to get Xmin, Ymin, Xmax, Ymax, ii) scaling these four points to make them match
the pre-processed original image that was input into VGG16, iii) dividing Xmin-Xmax
into w parts and Ymin-Ymax into h parts according to the size of the output image preset
by STN, such as w and h, and iv) mapping the X and Y coordinates as shown in Fig. 4
to obtain a series of positions in regions.
When the training process is completed, the specific application mode of the
component detection branch is shown in Fig. 5.

Fig. 5. Bird head prediction using STN network.

To observe the effect of using part annotations and the spatial transformer network to locate local areas, we sample the bird's head and torso using the location coordinates generated by the trained network, as shown in Fig. 3.

3.4 Joint Detection Network

Incorporating component information can effectively improve the detection accuracy of local bird feature points in the dataset. This paper constructs a joint detection network infrastructure based on local feature points and components and designs three joint detection schemes.
The model design of the first scheme is shown in Fig. 6. Feature point detection and component detection share the parameters of the convolutional layers of VGG-16. Then, the conv5_3 features are input into two parallel fully connected branches with the same structure as in the figure: one detects the feature points, and the other detects the components. It should be noted that a joint detection network needs to be trained for each component.

Fig. 6. Joint detection scheme one.



The model design of the second scheme is shown in Fig. 7. Feature points and multiple components are detected directly in parallel, so only one detection network model needs to be trained; it is an upgraded version of scheme one.

Fig. 7. Joint detection scheme two.

The model design of the third scheme is shown in Fig. 8. Based on scheme one, the input of the network is changed from three channels to six. The three added channels are the RGB channels of the original image cropped by the label box. The intention of this design is to make the network pay more attention to the central part of the image so that the multi-task model can predict the feature points and parts more accurately.

Fig. 8. Joint detection scheme three.

In the experiments that follow, we analyze and compare the effect of these three model designs on the improvement of local feature point detection performance.

4 Experiment and Analysis

To quantitatively compare the effect of the joint detection model on feature point detection with that of single detection, this paper calculates the Euclidean distance between the predicted value and the actual value for each type of feature point and then averages these distances over feature points of the same type. The smaller the value, the closer the prediction is to the actual position and the more accurate the result. A minimal sketch of this metric is given below; the experimental results are shown in Table 4.
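The averaged point-wise Euclidean distance can be sketched in a few lines of NumPy; the helper name is ours:

```python
import numpy as np

def mean_point_error(pred, true):
    """Mean Euclidean distance between predicted and ground-truth points.
    pred, true: arrays of shape (num_images, 2) for one feature-point type."""
    return np.linalg.norm(pred - true, axis=1).mean()
```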

Table 4. Euclidean distance between the predicted value and true value of different schemes.

Method/part Head Wing Breast Leg Tail


Single 20.79 39.34 33.08 57.86 56.74
Union_1_body 17.89 33.60 27.22 50.95 44.43
Union_1_head 15.75 33.50 27.30 50.35 44.73
Union_2 16.31 34.44 27.44 51.14 44.45
Union_3 16.11 33.97 27.41 50.60 45.16

It can be seen from the table that the three joint detection models detect feature points better than the local feature point detection branch alone. Among the three joint detection models, the detection performance is similar. However, scheme 2 only needs to train one network model, scheme 1 needs to train several network models, and scheme 3 also uses additional annotations. Therefore, as shown in Fig. 9, only the feature point detection results of scheme 2, the single-detection results, and the ground-truth values are visualized on the original image.

Fig. 9. Comparison of prediction results of local feature points between single detection and joint
detection program two.

In Fig. 9, the green dots represent the actual values, the red dots the results of single detection, and the blue dots the results of the joint detection model. The visualization shows that the joint detection results are closer to the actual values. The detection of the eyes and chest is relatively accurate whether detected alone or jointly, but the detection of the legs, tail, and wings still needs improvement.
The experimental results show that component information has a positive effect on the network structure for detecting local feature points, and the network model designed in this paper, which combines local feature point and component detection, can obtain more accurate and detailed local location information; it is also suitable for other fine-grained classification tasks. At the same time, with effective localization of local details, this model will provide support and help for more extended fine-grained analysis tasks.

5 Conclusion and Future Work


In fine-grained classification tasks, it is often necessary to obtain detailed description
features of the subject to be identified to classify effectively. Fine-grained classification is
a problematic classification task proposed in recent years. In this paper, a joint detection
network of local feature points and components integrated with component information
can extract more accurate local details and more subject description information, which
will have positive help and reference significance for more fine-grained analysis tasks.
The alignment and comprehensive description of detailed local features will be further
studied in the coming work.

Acknowledgments. This work is supported by 2020 Heilongjiang Provincial Natural Science


Foundation Joint Guidance Project (LH2020C001).

References
1. Liu, C., Liang, Y., Xue, et al.: Food and ingredient joint learning for fine-grained recognition.
IEEE Trans. Circuits Syst. Video Technol. 31(6), 2480–2493 (2021)
2. Seo, Y., Shin, K.S.: Image classification of fine-grained fashion image based on style using pre-trained convolutional neural network. In: 2018 IEEE 3rd International Conference on Big Data Analysis, pp. 387–390. IEEE, China (2018)
3. Huang, H., Zhang, J., Zhang, J., et al.: Low-rank pairwise alignment bilinear network for
few-shot Fine-grained image classification. IEEE Trans. Multimedia 23, 1666–1680 (2021)
4. Liu, N., Han, J., Yang, M. H.: Picanet: learning pixel-wise contextual attention for saliency
detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 3089–3098. IEEE, USA (2018)
5. Ding, Y., Zhou, Y., Zhu, Y., et al.: Selective sparse sampling for fine-grained image recognition.
In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6599–
6608. IEEE, USA (2019)
6. Chen, Z., Bei, Y., Rudin, C.: Concept whitening for interpretable image recognition. Nature
Mach. Intell. 2(12), 772–782 (2020)
7. Qian, Q., Jin, R., Zhu, S., et al.: Fine-grained visual categorization via multi-stage metric learn-
ing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
pp. 3716–3724. IEEE, USA (2015)
8. Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. Adv. Neural.
Inf. Process. Syst. 28, 2017–2025 (2015)
9. Peng, Y., He, X., Zhao, J.: Object-part attention model for fine-grained image classification.
IEEE Trans. Image Process. 27(3), 1487–1500 (2017)
10. Cai, S., Zuo, W., Zhang, L.: Higher-order integration of hierarchical convolutional activations
for fine-grained visual categorization. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pp. 511–520. IEEE, USA (2017)

11. Wang, Q., Li, P., Zhang, L.: G2DeNet: global gaussian distribution embedding network and
its application to visual recognition. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pp. 2730–2739. IEEE, USA (2017)
12. Girshick, R., Donahue, J., Darrell, T., et al.: Rich feature hierarchies for accurate object
detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pp. 580–587. IEEE, USA (2014)
13. Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based R-CNNs for fine-grained category
detection. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol.
8689, pp. 834–849. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_54
14. Gao, Y., Beijbom, O., Zhang, N., et al.: Compact bilinear pooling. In: Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, pp. 317–326. IEEE, USA
(2016)
15. Dai, X., Yue-Hei Ng, J., Davis, L.S.: FASON: first and second order information fusion
network for texture recognition. In: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, pp. 7352–7360. IEEE, USA (2017)
16. Zheng, H., Fu, J., Mei, T., et al.: Learning multi-attention convolutional neural network for
fine-grained image recognition. In: Proceedings of the IEEE International Conference on
Computer Vision, pp. 5209–5217. IEEE, USA (2017)
17. Sengupta, A., Ye, Y., Wang, R., et al.: Going deeper in spiking neural networks: VGG and
residual architectures. Front. Neurosci. 13, 95 (2019)
A Random Walks Image Segmentation Method
Combined with KNN Affine Class

Xiaodong Su1,2 , Shizhou Li1,2(B) , Guilin Yao1,2 , Hongyu Liang1,2 , Yurong Zhang1,2 ,
and Shirui Wu1,2
1 Harbin University of Commerce, Harbin 150028, China
suxd@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. In image binary segmentation, accurately solving the boundary between the foreground object and the background is a key issue, and typical unsupervised interactive binary segmentation algorithms include GrabCut, random walk, k-means clustering, etc. Among them, the traditional random walk algorithm has the advantages of strong interpretability, high speed, and high efficiency; it integrates the user's experience and subjective requirements, and for hard boundaries it can extract the outline of the target object accurately and finely. However, its segmentation of soft boundaries such as hair and of images with complex background textures is not ideal. Therefore, this paper proposes a random walk segmentation algorithm combined with the KNN affine class. The algorithm is based on the random walk method, but the neighborhood search space for unknown pixels is larger than in the traditional random walk algorithm; when calculating the weights, samples with color characteristics similar to the unknown pixels are selected, and the foreground probability of the unknown pixels is obtained by solving a sparse linear system, realizing binary segmentation of the image. Experiments show that, compared with the traditional random walk and KNN algorithms, the combination of the two improves the accuracy of the foreground probability values and enhances the edge fineness of the image segmentation.

Keywords: Image segmentation · Random walks · KNN · Affine class ·


Foreground probability value

1 Introduction
In recent years, with the development of artificial intelligence, computer vision has gradually become a major research direction. As the basis of machine vision, image segmentation is widely used. Image segmentation is the technology and process of dividing an image into several designated areas with unique properties and extracting objects of interest. Compared with the current state-of-the-art supervised image segmentation,


unsupervised interactive image segmentation does not require a long training process and only incorporates the user's experience and subjective requirements through some sparse interactive inputs. Typical algorithms include GrabCut, random walk, k-means clustering, and others.
Xia [1] uses random walks to segment the part of the image edge affected by light changes and then uses an iterative segmentation algorithm to segment the image. Guo et al. [2] use background difference, morphology, and other principles to obtain the skeleton structure and marker points, and then use the random walk algorithm for image segmentation to solve the problem of slow interaction in the field of vehicle detection. Mao et al. [3] preliminarily select the marker points for random walks through a pattern mining algorithm and then segment the input image using the random walk algorithm, which can segment the image accurately and automatically. The random walk algorithm was first proposed by Grady and applied to image segmentation [4]. Its advantages are fast calculation, high efficiency, and good noise elimination. The basic idea of the algorithm is to compute the weighted average over the neighborhood of each unknown pixel; the greater the weight, the greater the correlation between the two pixels. However, when the background texture is complex, the number of neighborhood sampling points of an unknown pixel is small, leading to inaccurate results and blurred edges. Therefore, how to improve the accuracy of the foreground probability values of the random walk algorithm is a problem worth studying.
The KNN algorithm was jointly proposed by Cover and Hart [5] and is a very mature algorithm in the field of matting and segmentation. Bai [6] proposed a matting method combining the KNN algorithm with a general sampling-based method. Compared with random walk, when sampling unknown pixels the KNN algorithm pays more attention to points with larger weights (closer colors) to the unknown pixels, forming a gradual relationship between pixel colors and thereby an affinity relationship; the calculated foreground probability values are therefore more accurate.
This paper proposes a random walk image segmentation algorithm combined with the KNN affine class. By combining the KNN algorithm of the affine class with the random walk algorithm, the neighborhood search space becomes larger than in the original algorithm, and more effective sample points closer to the color characteristics of the unknown pixels are collected. The experimental results show that the new algorithm can accurately and finely extract the contours of the target object while integrating the user's experience and subjective requirements, and it improves segmentation accuracy. Compared with the traditional random walk and KNN algorithms, the method in this paper improves the segmentation accuracy and segmentation effect, which increases its research value.

2 Principle of Traditional Random Walks Algorithm

Random walk is an interactive image segmentation algorithm based on graph theory. In the traditional interactive image segmentation model, the starting point of the random walk is each unmarked pixel, and the user gives the end points of the random walk. After the whole random walk process is over, the label of the seed pixel that is reached with the largest probability is selected as the label of the current pixel, realizing the segmentation of the image. The basic process can be roughly expressed as follows: convert the image into a matrix composed of individual pixels [7], compute the weighted average of each unknown pixel over its neighborhood to obtain the foreground probability value, and determine whether the pixel belongs to the foreground or the background. The weighted degree $d_i$ is expressed as

$$d_i = \sum_j w_{ij} \tag{1}$$

In formula (1), wij is the weight between the unknown pixel i and its neighboring point j; the greater the weight, the closer the relationship between the two.
Random walk algorithms are usually used in binary segmentation, color supplementation, and other fields with low boundary requirements. However, in fields that require accurate calculation of the foreground probability value, they have obvious defects and cannot reflect details [8].

3 Affine Class Methods


The affine class method has a long history and is usually used in matting [9], covering the two main features of color and space. In closed form it goes through a calculation process from simple to complex, that is, segmentation of the image edge from coarse to fine. Its biggest advantage is connecting the characteristics of different pixels, and the most commonly used methods are the KNN algorithm and the matting Laplacian class method. In matting, the affine class performs a linear combination over multi-neighborhood pixels for the foreground probability value of each unknown pixel, assuming that the α value (foreground probability value) of each unknown pixel is a linear combination of the α values of its K-neighborhood pixels:

$$\alpha_i = \omega_1 \alpha_1 + \cdots + \omega_j \alpha_j + \cdots + \omega_k \alpha_k, \quad i, j = 1, 2, \cdots, k \tag{2}$$

The αi of all unknown points i can be obtained by solving the following large sparse linear system:

$$(L + \theta D)x = \theta b \tag{3}$$

Among them, L is a sparse square matrix whose length and width are both the total number of image pixels; the coefficients of the K neighborhood corresponding to the row of an unknown point i are denoted ωj, j = 1, ···, k. D is a diagonal matrix whose diagonal element is 1 at the position of a known point and 0 at an unknown point, θ is a large constraint parameter, and b is a column vector whose length equals the total number of pixels. x holds the foreground probability values of the unknown pixels [10]. Compared with the per-point calculation of the random walk algorithm, this emphasizes the connection between adjacent pixels. Therefore, the final calculated foreground probability values, their smoothness, and the visual experience for the user are obviously better than those of sampling algorithms such as random walk. This is in sharp contrast with the foreground probability results obtained by the random walk algorithm [11], which usually contain more noise points and a messier overall distribution [12].

4 Algorithm Combination
4.1 Sparse Matrix of Random Walks

The user first enters the foreground points and background points, setting the foreground probability to 1 and the background probability to 0. Figure 1 shows the foreground and background points entered by the user. By solving for the unknown pixels in the sparse matrix, the algorithm judges whether each unknown point belongs to the foreground or the background.

Fig. 1. Red for the foreground, blue for the background

In the figure, red and blue mark the foreground and background points entered by the user.
In a sparse matrix, the number of elements with value 0 far exceeds the number of non-zero elements, and the non-zero elements are distributed irregularly. In the random walk algorithm, the 0 elements correspond to the background points input by the user and the unknown pixels, and the non-zero elements to the foreground points input by the user. Assuming the image size is m × n, a sparse matrix Arw of size t × t (m × n = t) is constructed. The elements in the row of an unknown pixel i are the weights between i and its neighborhood points j, and the diagonal elements are the sums of the weights between the unknown pixel i and its four neighbors. The foreground seed points and background seed points are set in their corresponding rows of the matrix, and the other matrix elements are 0. The weight formula of the random walk algorithm is

$$w_{ij} = \exp\left(-\left\| I_i - I_j \right\| / \sigma\right) \tag{4}$$

In the formula, wij is the weight between the unknown pixel i and its neighbor j, Ii is the pixel value of point i, $\|I_i - I_j\|$ is the Euclidean norm between points i and j, and σ is the only free parameter of the algorithm. The closer the gray values of two adjacent pixels, the greater the weight between the two points and the stronger their connection. When solving Eq. (4), only the probabilities of the foreground points, background points, and points to be solved need to be calculated; the larger the weight between two adjacent pixels, the closer the connection between them.
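A minimal NumPy sketch of Eqs. (1) and (4); the σ value is illustrative, and pixel intensities are assumed to be scalars or small vectors:

```python
import numpy as np

def rw_weight(I_i, I_j, sigma=0.1):
    """Random walk edge weight of Eq. (4): similar gray values give large weights."""
    return float(np.exp(-np.linalg.norm(np.atleast_1d(I_i) - np.atleast_1d(I_j)) / sigma))

def weighted_degree(I, i, neighbors, sigma=0.1):
    """Weighted degree d_i of Eq. (1): sum of weights to the 4-neighborhood."""
    return sum(rw_weight(I[i], I[j], sigma) for j in neighbors)
```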

4.2 Sparse Matrix of the KNN Algorithm

Selection of Neighborhood Points. The K-nearest-neighbor algorithm [13] in the KNN affine method can easily be extended to handle high-dimensional data in SVBRDF or non-RGB color spaces. More points, K of them, that are close to the color feature of the unknown pixel i can be searched in a larger space (15 or more points with similar color features are preferred), producing higher-quality segmentation results. The sampling steps of the K-nearest-neighbor algorithm are as follows:

(1) Select the first K samples with the smallest distance in the spatial search (K = 10 in this paper).
(2) Use the randomized k-d forest algorithm in the FLANN search library to select points with similar features. The key of this algorithm is to find the closest points in the neighborhood of the unknown pixel. The feature similarity defined by the Euclidean distance is

$$D(x, y) = \sqrt{\sum_{i=1}^{d}(X_i - Y_i)^2} \tag{5}$$

In the formula, Xi represents a feature of the unknown pixel and Yi the corresponding feature of the neighborhood point.
The smaller the obtained D value, the closer the features of the feature-point pair and the higher their similarity. An example of the spatial search is shown in Fig. 2.

Fig. 2. Spatial search for unknown point i

By searching around the unknown pixel {Ri, Gi, Bi}, multiple feature points with color features similar to those of the unknown pixel are added [13]. The feature vector of the unknown point i in KNN can be expressed as

$$X_i = \left( R_i, G_i, B_i,\ X_i/\sqrt{X_i^2 + Y_i^2},\ Y_i/\sqrt{X_i^2 + Y_i^2} \right) \tag{6}$$

In the formula, Ri, Gi, Bi are the color features, and Xi and Yi are the coordinates of point i.

Construction of the Sparse Matrix. When the KNN algorithm constructs its sparse matrix, let the image size be m × n and construct a sparse matrix AKNN of size t × t (m × n = t). The elements in the row of an unknown pixel i are the weights between i and its neighborhood points j, and the diagonal element is the sum of the weights between the unknown pixel i and its K neighbors with similar color characteristics. The foreground seed points and background seed points are set in their corresponding rows of the matrix. The weight calculation formula is

$$w_{ij} = 1 - \left\| x_i - x_i^k \right\| / c \tag{7}$$

In the formula, xi is the feature of the unknown point i, $x_i^k$ is the feature of one of its K neighboring points, and c is a weight coefficient, the upper bound of $\|x_i - x_i^k\|$.
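A sketch of this neighbor search and weighting, using scikit-learn's NearestNeighbors as a stand-in for the FLANN randomized k-d forest the paper uses; the function name and the choice of c as the maximum observed distance are our assumptions:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors  # stand-in for FLANN's k-d forest

def knn_weights(features, i, K=10):
    """K feature-space neighbors of pixel i and their Eq. (7) weights.
    features: (num_pixels, d) array of feature vectors built as in Eq. (6)."""
    nn = NearestNeighbors(n_neighbors=K + 1).fit(features)
    dist, idx = nn.kneighbors(features[i:i + 1])   # Eq. (5): Euclidean search
    dist, idx = dist[0][1:], idx[0][1:]            # drop the query point itself
    c = dist.max() + 1e-12                         # c: upper bound of the distances
    return idx, 1.0 - dist / c                     # Eq. (7) weights in [0, 1)
```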

4.3 Solution of Foreground Probability Value


Combining the obtained sparse matrices Arw and AKNN into a new large sparse matrix,
the final solution uses the conjugate gradient method to solve the large sparse linear
equation to obtain the foreground probability value. The final result is obtained from the
solution of following sparse linear equation

(Arw + AKNN + θ D)x = θ b (8)

In the formula, Arw + AKNN is the sparse matrix after the addition of matrices Arw
and AKNN , D is the diagonal matrix, In the diagonal matrix, the diagonal element of the
known seed point (including the foreground and background seed points) is 1, otherwise,
it is 0. b is a column vector, its length is equal to the total number of vectors, 1 at the
foreground seed point, otherwise 0, θ is the constraint parameter, so that the value of x
is between 0 and 1. The x solved by the solution of the linear equation is the foreground
probability value. When x is greater than 0.5, it is judged as the foreground point, and
vice versa, it is judged as the background point, so as to complete the binary segmentation
of the image.
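A minimal SciPy sketch of solving Eq. (8) by conjugate gradient; the function name and θ value are illustrative assumptions, and the combined matrix is assumed symmetric positive definite, as conjugate gradient requires:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

def solve_alpha(A_rw, A_knn, known_mask, fg_mask, theta=100.0):
    """Solve (A_rw + A_knn + theta*D) x = theta*b of Eq. (8) by conjugate gradient.
    known_mask / fg_mask: boolean vectors marking all seed pixels / foreground seeds."""
    D = diags(known_mask.astype(float))   # diagonal: 1 at known seed points, else 0
    b = fg_mask.astype(float)             # 1 at foreground seeds, else 0
    x, info = cg(A_rw + A_knn + theta * D, theta * b)
    return x > 0.5, x                     # binary foreground mask and probabilities
```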

5 Analysis of Experimental Results


In order to verify the effectiveness of the algorithm, this paper uses Berkeley dataset
for experiments [14] which includes over fifty input images with their corresponding
ground truth segmentation values. In view of the characteristics of the algorithm in this
paper, the method is named RW_KNN. MSE and SAD are used as error criteria, the
formula is as follows:
1 n n 2
MSE = (E(i, j) − H (i, j)) (9)
m×n i=1 j=1
n n
SAD = (E(i, j) − H (i, j)) (10)
i=1 j=1

In this formula, m is the width of the image; n is the height of the image; E(i, j) is
the result of segmentation algorithm; H (i, j) is true results for training sets [15].
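Eqs. (9)–(10) can be computed in a few lines of NumPy; the helper name is ours:

```python
import numpy as np

def mse_sad(E, H):
    """Error criteria of Eqs. (9)-(10) between segmentation E and ground truth H."""
    diff = E.astype(np.float64) - H.astype(np.float64)
    return (diff ** 2).mean(), np.abs(diff).sum()
```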

5.1 Qualitative Observation

This section evaluates the image segmentation results from a visual perspective. First, the full moon image is selected to analyze the segmentation results (see Fig. 3). Ground Truth is the real foreground image of the full moon, and Trimap marks the manually annotated foreground and background points (white for foreground, black for background). The first image in the first row is the image to be segmented; the three images in the second row are the grayscale results output by the KNN, random walk, and RW_KNN (the proposed) algorithms, respectively; the third row shows the binary segmentation images output by the three algorithms. The color difference between the foreground and background of Fig. 3 is large, the hard border of the full moon image is rounded, and the segmentation is not difficult. The outputs show that all three algorithms can extract the foreground of the object, but for some external boundary details their performance differs, and the KNN algorithm performs worst: its results show over-segmentation and under-segmentation, and the target cannot be seen clearly. The target contour segmented by the random walk algorithm is relatively clear, consistent with its fine segmentation of hard boundaries. The proposed RW_KNN algorithm performs best; compared with the other two algorithms, it accurately segments the edge of the full moon.

Fig. 3. Fullmoon segmentation results



Then the bush image with a complex background texture is selected for segmentation analysis (see Fig. 4). The feature of this image is that both the foreground and background are close to green, and segmentation is more difficult than for the full moon image. The trimap in the first row is the three-part map of the original image, where the foreground and background point probabilities are marked in advance, and Ground Truth is the real foreground image of the bush. The three images in the second row are grayscale images output by the KNN, random walk, and RW_KNN algorithms; the third row shows the binary segmentation images output by the three algorithms. It can be seen from the figure that all three algorithms can separate the foreground and background of the object reasonably well, but for the segmentation results at the image edges, the KNN and random walk algorithms perform slightly worse than RW_KNN. The random walk algorithm performs worst of the three (the middle image of the second row in Fig. 4): around the flowerpot at the bottom of the bush image, much of the background is assigned to the foreground. This problem is caused by the limited sampling range of the unknown-point neighborhood when the random walk calculates weights. Although the KNN algorithm can segment the rough outline of this part of the foreground, many background pixels are still assigned to the foreground, reducing segmentation accuracy. The proposed RW_KNN algorithm segments the background accurately: by incorporating the KNN affine-class sampling, it segments the bushes accurately and improves the accuracy of the foreground probability values.
Finally, the person image, characterized by a soft hair boundary, is selected to analyze the segmentation (see Fig. 5). This image contains soft hair borders; its edge details are more complex than those of Figs. 3 and 4, and segmentation is more difficult than for the previous two images. The first-row trimap is the three-part map of the original image, and Ground Truth is the person's true foreground image. The three images in the second row are the grayscale outputs of the KNN, random walk, and RW_KNN algorithms, respectively; the third row shows the binary segmentation images output by the three algorithms. The random walk algorithm performs worst: with the soft boundary, its segmentation result cannot be recognized intuitively, and the overall contour is incomplete. The KNN algorithm performs better and can roughly extract the contour of the person, but division errors still cause over-segmentation. The proposed algorithm performs best: the head contour it segments is more realistic, the segmentation of the soft boundary is improved, and the foreground region is segmented completely.

Fig. 4. Bush Segmentation result

5.2 Quantitative Analysis

Section 5.1 evaluated the image segmentation results from a visual point of view. This section uses MSE and SAD to evaluate them quantitatively. The MSE and SAD of the above three images are listed in Tables 1 and 2. It can be seen that the mean square error and absolute error of this algorithm are smaller than those of the other two algorithms; that is, the results of this algorithm are closer to the real results than those of the original algorithms.
For a more objective evaluation, this paper calculates the MSE and SAD of the three images and their average values. To avoid contingency, the experiment also calculates the errors of each algorithm on 50 images of the dataset with the same trimaps; the results are listed in Table 3. The errors of this algorithm are smaller than those of the other two algorithms.

Fig. 5. Person segmentation result

Table 1. MSE of 3 images

Image KNN Random walks RW_KNN


Fullmoon 0.639 0.076 0.075
Bush 3.09 7.19 2.22
Person 0.345 7.31 0.243
Average value 1.358 4.85 0.846

Table 2. SAD of 3 images

Image KNN Random walks RW_KNN


Fullmoon 0.101 0.0214 0.0210
Bush 0.787 1.752 0.232
Person 0.143 1.826 0.108
Average value 0.343 1.19 0.120

Table 3. Errors of 50 images

Image KNN Random walk RW_KNN


MSE 1.993 2.156 1.067
SAD 0.405 0.642 0.134

6 Conclusions

Based on the random walk algorithm, this paper proposes a random walk image segmentation method combined with the KNN affine class. Using the KNN algorithm of the affine class, more points similar in color characteristics to the unknown pixel can be searched in space. Adding the sparse matrix formed by the KNN algorithm to the sparse matrix of the random walk algorithm makes up for the shortcomings of the small neighborhood search space of unknown pixels and the limited pixel feature information when calculating random walk weights. The experiments compare the proposed method with the KNN and random walk algorithms: it effectively improves the segmentation accuracy of the random walk algorithm and has a better segmentation effect on images with soft boundaries such as hair. Compared with the traditional random walk and KNN algorithms, the segmentation results of the improved random walk algorithm on the Berkeley dataset show a clear improvement in visual quality and accuracy. The experimental results show that the method in this paper has a good processing effect, research value for segmentation accuracy in the image field, and a certain use value in batch image segmentation projects.

Acknowledgment. This work is supported by the Heilongjiang Provincial Natural Science Foundation of China (No. LH2021F034).

References
1. Xia, M.: Iterative segmentation algorithm based on random walk. Digital Technol. Appl.
37(09), 120–122 (2019)
2. Guo, L., Gao, L., Liu, M., et al.: A new random walk multi vehicle detection algorithm.
Chinese J. Image Graph. 3, 392–396 (2010)
3. Mao, Z., Han, Y.: Automatic image segmentation algorithm based on random walk. Sens.
Microsyst. 37(06), 142–145 (2018)
4. Grady, L.: Random walks for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 28(11), 1768–1783 (2006)
5. Chen, Q., Shen, Q., Zhang, K., Liu, P.: Edge preserving KNN matting. Minicomput. Syst.
36(07), 1591–1596 (2015)
6. Bai, Y., Yao, G.: A robust matting method based on KNN post-processing. Comput. Appl.
Softw. 37(09), 170–175 (2020)
7. Zhou, Y., Chen, J., Kan, Y.: An automatic image segmentation algorithm based on random
walk. Inf. Technol. 2015(12), 75–79 (2015)

8. Chen, Q., Li, D., Tang, C.: KNN matting. In: Proceedings of the 2012 IEEE Conference on
Computer Vision and Pattern Recognition, pp. 2175–2188. IEEE (2012)
9. Wang, F., Qin, F., Jiang, D., Song, C.: Random walk image segmentation based on visual
attention. Chin. J. Sci. Instr. 38(07), 1772–1781 (2017)
10. Yao, G., Liu, S., Li, M., et al.: Application and analysis of post-processing in digital matting.
Acta Electronica Sinica 45(3), 719–729 (2017)
11. Yao, G., Yao, H.: Overview of image matting algorithm based on affine method. J. Comput.
Aided Design Graph. 28(4), 677–692 (2016)
12. Yu, X., Zhou, N., Zhang, F.: Research on Image automatic classification model based on
KNN. J. Libr. China 01, 74–76 (2007)
13. Nie, D., Wang, L.: Improved KNN matting technology. Minicomputer Syst. 36(06), 1316–
1320 (2015)
14. The Berkeley Segmentation Dataset and Benchmark Homepage. https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/. Accessed 2020
15. Wang, F., Qin, F., Jiang, D., Song, C.: Random walk image segmentation based on visual
attention. J. Instrum. 38(07), 1772–1781 (2017)
Generative Adversarial Network Image
Inpainting Based on Dual Discriminators

Jianming Sun(B) , Jinpeng Wu, and Xuena Han

School of Computer and Information Engineering, Harbin University of Commerce, Harbin


150028, Heilongjiang, China
sjm@hrbcu.edu.cn

Abstract. A generative adversarial network with dual discriminators, fusing dilated
convolution and full residual blocks, is proposed to address the poor texture, unclear
images, and weak semantic consistency produced by existing deep learning image
inpainting methods. The generator of the model consists of a rough network with skip
connections and a refined network, where the skip connections in the rough network fuse
the contextual information of the image with higher-level semantic information. To gain
a larger perceptual field and deeper semantic information, both the rough and refined
networks use dilated convolution. At the same time, to avoid vanishing gradients and
slow convergence caused by an overly complex network, a convolutional layer with four
residual structures is used in the refinement network. Experimental results show that
the improved network model generates more realistic images with better detailed semantic
consistency than existing network models.

Keywords: Image inpainting · Generative adversarial networks · Dilated convolution · Skip connection · Residual blocks

1 Introduction
Image inpainting [1–3] refers to the use of known contextual information to repair
missing areas of a picture. It is widely used in image restoration tasks such as
removing unwanted objects. Although researchers have proposed many image inpainting
methods, both deep and non-deep, such as patch-based methods [4] and context encoder
methods [5], image inpainting remains a hot task in computer vision and computer
graphics. This is because it is hard to synthesize textures with fine detail and visual
clarity and to make the computer understand the semantic information in the image.
Based on this observation, we add skip connections to the network, which allow the
contextual information of shallow layers to be fused with deeper semantic information.
Our work is based on the Context Encoder model put forward by Pathak et al. [5]
in 2016, which was the first model to use a generative adversarial network (GAN) [6]
for image inpainting. However, that method can only process 128*128 images, which limits
its applicability.


We use a fully convolutional neural network as the basis of our model and propose a
dual generator that fuses dilated convolution [7] and residual blocks [8] to obtain
higher-quality and visually better inpainted images. Our model consists of four
networks: a rough inpainting network, a refinement network, a local discriminator
network, and a global discriminator network. The two generators complete the images,
while the global and local discriminators are auxiliary networks used only during
training. These discriminators judge whether an image is real. The global discriminator
takes the whole image as input to assess consistency across the image, thus ensuring
consistency and continuity in and around the area to be repaired. In contrast, the
local discriminator looks only at the small repaired area to judge whether the local
details are convincing.
We evaluate our models against other deep learning-based models on street scenes and
faces, and our model achieves both smaller loss values and higher peak signal-to-noise
ratios.
In summary, the contributions of this paper are:

• We propose a generative model with skip connections that fuse the contextual
information of the image with higher-level semantic information, so that our model
understands semantic information better than other models;
• To increase the perceptual field, both generators use dilated convolution;
• The generators also use convolution layers with residual blocks to avoid vanishing
gradients and speed up convergence.

2 Related Work

Currently, image inpainting methods fall into two types: traditional algorithms and deep
learning-based approaches.
Traditional image restoration methods can mainly be classified into diffusion-based
and sample-based methods. Diffusion-based methods include the partial differential
equation-based method proposed by Bertalmio. With these methods, the user specifies
the area to be repaired, and the algorithm diffuses the information outside the contour,
along the normal to the boundary of the area, toward the pixels to be repaired. The
algorithm exploits the smoothness of the local color to diffuse along the contour,
taking anisotropic diffusion into account to keep the boundary continuous at the edges,
but the method is computationally unstable.
In contrast, sample-based methods assume that known samples can represent the missing
regions of an image. These methods mainly include texture synthesis-based image
inpainting algorithms. One approach is to decompose the image into structural and
textured parts; the structural part is restored with a variational PDE-based method,
and the textured part is filled with texture synthesis. This approach can achieve better
inpainting results for large damaged areas by filling missing regions of arbitrary size
and repairing texture details in broken parts.
Conventional algorithms tend to perform well when the inpainting area is small and the
texture structure is relatively simple. Once the missing area is relatively large (30%
or more), inpainting results tend to be particularly poor. This is because traditional
image inpainting methods often fail to understand the semantic information in the image
at a deep level. Therefore, with the development of deep learning after 2015, more and
more researchers in image inpainting have adopted deep learning methods to obtain a
deeper understanding of semantic information and higher-quality inpainting results.
Image inpainting with deep learning is implemented using generative adversarial
networks. For example, Pathak et al. proposed the Context Encoder (CE) model, which is
similar to an autoencoder network [9] in that it consists of an encoder-decoder, with
the encoder formed by the first five layers of AlexNet [10]. The decoder uses
deconvolution to transform the high-dimensional features back to the true size of the
image. The discriminator consists of a local discriminator. Both the generator and
discriminator of the Context Encoder model are relatively simple, and although its
completions are fairly realistic, they have very unsmooth boundaries and do not satisfy
local consistency. Iizuka et al. [11] improved the Context Encoder model by using double
discriminators to overcome this problem. Traditional image inpainting methods are good
at sampling from the background, while CNN models generate new textures. To combine the
advantages of these two approaches and make full use of redundant information in
pictures, Yu et al. [12] proposed a contextual attention method for image inpainting,
which proceeds in two steps from coarse to fine: the first step is rough inpainting,
and the second step finds image blocks similar to those in the occluded areas for
refinement.
Although the above methods have shown good results on image inpainting tasks, they
still have shortcomings: inaccurate inpainting when the texture information is complex,
blurring at the edges, and low semantic consistency.
Therefore, we propose a generative adversarial model incorporating dilated convolution
and residual blocks to address the problems of unclear edges, poor texture, and weak
semantic consistency after inpainting. The model consists of a rough repair network and
a refinement network. The image with the mask is used as input; after the first pass by
the rough network, its output is fed to the refinement network to generate a more
detailed and realistic image. A weighted loss function of reconstruction loss and
adversarial loss is used to optimize the training of the model. The skip connections in
the rough network fuse the contextual information of the image with higher-level
semantic information, so the rough and refined networks with skip connections proposed
in this paper can generate images with clear edges, good texture structure, and visually
realistic results.

3 Network Model

The model proposed in this paper (shown in Fig. 1) consists of two main components: a
generator and a double discriminator. The generator comprises the rough repair network
with skip connections and dilated convolution (shown in Table 1) and the refined
network with dilated convolution and residual blocks (shown in Table 2). The double
discriminator comprises the global discriminator (shown in Table 3) and the local
discriminator (shown in Table 4), ensuring global and local consistency. The rough
process is as follows: the image with the mask is used as input to generate a coarse
restored image through the first stage (coarse network). The coarse restored image is
then used as input to generate a finer image through the second stage (refined network).
The global discriminator and local discriminator then judge the integrity of the whole
image and of small areas, respectively, to achieve global and local consistency.

Fig. 1. Inpainting models incorporating dilated convolution and residual blocks.

In the following four tables, C is convolution, DC is dilated convolution, and FC is the fully connected layer.

Table 1. The rough network architecture. The up-sampling before the thirteenth and
fifteenth convolutional layers, followed by 3*3 convolutions with 128/64 output
channels, has an effect similar to deconvolution, increasing the size of the feature
map; the refinement network below uses a similar operation.

Type Kernel Dilation Stride Padding Inputs Outputs


C 5*5 1 1*1 2 3 64
C 3*3 1 2*2 1 64 128
C 3*3 1 1*1 1 128 128
C 3*3 1 2*2 1 128 256
C 3*3 1 1*1 1 256 256
C 3*3 1 1*1 1 256 256
DC 3*3 2 1*1 2 256 256
DC 3*3 4 1*1 4 256 256
DC 3*3 8 1*1 8 256 256
DC 3*3 16 1*1 16 256 256
C 3*3 1 1*1 1 256 256
C 3*3 1 1*1 1 256 256
Upsampling – – – – – –
C 3*3 1 1*1 1 256 128
C 3*3 1 1*1 1 128 128
Upsampling – – – – – –
C 3*3 1 1*1 1 128 64
C 3*3 1 1*1 1 64 32
C 3*3 1 1*1 1 32 3

Table 2. Refinement of network architecture

Type Kernel Dilation Stride Padding Inputs Outputs


C 5*5 1 1*1 2 3 64
C 3*3 1 2*2 1 64 64
C 3*3 1 1*1 1 64 128
C 3*3 1 2*2 1 128 128
C 3*3 1 1*1 1 128 256
C 3*3 1 1*1 1 256 256
DC 3*3 2 1*1 2 256 256
DC 3*3 4 1*1 4 256 256
DC 3*3 8 1*1 8 256 256
DC 3*3 16 1*1 16 256 256
C 3*3 1 1*1 1 256 256
C 3*3 1 1*1 1 256 256
Upsampling – – – – – –
C 3*3 1 1*1 1 256 128
C 3*3 1 1*1 1 128 128
Upsampling – – – – – –
C 3*3 1 1*1 1 128 32
C 3*3 1 1*1 1 32 32
C 3*3 1 1*1 1 32 3

Table 3. Global discriminator architecture

Type Kernel Dilation Stride Padding Inputs Outputs


C 5*5 1 2*2 2 3 64
C 5*5 1 2*2 2 64 64
C 5*5 1 2*2 2 64 64
C 5*5 1 2*2 2 64 256
FC – – – – – 1

Table 4. Local discriminator architecture

Type Kernel Dilation Stride Padding Inputs Outputs


C 5*5 1 2*2 2 3 64
C 5*5 1 2*2 2 64 128
C 5*5 1 2*2 2 128 256
C 5*5 1 2*2 2 256 256
FC – – – – – 1

3.1 Skip Connection


The generator part of the Context Encoder model uses a context encoder similar to an
autoencoder (shown in Fig. 2). The encoder part uses the first five layers of AlexNet
to extract features, and the decoder uses deconvolution to up-sample the feature map
and reconstruct features. However, during encoding, some contextual information is lost
through convolution, and this information often cannot be recovered through
deconvolution.

Fig. 2. Context Encoder generator model

Therefore a skip connection (shown in Fig. 1) is added to the rough network to fuse
the contextual information of the image with higher-level semantic information. The
specific implementation is: the activated output of each convolution block of the
encoder is spliced, in the channel dimension, with the activated output of the
convolution block at the corresponding position of the decoder. The spliced feature
map is used as the input to the next convolutional layer.
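A minimal PyTorch sketch of this splice is shown below; the layer widths and the class name are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of the skip connection: the encoder activation is concatenated with
# the decoder activation along the channel dimension, and the spliced map
# feeds the next convolutional layer.
import torch
import torch.nn as nn

class SkipBlock(nn.Module):
    def __init__(self, enc_ch, dec_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(enc_ch + dec_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, enc_feat, dec_feat):
        x = torch.cat([enc_feat, dec_feat], dim=1)  # channel-wise splice
        return self.act(self.conv(x))
```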

3.2 Dilated Convolution


Generally speaking, increasing the receptive field reduces the image size by pooling and
then returning to the original image size by deconvolution. However, such operations tend
to have relatively large parameters, and some spatial information will be lost in reducing
the size first and then returning to the original image size. The dilated convolution can
avoid using pooling, and at the same time, it provides a larger receptive field, so each
convolution output contains a larger range of information.
350 J. Sun et al.

This paper uses four dilated convolutions in both generator networks to achieve
different sizes of perceptual fields by setting different void rates (2, 4, 6, 8) to obtain
multi-scale information.
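The four-layer dilated stack can be sketched as the following minimal PyTorch fragment, matching the 256-channel rows of Tables 1 and 2; the activation choice is an assumption:

```python
# Four dilated 3x3 convolutions with rates 2, 4, 8, 16; padding equal to the
# dilation rate keeps the spatial size unchanged, as in Tables 1 and 2.
import torch.nn as nn

dilated_stack = nn.Sequential(*[
    nn.Sequential(
        nn.Conv2d(256, 256, kernel_size=3, dilation=d, padding=d),
        nn.ReLU(inplace=True),
    )
    for d in (2, 4, 8, 16)
])
```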

3.3 Residual Blocks


Ordinary networks, like VGG [13], have no residual learning. Experience shows that
with the deepening of network depth, training errors will first decrease and then increase
(and it is proved that the increase of errors is not caused by overfitting but is difficult
to train due to the deepening of the network). Theoretically, the deeper the network, the
better. But, if there is no residual network, the deeper the network is, the more difficult
it is to train with an optimization algorithm. In fact, with the increase of network depth,
training errors will increase, which is described as network degradation.
Therefore, we add residual blocks to a complex generative network. This is achieved
by adding a short-circuit layer after the fourth convolutional layer and before the seventh.
The output of the fourth convolutional layer and the input of the seventh convolutional
layer are stitched together in the channel dimension as the input of the seventh convo-
lutional layer, forming a residual block. Similarly, three more residual blocks are added
after the dilated convolution. This ensures that the short-circuiting layer is disconnected
when the model is under-expressed to train more complex models. When the model
network degrades, the short-circuiting layer is encouraged to speed up the convergence
of the model.
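A hedged PyTorch sketch of this shortcut follows; the channel counts and the number of bridged layers are illustrative, not the paper's exact configuration:

```python
# Shortcut as described above: an earlier layer's output is concatenated
# channel-wise with the input of a later layer, which then consumes the
# doubled channel count.
import torch
import torch.nn as nn

class ConcatShortcut(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.mid = nn.Sequential(              # layers bridged by the shortcut
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.merge = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, x):
        y = self.mid(x)
        return self.merge(torch.cat([x, y], dim=1))  # channel-wise splice
```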

3.4 Loss Functions


Adversarial loss is used in both discriminator sections to optimize the model training
results. A weighted loss function of reconstruction loss (L2 loss) and adversarial loss is
used in the generator section to optimize the model training results. Reconstruction loss
is used to capture the overall structure of the region to be repaired and the consistency of
its context. In contrast, the adversarial loss makes the images look as realistic as possible.
Reconstruction Loss. Assuming that the two images are X1 , X2 , the reconstruction loss
is calculated as follows:
1 
H 
W
Lrec (X1 , X2 ) = (X1 (i, j)−X2 (i, j)) (1)
H ×W
i=0 j=0

Where H and W correspond to the height and width of the image.


Adversarial Loss
Ladv = max Ex∼pr (x) [log(D(x)) + log(1 − D(G((1 − M)  x)))] (2)
D
Where pr (x) is the distribution of the real image, and M is the mask.
Joint Loss.
L = λ1 Lrec + λ2 Ladv (3)
where λ1 and λ2 are the parameters for reconstruction loss and antagonistic loss,
respectively, in the experiments, λ1 = 0.998, λ2 = 0.002..
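A minimal sketch of the joint loss in Eq. (3) is shown below; it assumes the discriminator outputs a probability, uses the common non-saturating form of the generator's adversarial term, and the variable name d_fake is ours:

```python
# Weighted joint loss, Eq. (3), with lambda1 = 0.998 and lambda2 = 0.002.
import torch
import torch.nn.functional as F

def joint_loss(pred, target, d_fake, lam1=0.998, lam2=0.002):
    l_rec = F.mse_loss(pred, target)    # reconstruction (L2) loss, Eq. (1)
    # generator side of the adversarial game: push D's score toward "real"
    l_adv = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    return lam1 * l_rec + lam2 * l_adv
```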

4 Experiment
4.1 Datasets

To evaluate the effectiveness of the inpainting method in this paper, we used images
from two databases to train and evaluate our dual generator model: Paris StreetView and
CelebA. Paris StreetView contains 62,058 high-quality street view images with accurate
GPS coordinates and compass directions for each street view landmark. CelebA contains
202,599 face pictures of 10,177 celebrities. We set the mask to the central region and
the coverage to 25% of the whole image.

4.2 Parameter Setting

The experimental platform is a Windows 10 system with a programming environment of
Python 3.7 and PyTorch 1.8.1. The CPU is an Intel Core i9-10900K at 3.7 GHz, the RAM is
32 GB, and the GPU is an NVIDIA GeForce RTX 3080. The Adam [14] optimizer is used to
optimize the model. The generator is trained with a learning rate of 0.0002 until
convergence; the discriminator is trained with a learning rate of 0.002.

4.3 Experimental Results and Analysis

In this paper, the effectiveness is evaluated on the Paris StreetView and CelebA datasets. It
is also compared with representative image inpainting methods based on GAN in recent
years (generative adversarial model with context encoder and generative adversarial
model with contextual attention). This paper uses PSNR and SSIM for quantitative
evaluation.
The PSNR is calculated as follows:

$$PSNR(X_1, X_2) = 10 \lg\left(\frac{MAX_{X_1}^2}{MSE}\right) \quad (4)$$

where $X_1$ and $X_2$ are the two images and $MAX_{X_1}$ is the maximum possible pixel
value of the image. Each pixel is represented by 8 bits, so the maximum pixel value is
taken as 255, and $MSE$ is the mean squared error between $X_1$ and $X_2$.
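For reference, Eq. (4) transcribed directly into NumPy, assuming 8-bit images:

```python
# PSNR of two images per Eq. (4); max_val = 255 for 8-bit pixels.
import numpy as np

def psnr(x1, x2, max_val=255.0):
    mse = np.mean((x1.astype(np.float64) - x2.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```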
The model is trained and evaluated on the CelebA and Paris StreetView datasets with a
large amount of data. The average PSNR and SSIM of the model on the different datasets
are computed and compared with the context encoder-based and contextual attention-based
generative adversarial models. The average PSNR and SSIM of the three inpainting methods
are shown in Table 5 and Table 6.

Table 5. Comparison of SSIM, PSNR with different inpainting methods on PSV

Method SSIM PSNR Mean L2 Loss


Context Encoder 0.787 17.59 dB 2.47%
Contextual Attention 0.805 18.87 dB 2.21%
Our Model 0.816 20.93 dB 2.18%

Table 6. Comparison of SSIM, PSNR of different inpainting methods on CelebA

Method SSIM PSNR Mean L2 Loss


Context Encoder 0.820 19.23 dB 2.16%
Contextual Attention 0.824 19.70 dB 2.08%
Our Model 0.844 22.43 dB 2.01%

From Tables 5 and 6, the mean PSNR and SSIM of our method are higher than those of the
other two methods, and the experiments show that the restoration results of our
algorithm on the CelebA and PSV datasets are better than those of the other two methods.
The restoration results of our method on the Paris StreetView and CelebA datasets are
shown in Fig. 3. From Fig. 3, the Context Encoder-based method captures the textures
reasonably well but is too blurry because it is trained with an L2 loss. The Contextual
Attention-based method, despite its attention mechanism, does not restore facial
features well; even the glasses are not repaired. In contrast, the image restored by
our method matches the real image in color, structure, and style; the texture details
are reconstructed more accurately and clearly; the boundary between the missing area
and the background is natural and clear; the visual effect is consistent; and there is
no visible trace of restoration. This shows that our method performs well on the image
restoration task.

Fig. 3. Comparison of different restoration methods



5 Conclusion
We propose a GAN with dual discriminators fusing dilated convolution and residual
blocks. On top of its two generators and two discriminators, the network uses skip
connections in the rough repair network, so that contextual information is fused with
high-level semantic information to improve semantic consistency. We propose using
multiple dilated convolutional layers with different dilation rates in the two
generators to obtain varied perceptual fields and multi-scale information, and using
residual blocks in the refinement network, which to a certain extent avoids vanishing
gradients and slow convergence caused by model complexity. A weighted loss of
reconstruction loss and adversarial loss trains the generative network together with
the dual discriminator network to enhance the global and local semantic consistency of
the repaired region. Qualitative and quantitative analyses were conducted on the CelebA
and Paris StreetView datasets. The experimental results show that the model proposed in
this paper produces clear images, well-structured textures, and high semantic
consistency after inpainting.

Acknowledgments. This work is supported by grants from the Natural Science Foundation of Heilongjiang Province of China (No. LH2020F007), the Heilongjiang Philosophy and Social Sciences Project (Grant No. 18GLB029), and the Harbin University of Commerce Young Creative Talents Support Project (Grant No. 2019CX02).

References
1. Bertalmio, M., Sapiro, G., Caselles, V.: Image inpainting. In: Proceedings of the 27th Annual
Conference on Computer Graphics and Interactive Techniques, pp. 417–424. New Orleans,
LA, USA (2000)
2. Shen, J.H., Chan, T.F.: Mathematical models for local nontexture inpaintings. SIAM J. Appl.
Math. 62(3), 1019–1043 (2002)
3. Criminisi, A., Pérez, P., Toyama, K.: Region filling and object removal by exemplar-based
image inpainting. IEEE Trans. Image Process. 13(9), 1200–1212 (2004)
4. Huang, J., Kang, S., Narendra, A.: Image completion using planar structure guidance. ACM
Trans. Graph. 33(4), 1–10 (2014)
5. Pathak, D., Krähenbühl, P., Donahue, J.: Context encoders: feature learning by inpainting.
In: Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las
Vegas, NV, USA, pp. 2536–2544 (2016)
6. Goodfellow, I.J., Pouget-Abadie, J., Mirza, M.: Generative adversarial nets. In: Proceedings of
the 27th International Conference on Neural Information Processing Systems, pp. 2672–2680.
Montreal, QC, Canada (2014)
7. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In: ICLR (2016)
8. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
9. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning internal representations by error
propagation. In: Rumenhart, D.E., McCelland, J.L. (eds.) Parallel Distributed Processing:
Explorations in the Microstructure of Cognition, pp. 318–362. MIT Press, Cambridge (1986)

10. Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional
neural networks. In: NIPS. Curran Associates Inc. (2012)
11. Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion.
ACM Trans. Graph. 36(4), 107 (2017)
12. Yu, J.H., Lin, Z., Yang, J.M.: Generative image inpainting with contextual attention. In:
Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition,
Salt Lake City, UT, USA, pp. 5505–5514 (2018)
13. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. Computer Science (2014)
14. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. Computer Science (2014)
GrabCut Image Segmentation Based on Local
Sampling

Guilin Yao1,2(B) , Shirui Wu1,2 , Huixin Yang1,2 , and Shizhou Li1,2


1 Harbin University of Commerce, Harbin 150028, China
glyao@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. In traditional image segmentation, the GrabCut algorithm is a popular and
effective method. The current GrabCut algorithm is based on Gaussian mixture models of
the global foreground and background of the image, but it cannot achieve good results
when the foreground and background are similar. In this paper, we propose a method for
local sampling of the foreground and background. This method samples the foreground and
background around unknown pixels based on distance, which is an advantage when the
foreground and background are similar. The experimental results show that GrabCut with
local sampling achieves good results even when many colors appear in both the foreground
and the background.

Keywords: Image segmentation · Local sampling · GrabCut

1 Introduction
Image segmentation is an important part of computer vision. It is the process of
dividing the pixels of an image into several specific regions; its purpose is to
classify the pixels of the image. With the development of artificial intelligence,
image segmentation based on deep learning has been widely adopted. However, traditional
image segmentation methods still hold a place due to their own characteristics.
Traditional image segmentation includes the following methods: direct segmentation
based on image thresholds, segmentation based on growing seed pixel regions,
segmentation based on edge detection, segmentation based on graph theory, and so on.
The GrabCut algorithm is an image segmentation technique based on graph theory, whose
basic principles are introduced below.

2 Related Work
The image segmentation method based on graph theory [1] maps the image onto a
weighted undirected graph. Each pixel in the image is a node of the graph,

and adjacent pixels are connected by edges. The weight of an edge represents the
relationship between pixels. The cutting cost is represented by an energy function,
the sum of the weights of all edges in the cut. A cutting algorithm cuts the graph,
dividing it into different sets while minimizing the energy function. In Graph Cut,
two terminal vertices are additionally added, as shown in Fig. 1.

Fig. 1. A graph-cut has two terminal vertices, a source point and a sink point. Each vertex of the
image has an edge connected to the adjacent vertices and an edge connected to the source point
and the sink point. The blue line is the connection between each vertex and the source point. The
red line is the connection between each vertex and the sink point.

The source point can represent the foreground, and the sink point the background. The
weight of the connection between each vertex and a terminal vertex expresses the cost
of that vertex belonging to the foreground or the background. The connections between
vertices reflect the color relationships between pixels. If we can find a cut that
breaks these edges, the undirected graph is divided into two disjoint subsets: the
foreground and the background. We need to find a minimum cut with the smallest energy,
minimizing the sum of the weights of the cut edges.
In 2001, Boykov [2] proposed Graph Cut, a segmentation method based on graph theory.
In 2004, Rother [3] proposed the GrabCut algorithm based on Graph Cut. GrabCut is an
interactive image segmentation technique based on Graph Cuts. It extends graph cut from
gray histogram processing to RGB image processing. First, a GMM model [4] is used to
describe the pixel distribution of the image's global known foreground and background,
and the max-flow min-cut algorithm [5] is used to cut the image. Finally, multiple
iterations minimize the energy function to obtain the final segmentation result. The
algorithm achieves good results when the foreground and background color distributions
differ, but performs poorly when the color difference is small. Others have improved
the algorithm for this problem. Yi Congcong [6] first downsampled the image, calculated
GMM parameters with representative pixels, and then performed watershed segmentation [7]
to handle similar foreground and background colors, but morphological operations are
still needed. Ling Bin [8] used the image's depth information to construct superpixels
[9] with a linear iterative clustering algorithm and generate a structural graph model.
That algorithm improved segmentation efficiency and results, but it is not suitable for
images with little depth information.
This paper proposes a GrabCut that collects foreground and background samples by local
sampling and then performs local clustering. When collecting samples, we collect them
according to the distance from unknown pixels to known foreground and background pixels,
so that more attention is paid to the local information around unknown pixels, which to
a certain extent avoids the loss of color information when the foreground and background
colors are similar.

3 The Proposed Approach

When computing the GMM models of the foreground and background, the GrabCut algorithm
uses global foreground and background samples. In the GMM iteration, each pixel is
assigned to a Gaussian component of the foreground or background GMM, and the weights
and parameters of the GMMs are updated after each iteration. However, during clustering
and iteration, once some background or foreground pixel clusters are too close together,
unknown pixels are easily assigned to the wrong clusters, leading to inaccurate
classification. In the clustering process, if some background pixels differ little from
other background pixels or resemble foreground pixels, this difference information is
easily lost in the iterations. The Gaussian mixture model therefore focuses on the
overall color distribution and ignores local color information.
In this paper, a sampling method from [10] is adopted, and we propose a GrabCut method
based on local sampling. The idea is that, when judging whether an unknown pixel belongs
to the foreground or the background, pixels closer to it should be used as samples.
Therefore, each unknown pixel collects surrounding foreground or background samples as
a sample set for local clustering, instead of constructing GMMs of the global foreground
and background pixels. The calculation of the region term remains unchanged, and
max-flow min-cut can obtain the segmentation result without multiple iterations to
minimize the energy.
The search range is determined by the distance between the sample points and the
unknown points and by the number of sample points to collect. Since the color change
between adjacent pixels is not drastic, adjacent sample points can be collected at
intervals; if the color varies little, fewer sample points suffice to represent the
local color distribution. The color components in the clustering then better reflect
the details of local color changes in the image, and the local color information missed
by the original global GMM model is captured by the local clustering. This also means
that a foreground and background clustering is built for each unknown pixel, instead of
all unknown pixels sharing one foreground GMM and one background GMM as in the original
GrabCut algorithm. The Gaussian clustering model based on local sampling does not need
to recompute parameters many times; a single iteration yields good results.

3.1 Implementation of Local Sampling


The specific implementation of local sampling follows [11]. The range of collected
samples varies with the distance from unknown pixels to the known foreground and
background. Intuitively, the closer an unknown sample is to a known area, the more
likely it is related to the known samples. When searching for the nearest sample set,
we must first determine a search range. We use the known foreground or background
sample points within a rectangular window around each unknown pixel. The search range
is determined by the number of sample points to collect; in the experiments we set it
to 100. We also collect sample points at an interval. To avoid restricting sampling to
the border of the known area, our search scope includes samples near the boundary as
well as deep inside the known area, so as to generate a more comprehensive sample set.
Figure 2 shows the global sampling of GrabCut and the local sampling used in this paper.

Fig. 2. Schematic diagram of global and local sampling; the red dots represent sampling
points. The left figure shows the global sampling adopted by the GrabCut algorithm,
which uses all known sample points across the whole image. The right figure shows the
local sampling adopted by our method, which samples in a local range around unknown
pixels without using all known sample points.

Local sampling ensures that each color distribution within the sampling range is
represented in the sample set. We can also weight the sample points by distance: for
unknown pixels close to the boundary, the closer a known pixel is to the boundary, the
higher its correlation with the unknown pixel; conversely, the farther a known region
is from the boundary, the harder it is for its pixels to correlate with any unknown
pixel. When we want to collect more comprehensive samples, we follow the above
sampling principles.
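A rough sketch of this sampling is given below; the initial window size, its doubling policy, and the stride are illustrative assumptions around the 100-sample target stated above:

```python
# Collect known foreground/background samples in a growing square window
# around an unknown pixel, at a fixed interval (stride), until enough known
# points are found.
import numpy as np

def local_samples(labels, p, want=100, stride=2, max_half=200):
    """labels: 2D array, 1 = foreground, 0 = background, -1 = unknown.
    p: (row, col) of the unknown pixel. Returns fg and bg coordinates."""
    h, w = labels.shape
    half = 10
    while half <= max_half:
        r0, r1 = max(0, p[0] - half), min(h, p[0] + half + 1)
        c0, c1 = max(0, p[1] - half), min(w, p[1] + half + 1)
        win = labels[r0:r1:stride, c0:c1:stride]
        rows, cols = np.nonzero(win >= 0)          # known pixels only
        if rows.size >= want:
            coords = np.stack([r0 + rows * stride, c0 + cols * stride], axis=1)
            vals = labels[coords[:, 0], coords[:, 1]]
            return coords[vals == 1], coords[vals == 0]
        half *= 2                                   # enlarge the search range
    return None, None
```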

3.2 Implementation of Color Clustering


The binary tree color quantization method of M. T. Orchard [12] is used when clustering
the collected sample colors. The color samples obtained by local sampling are divided
into different color clusters through a binary tree structure.
First, the sample colors, as the parent node, are divided to obtain two child nodes,
and the child nodes are divided in turn: like its parent node, each node is split into
two groups. By growing such a binary tree, the pixels in the sample colors can be
represented by the pixels in the leaf nodes. The specific color separation method is
described below.
Suppose the pixel group of the initial sample is denoted $C_n$ and $x_s$ is a sample
pixel; the $3 \times 3$ matrix $R_n$ and the $3 \times 1$ vector $m_n$ are given by the
following equations:

$$R_n = \sum_{s \in C_n} x_s x_s^t \quad (1)$$

$$m_n = \sum_{s \in C_n} x_s \quad (2)$$

$$N_n = |C_n| \quad (3)$$

Since the mean of the group is the point of least squared deviation, the group is
quantitatively represented by its mean $q_n$:

$$q_n = \frac{m_n}{N_n} \quad (4)$$

Define the covariance of this pixel group as

$$\bar{R}_n = R_n - \frac{1}{N_n} m_n m_n^t \quad (5)$$
According to the binary tree structure, the pixel group is split into two groups,
analogous to the two children of the parent node. The intuitive idea is to select the
plane that best divides the color group. Therefore, we determine the direction of the
largest color change and split the group with the plane perpendicular to that direction,
passing through the group mean. The direction vector $e$ of the pixel group satisfies

$$\sum_{s \in C_n} \big((x_s - q_n)^t e\big)^2 = e^t \bar{R}_n e \quad (6)$$

Since the direction of the principal eigenvector of a matrix is the direction of its
largest variation, the split direction $e_n$ is the principal eigenvector of $\bar{R}_n$,
and $\lambda_n$ is the corresponding largest eigenvalue:

$$\lambda_n = \sum_{s \in C_n} \big((x_s - q_n)^t e_n\big)^2 \quad (7)$$

Once $e_n$ is found, $C_n$ is divided into two subsets $C_{2n}$ and $C_{2n+1}$:

$$C_{2n} = \{ s \in C_n : e_n^t x_s \le e_n^t q_n \} \quad (8)$$

$$C_{2n+1} = \{ s \in C_n : e_n^t x_s > e_n^t q_n \} \quad (9)$$

Thus $C_n$ has two child nodes $C_{2n}$ and $C_{2n+1}$. $R_{2n}$, $m_{2n}$, and $N_{2n}$
are calculated according to Eqs. (1), (2), (3), while $R_{2n+1}$, $m_{2n+1}$, and
$N_{2n+1}$ follow from

$$R_{2n+1} = R_n - R_{2n} \quad (10)$$

$$m_{2n+1} = m_n - m_{2n} \quad (11)$$

$$N_{2n+1} = N_n - N_{2n} \quad (12)$$

In this way, the sample colors are divided into two maximally disjoint pixel groups.
Repeating the above steps divides them into further pixel groups, forming Gaussian
clusters of different pixel colors, as shown in Fig. 3.

Fig. 3. After the above steps, pixel groups with similar colors are assigned to the
same Gaussian cluster. The red points represent sample points, and the blue ellipses
represent different Gaussian clusters.

Finally, all leaf nodes of the binary tree are clusters of sample pixels, and the local
Gaussian clustering of the sample colors is obtained.
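The split of Eqs. (1)-(9) reduces to an eigendecomposition; the following is a compact NumPy sketch, using the library covariance rather than accumulating $R_n$ and $m_n$ incrementally:

```python
# One binary split of a color cluster: cut by the plane through the mean,
# perpendicular to the principal eigenvector of the cluster covariance.
import numpy as np

def split_cluster(X):
    """X: (N, 3) array of RGB samples in one cluster; returns two subsets."""
    q = X.mean(axis=0)                    # cluster mean, Eq. (4)
    cov = np.cov(X, rowvar=False)         # covariance, cf. Eq. (5)
    eigvals, eigvecs = np.linalg.eigh(cov)
    e = eigvecs[:, -1]                    # principal eigenvector, Eqs. (6)-(7)
    left = X @ e <= q @ e                 # Eq. (8)
    return X[left], X[~left]              # C_2n and C_2n+1, Eq. (9)
```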

3.3 Implementation of GrabCut Algorithm


In the previous section, local Gaussian clustering was used to obtain the Gaussian
mixture model of the sample colors. GrabCut then defines the Gibbs energy function

$$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z) \quad (13)$$

In Eq. (13), $U$ is the data term, representing the cost of an unknown pixel belonging
to the foreground GMM or the background GMM. The cost is obtained from the Gaussian
mixture model and can be expressed as the probability that each unknown pixel is
assigned a color component belonging to the foreground or background GMM:

$$U(\alpha, k, \theta, z) = \sum_n D(\alpha_n, k_n, \theta, z_n) \quad (14)$$

$$D(\alpha_n, k_n, \theta, z_n) = -\log \pi(\alpha_n, k_n) + \frac{1}{2} \log \det \Sigma(\alpha_n, k_n) + \frac{1}{2} [z_n - \mu(\alpha_n, k_n)]^T \Sigma(\alpha_n, k_n)^{-1} [z_n - \mu(\alpha_n, k_n)] \quad (15)$$

$$\theta = \{ \pi(\alpha, k), \mu(\alpha, k), \Sigma(\alpha, k) \}, \quad \alpha = 0, 1, \quad k = 1 \ldots K \quad (16)$$

In the above equations, $\pi$ are the weights, $\mu$ the means, and $\Sigma$ the
covariances of the Gaussian components.
$V$ in Eq. (13) is the smoothness term, which represents the neighborhood relationship
of pixels; it is obtained from the Euclidean distance between the colors of adjacent
pixels:

$$V(\alpha, z) = \gamma \sum_{(m,n) \in C} [\alpha_n \ne \alpha_m] \exp\big(-\beta \|z_m - z_n\|^2\big) \quad (17)$$

After several iterations, the parameters of the Gaussian mixture model are optimized,
and the energy function is minimized by max-flow min-cut:

$$\min_{\{\alpha_n : n \in T_U\}} \min_{k} E(\alpha, k, \theta, z) \quad (18)$$

At this point, minimizing the energy function assigns each unknown pixel a label
belonging to the foreground or background, yielding the image segmentation result.
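For one pixel color z and one Gaussian component (weight pi, mean mu, covariance sigma), the data term of Eq. (15) can be evaluated as in the sketch below; the variable names are ours:

```python
# Data term D of Eq. (15) for a single pixel and Gaussian component.
import numpy as np

def data_term(z, pi, mu, sigma):
    diff = z - mu
    sign, logdet = np.linalg.slogdet(sigma)   # stable log-determinant
    return (-np.log(pi)
            + 0.5 * logdet
            + 0.5 * diff @ np.linalg.inv(sigma) @ diff)
```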

4 Experimental Results and Analysis

Our experimental platform is Windows 10 Home edition with Microsoft Visual Studio 2019
and OpenCV 4.3.0; the CPU is an Intel(R) Core(TM) i7-10750H at 2.60 GHz. Following [13],
our method is implemented in C++ and named GrabCutLocal. The experiments are conducted
on the Berkeley dataset [14]. To compare the improved algorithm with the original
algorithm, an intuitive comparison was made after the experiments; the results are shown
in Fig. 4.
It can be seen that GrabCut has defects when segmenting the corners in image (a), while
GrabCutLocal segments the edges and corners relatively completely. In (b), GrabCut omits
part of the tail, while GrabCutLocal segments the entire tail. In (c), GrabCut segments
the helmet and shoes poorly. In (d), part of the background is retained around the nose,
and in (e) part of the head is treated as background. Therefore, when segmenting these
images, GrabCutLocal can clearly distinguish

the foreground from the background. Since the foreground and background colors are
similar in the above images, the GrabCutLocal method shows obvious advantages.
Objectively, the Sum of Absolute Differences (SAD) is used to evaluate the error, where
$m$ and $n$ are the width and height of the image, $E(i, j)$ is the result of the
algorithm's segmentation, and $H(i, j)$ is the ground-truth segmentation provided by
the dataset:

$$SAD = \sum_{i=1}^{m} \sum_{j=1}^{n} |E(i, j) - H(i, j)| \quad (19)$$
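Equation (19) is a one-liner in NumPy:

```python
# SAD between a segmentation result E and the ground truth H, Eq. (19).
import numpy as np

def sad(E, H):
    return np.abs(E.astype(np.float64) - H.astype(np.float64)).sum()
```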

Fig. 4. Five typical images from the experimental results: (a), (b), (c), (d), and (e)
are five sets of images. The first row is the original image in the dataset; the second
row is the trimap; the third row is the segmentation result of the original GrabCut;
the fourth row is the segmentation result of our GrabCutLocal. The red circles highlight
clear contrasts between the two algorithms.

Based on the results of running the GrabCut and GrabCutLocal algorithms on the
experimental dataset, the SAD is calculated by Eq. (19). The SAD of the above images
and the average SAD over the entire dataset are shown in Table 1. In Table 1, the SAD
between GrabCut and the ground truth is significantly larger than that of our
GrabCutLocal method. The foreground and background colors of the above images are very
similar, with many overlapping colors, so GrabCutLocal proves more accurate. In summary,
the data in Table 1 show that under the same trimap, the SAD of the GrabCutLocal method
is smaller, and its segmentation is better than that of the original GrabCut method.

Table 1. Images SAD and average SAD of datasets.

Image GrabCut GrabCutLocal


Book 6.88 6.31
326038 5.70 3.09
376043 6.48 2.91
Ceramic 2.59 0.95
Average 3.30 2.72

5 Conclusion
Improving on the GrabCut image segmentation algorithm, this paper proposes a GrabCut
algorithm based on local sampling: foreground and background samples come from local
sampling around unknown pixels, instead of taking the global foreground and background
as samples. Because it models the global foreground and background, the GrabCut
algorithm pays more attention to the overall color information and easily ignores local
color information. GrabCut achieves good results when the foreground and background
differ significantly, but is not ideal when they are similar and many colors overlap.
The proposed locally sampled GrabCut uses local samples for clustering to capture more
color detail and avoid losing color information. The experimental results show that,
especially when the foreground and background colors are similar, the local sampling
GrabCut algorithm achieves better results than the original algorithm. The method
effectively improves the accuracy of GrabCut image segmentation and has practical
application value.

Acknowledgment. This work is supported by Heilongjiang Provincial Natural Science Foundation of China (No. LH2021F034) and the Youth Innovation Talent Support Program of Harbin University of Commerce (No. 2020CX39).

References
1. Srinivas, B., Manjunathachari, K.: New approach for image segmentation based on graph
cuts. Int. J. Signal Process. Image Process. Pattern Recogn. 10(1), 119–130 (2017)
2. Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts.
In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 1,
pp. 377–384. IEEE, Kerkyra (1999)
3. Rother, C., Vladimir, K., Blake, A.: “GrabCut”: Interactive foreground extraction using
iterated graph cuts. ACM Trans. Graph. 23(3), 309–314 (2004)
4. Blake, A., Rother, C., Brown, M., Perez, P., Torr, P.: Interactive image segmentation using
an adaptive gmmrf model. In: Pajdla, T., Matas, J. (eds.) ECCV 2004. LNCS, vol. 3021,
pp. 428–441. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24670-1_33

5. Leighton, T., Rao, S.: Multicommodity max-flow min-cut theorems and their use in designing
approximation algorithms. J. ACM 46(6), 787–832 (1999)
6. Yi, C., Wu, B., Zhang, H.: An improved grabcut image segmentation method. J. Chin. Comput.
Syst. 35(5), 1164–1168 (2014)
7. Xu, Q., Guo, M., Wang, Y.: Fast color image segmentation based on watershed transform and
Graph cuts. Comput. Eng. 35(19), 210–212 (2009)
8. Ling, B., Guo, Y.: GrabCut image segmentation combing color and information. Comput.
Appl. Softw. 37(8), 188–193 (2020)
9. Hu, Z., Guo, M.: Fast segmentation in color image based on SLIC and GrabCut. Comput.
Eng. Appl. 52(2), 186–190 (2016)
10. Yao, G., Zhao, Z., Liu, S.: A comprehensive survey on sampling-based image matting.
Comput. Graph. Forum 36(8), 613–628 (2017)
11. Shahrian, E., Rajan, D., Price, B., Cohen, S.: Improving image matting using comprehensive
sampling sets. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 636–
643. IEEE, Portland (2013)
12. Orchard, M., Bouman, C.: Color quantization of images. IEEE Trans. Signal Process. 39(12),
2677–2690 (1991)
13. Sun, F., Zhang, H.: Studying and implementing GrabCut-digital matting algorithm. J.
TIANJIN Univ. Technol. 24(2), 42–45 (2008)
14. The Berkeley Segmentation Dataset and Benchmark Homepage. https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/. Accessed 27 Oct 2020
Multi-angle Face Recognition Based on GMRF

Sun Huadong1,2 , Zhao Pengfei1(B) , and Zhang Yingjing1


1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. General face recognition methods mostly use frontal face images or images
with only a small deflection angle, for which databases are well established. But in
real life, faces are often not captured frontally, especially in surveillance and public
security monitoring, and existing multi-angle face recognition performs poorly. This
paper proposes a face recognition method based on Gaussian Markov Random Fields (GMRF).
The GMRF model is a statistical probability model that can effectively extract image
texture information. The method first divides the face image into sub-blocks under
several different blocking schemes; then, for each sub-block, it extracts GMRF features
after a wavelet transformation; finally, it combines the GMRF features of the different
blocking schemes and classifies them with an SVM using a Gaussian kernel. Experiments on
a self-built dataset show that the proposed method reaches 98.83% face recognition
accuracy.

Keywords: Multi-angle face recognition · Gaussian Markov Random Fields · Image block · SVM

1 Introduction
With the development of computer vision technology and pattern recognition, face
recognition has become a research hotspot, because the face enables a relatively
unobtrusive kind of non-contact recognition. However, in extreme environments, such as
surveillance, security, and other unconstrained settings, the recognition performance
still needs to be improved.
Traditional face recognition algorithms are mostly based on hand-designed features and
machine learning algorithms, such as Local Binary Patterns (LBP) [1], Histogram of
Oriented Gradients (HOG), Principal Component Analysis (PCA) [2], and Linear
Discriminant Analysis (LDA). At the same time, the available face databases were
limited and the face types too uniform, so the accuracy of early face recognition
algorithms in practical applications was not high. The rapid development of computer
hardware and software provides a new approach to face recognition: deep learning based
on convolutional neural networks. A convolutional neural network (CNN) uses a
multi-layer neural network composed of basic neural units to learn target features by
simulating the way the human brain learns [3].


Although deep learning has improved the accuracy of face recognition algorithms, many
problems remain in practical applications. In uncontrolled environments, especially in
surveillance video, most captured face images suffer from low resolution, occlusion,
large illumination changes, expression changes, head pose changes, and so on. The
diversity of head poses strongly interferes with face recognition algorithms: most face
images captured by surveillance video involve pitch changes, left-right rotation, or
tilting. Even when a deep neural network is used to extract features from such
"non-standard" faces, the gap between these features and those of a "standard" frontal
face is large, so they fall into the wrong location in feature space, which in turn
leads to misjudgment in identification and verification [4]. These interference factors
reduce the accuracy of face recognition algorithms and pose a difficult problem for
applying face recognition technology. Therefore, studying multi-angle face recognition
algorithms has strong practical prospects and research value.
This paper proposes a multi-angle face recognition method based on Gaussian Markov
Random Fields (GMRF). The Gaussian Markov random field model is widely used in image
processing; it expresses the texture information of the image well and has good spatial
correlation. This paper adopts a GMRF feature extraction method combining multi-block
features, and a Gaussian-kernel SVM is used for face recognition classification. The
recognition results on a self-built dataset show the effectiveness of this method for
multi-angle face recognition.

2 Face Dataset and Preprocessing

The self-built database is shot in a laboratory environment. The poses include −90°,
−60°, −30°, 0°, 30°, 60°, and 90°, with variations in clothing, lowered heads, glasses,
hats, and expressions, as shown in Fig. 1.

Fig. 1. Attitude database from −90 to 90°.

When the self-built database was established, there were 10 different subjects, each
with 30 pictures at different angles and in different poses (as shown in Fig. 2),
divided in a fixed proportion. After extracting features according to their respective
labels, two-thirds of each subject's images are used as the training set and one-third
as the test set.

Fig. 2. Faces in datasets.

First, according to the properties of Gaussian Markov random field features, the face
image is divided into 9, 16, or 25 blocks (Fig. 3); then the R, G, and B channels of
the image are extracted (as shown in Fig. 4).

Fig. 3. Faces divided into 9, 16, and 25 pieces (right to left).

Fig. 4. Three-channel example of R, G, and B (from right to left).
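This preprocessing can be sketched as follows; this is a minimal illustration, the function name is ours, and any remainder pixels when the image size is not divisible by the grid are simply dropped:

```python
# Split a face image into an n x n grid of blocks (n = 3, 4, or 5 for
# 9/16/25 blocks) and separate the R, G, B channels of each block.
import numpy as np

def blocks_and_channels(img, n):
    """img: (H, W, 3) array; returns a list of blocks, each as [R, G, B]."""
    h, w = img.shape[0] // n, img.shape[1] // n
    blocks = [img[i*h:(i+1)*h, j*w:(j+1)*w] for i in range(n) for j in range(n)]
    return [[b[:, :, c] for c in range(3)] for b in blocks]
```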

3 Feature Extraction
3.1 Gaussian Markov Random Field Model
The Gaussian Markov random field model is one of the two branches of Markov random
fields. Assuming that the excitation noise has a Gaussian distribution, a difference
equation can be obtained in which the gray level of a spatial pixel is expressed in
terms of its neighbors; this model is therefore called the Gaussian Markov random field
model. In practical applications, compared with the Markov random field and Gibbs
distribution models, the Gaussian Markov random field model has the advantage of a
small amount of calculation, so it has been widely used. Given the real-time
requirements of multi-angle face recognition, this paper extracts the Gaussian Markov
random field features of the image to classify and recognize multi-angle face images [5].
The Gaussian Markov random field model is a stationary autoregressive process: the
covariance matrix is positive definite, the neighborhood system is symmetric, and the
parameters of symmetric neighborhood points are equal. When Gaussian Markov random
field features are used to express the texture characteristics of an image, they can
be expressed in the form of the conditional probability

$$P\big(y(s) \mid y(s + r), \; r \in N_{ei}\big) \quad (1)$$

where $N_{ei}$ is a symmetric neighborhood centered at $s$ with radius $r$ (the
neighborhood does not include the center point $s$ itself). This formula expresses that
the gray level $y(s)$ of any pixel $s$ in the image is a function of the gray levels of
neighboring points in all directions around $s$. The neighborhood relationship of the
Gaussian Markov random field model is represented by the structure diagram shown in
Fig. 5 [6].

5 4 3 4 5

4 2 1 2 4

6 3 1 S 1 3 6

4 2 1 2 4

5 4 3 4 5

Fig. 5. Gauss Markov random field structure diagram.

Suppose $S$ is the set of points on an $M \times M$ lattice, $S = \{(i, j) : 1 \le i, j \le M\}$,
and assume that the given texture $y(s), s \in S$, is a Gaussian random process with
zero mean. The GMRF model can then be represented by a linear equation containing
multiple unknown parameters:

$$y(s) = \sum_{r \in N_S} \theta_r \max\big(y(s + r), y(s - r)\big) + e(s) \quad (2)$$

where $N_S$ is the GMRF neighborhood of the point $s$, $\theta_r$ are the coefficients,
and $e(s)$ is a Gaussian noise sequence with zero mean. Write (2) as

$$y(s) = \sum_{r \in N_S} \theta_r \, y_1(s + r) + e(s) \quad (3)$$

where $y_1(s + r)$ is the point with the larger value in the closed ring area around
$s$. Applying formula (3) to each point in $S$ gives $M^2$ equations in $\{e(s)\}$ and
$\{y(s)\}$:

$$y(1, 1) = \sum_{r \in N_S} \theta_r \, y_1((1, 1) + r) + e(1, 1)$$
$$y(1, 2) = \sum_{r \in N_S} \theta_r \, y_1((1, 2) + r) + e(1, 2)$$
$$\vdots$$
$$y(1, M) = \sum_{r \in N_S} \theta_r \, y_1((1, M) + r) + e(1, M)$$
$$\vdots$$
$$y(M, 1) = \sum_{r \in N_S} \theta_r \, y_1((M, 1) + r) + e(M, 1) \quad (4)$$
$$\vdots$$
$$y(M, M) = \sum_{r \in N_S} \theta_r \, y_1((M, M) + r) + e(M, M)$$

Writing all the equations formed by $y_1(s + r)$ in matrix form gives

$$y = Q^T \theta + e \quad (5)$$

Equation (5) is the linear model of the Gaussian Markov random field, where $Q^T$ is
the matrix collecting all the $y_1(s + r)$ terms and $\theta$ is the parameter vector of
the model to be estimated. Using the least-squares error criterion, the estimate is

$$\hat{\theta} = (Q Q^T)^{-1} Q y \quad (6)$$

The linear autoregressive GMRF model of relatively low order is convenient to analyze
and compute but has limitations when describing complex image features. As the order
increases, the amount of calculation grows, but richer texture information of the image
can be described. Choosing the best order can effectively reflect the texture
characteristics of the image. This paper selects the second-order, fourth-order, and
fifth-order Gaussian Markov Random Field models for multi-angle face recognition.

3.2 Face Feature Extraction


Daubechies wavelet is a wavelet function constructed by the world-renowned wavelet
analysis scholar Ingrid Daubechies (generally transliterated as Ingrid Dobessie). We
generally abbreviate it as dbN, where N is the order of the wavelet. The support width
of the wavelet function ψ(t) and the scaling function ϕ(t) is 2N − 1, and the vanishing
moment of ψ(t) is N. The dbN wavelet has good regularity, that is, the smoothing error
introduced by using the wavelet as a sparse basis is not easily noticed, which makes the
signal reconstruction process smoother. The characteristic of the dbN wavelet is that as
the order N increases, the vanishing moment order becomes larger; the higher the
vanishing moment, the better the smoothness, the stronger the localization ability in the
frequency domain, and the better the frequency-band division effect, but the tightness
in the time domain weakens, the amount of calculation increases greatly, and the
real-time performance worsens. Figures 6, 7 and 8 show the db4 wavelet [7].
First, decompose the blocked image into R, G, B three-channel maps; the wavelet
transform of each image then yields four coefficient sub-bands of different scales: one
low-frequency signal and three high-frequency signals. This article takes the three
high-frequency detail signals of the image, namely the horizontal detail coefficients, the
vertical detail coefficients and the diagonal detail coefficients. Then perform Gauss Markov random field
feature extraction for each coefficient and integrate it into a large feature vector. This
completes the feature extraction.
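As an illustration of this pipeline, the following hedged sketch uses PyWavelets (pywt) for the single-level db4 transform and reuses the hypothetical estimate_gmrf_params helper from the previous sketch; the function name block_features and the zero-mean normalization of each sub-band are our assumptions:

```python
import numpy as np
import pywt  # PyWavelets

def block_features(block_rgb, offsets):
    """Feature vector for one image block: per R/G/B channel, a single-level
    db4 wavelet transform, then GMRF features of the three detail sub-bands
    (horizontal, vertical, diagonal), concatenated into one vector."""
    feats = []
    for c in range(3):                               # R, G, B channels
        channel = block_rgb[..., c].astype(float)
        _, (cH, cV, cD) = pywt.dwt2(channel, 'db4')  # detail coefficients
        for band in (cH, cV, cD):
            band = band - band.mean()                # zero-mean assumption
            feats.append(estimate_gmrf_params(band, offsets))
    return np.concatenate(feats)  # e.g. 3 channels x 3 bands x 4 = 36 dims
```

With 9 blocks per image and the second-order model this yields 9 × 36 = 324 dimensions, matching the RGB figures reported later in the paper.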

4 Classification and Recognition


In 1995, Cortes proposed the support vector machine (SVM) for limited samples. SVM
was developed based on statistical learning theory and structural risk minimization. It has

Fig. 6. Scale equations phi, wavelet function psi.

Fig. 7. Decomposition low-pass filter, decomposition high-pass filter.

Fig. 8. Reconstruction low-pass filter, reconstruction high-pass filter.

good generalization ability and is extremely suitable for small sample classification.
The significance of the method is to find the optimal solution under limited sample
data and convert the task into a binary classification problem to obtain the theoretically
optimal solution. Compared with other traditional algorithms, SVM obtains the optimal
solution from a global perspective, thereby avoiding local extrema, and it is less
sensitive to the curse of dimensionality. The main explanation for this attractive
property is that, because of the margin-maximization principle on which SVMs are
based, it is not necessary to explicitly estimate the statistical distribution of classes in
the high-dimensional feature space to perform classification tasks. Another important
feature is their good generalization ability, supported by the sparse representation of the decision
function. SVM algorithm has a wide range of applications in image processing, computer
vision, speech recognition, etc. [8]. The basic idea of the SVM algorithm: first map the
data to be classified to a high-dimensional space through the constructed kernel function,
and then find a linear function in the high-dimensional space, that is, a hyperplane, so that
it can accurately classify the two types of samples. The point closest to the hyperplane
is the support vector, and the support vector is used to classify the sample.

4.1 Linear Support Vector Machine


SVM basic principle: given a training data set D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)},
i = 1, 2, ..., m, y_i ∈ {−1, 1}, where D contains m samples, (x_i, y_i) represents a sample, x_i
is the i-th feature vector, and y_i is the class label. SVM needs to solve for a hyperplane in
the data space to achieve accurate division of the training set. The calculation formula
of the hyperplane:

f (x) = wT x + b = 0 (7)

Where w defines the direction of the hyperplane, which is the normal vector, and b
defines the distance between the origin and the hyperplane, which is the displacement
term. The distance from any point x in the sample space to the hyperplane f(x) is:

|w^T x + b| / ‖w‖   (8)

If the hyperplane f (x) can accurately classify the training samples, for (xi , yi ) ∈ D,
let
w^T x_i + b ≥ +1,  y_i = +1;
w^T x_i + b ≤ −1,  y_i = −1.   (9)

When some sample vectors in the d-dimensional sample space satisfy Eq. (9) and their
distance from the hyperplane is the smallest, these vectors are support vectors; all
support vectors form a positive-class support surface and a negative-class support
surface, and the sum of the distances from the two surfaces to the hyperplane f(x) is:

2 / ‖w‖   (10)

The algorithm seeks the optimal classification interface, that is, it satisfies:

max_{w,b} 2 / ‖w‖   (11)

which is equivalent to finding the maximum-margin hyperplane via min_{w,b} (1/2)‖w‖².
In order to solve it, it is transformed into a Lagrange function:
L(w, b, α) = (1/2)‖w‖² + Σ_{i=1}^m α_i (1 − y_i (w^T x_i + b))   (12)

Where α is the Lagrange factor, and the partial derivatives of w and b are respectively
calculated:

w = Σ_{i=1}^m α_i y_i x_i   (13)

0 = Σ_{i=1}^m α_i y_i   (14)

Substituting formulas (13) and (14) into formula (12), the dual problem can be obtained:
max_α Σ_{i=1}^m α_i − (1/2) Σ_{i=1}^m Σ_{j=1}^m α_i α_j y_i y_j x_i^T x_j   (15)

The constraints are:


Σ_{i=1}^m α_i y_i = 0   (16)

and α_i ≥ 0, i = 1, 2, ..., m. We finally get:

f(x) = Σ_{i=1}^m α_i y_i x_i^T x + b   (17)

The above formula needs to meet KKT (Karush-Kuhn-Tucker) conditions:



α_i ≥ 0;
y_i f(x_i) − 1 ≥ 0;   (18)
α_i (y_i f(x_i) − 1) = 0.

For any sample (x_i, y_i), either α_i = 0 or y_i f(x_i) = 1.
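A small illustration of these ideas with scikit-learn: a linear SVM fit to toy two-class data (standing in for GMRF feature vectors), from which the hyperplane parameters of Eq. (7), the support vectors, and the margin of Eq. (10) can be read off. This is a sketch under our own toy data, not the paper's code:

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data standing in for GMRF feature vectors (hypothetical).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.4, (40, 2)), rng.normal(1.0, 0.4, (40, 2))])
y = np.array([-1] * 40 + [1] * 40)

clf = SVC(kernel='linear', C=1.0).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]   # hyperplane f(x) = w^T x + b
margin = 2.0 / np.linalg.norm(w)         # Eq. (10): width of the margin
print('support vectors:', len(clf.support_vectors_))
print('margin width  :', margin)
```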

4.2 Non-linear Support Vector Machine

A non-linear problem in the original space cannot be solved in the usual way. To make
the original function linearly separable, a mapping is used to convert the non-linear
original function in the low-dimensional space into a sample function in a
high-dimensional space, and then the kernel function and the transformed function
equation are used to solve it. The commonly used kernel functions are shown in Table 1:
Let K be the inner product of the mapping function Φ, satisfying K(x, z) = Φ(x) · Φ(z);
the functional equation of the regression support vector machine is:

y = f(x) = w · Φ(x) + b   (19)

The established Lagrange function is:


L(w, b, ξ, ξ*, α, α*, γ, γ*) = ‖w‖²/2 + C Σ_{i=1}^k (ξ_i + ξ_i*)
    − Σ_{i=1}^k α_i [ξ_i + ε − y_i + (w · x_i + b)]
    − Σ_{i=1}^k α_i* [ξ_i* + ε + y_i − (w · x_i + b)]
    − Σ_{i=1}^k (ξ_i γ_i + ξ_i* γ_i*)   (20)

where α_i, α_i*, γ_i, γ_i* ≥ 0, i = 1, 2, ..., k.

Table 1. Common kernel functions.

Kernel function        Expression                                   Parameter
Linear kernel          k(x_i, x_j) = x_i^T x_j                      None
Polynomial kernel [8]  k(x_i, x_j) = (x_i^T x_j)^d                  d > 1
Gaussian kernel        k(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²))     σ > 0
Laplace kernel         k(x_i, x_j) = exp(−‖x_i − x_j‖ / σ)          σ > 0
Sigmoid kernel         k(x_i, x_j) = tanh(β x_i^T x_j + θ)          β > 0, θ < 0

The dual problem is:


max_{α,α*} Σ_{i=1}^k y_i (α_i − α_i*) − ε Σ_{i=1}^k (α_i + α_i*)
    − (1/2) Σ_{i=1}^k Σ_{j=1}^k (α_i − α_i*)(α_j − α_j*) K(x_i * x_j)   (21)

Therefore, the nonlinear regression function is:

f(x) = Σ_{i=1}^k (α_i − α_i*) K(x_i * x) + b   (22)
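For a concrete sense of the kernel trick, the sketch below fits an RBF-kernel SVM to synthetic ring-shaped data that no linear hyperplane can separate; the data generation and hyperparameters are illustrative assumptions. Note that scikit-learn parameterizes the Gaussian kernel of Table 1 as exp(−γ‖x_i − x_j‖²), so γ plays the role of 1/(2σ²):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for feature vectors: two concentric rings that a
# linear SVM cannot separate but an RBF kernel can.
rng = np.random.default_rng(1)
radii = np.concatenate([rng.normal(1.0, 0.1, 100), rng.normal(3.0, 0.1, 100)])
angles = rng.uniform(0, 2 * np.pi, 200)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
y = np.array([0] * 100 + [1] * 100)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)

model = make_pipeline(StandardScaler(), SVC(kernel='rbf', gamma=1.0, C=10.0))
model.fit(X_tr, y_tr)
print(model.score(X_te, y_te))  # near 1.0 on this toy problem
```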

5 Experimental Results and Analysis


A self-built multi-angle face data set was used to carry out the face recognition
classification experiment. The self-built database has a total of 300 face images divided
into ten categories; each category contains profile images from 15° to 90° for each
expression, and the image size is 244*244 pixels. From the self-built face data set,
200 pictures are selected for training and 100 pictures for testing; each set contains
faces from different angles. The ratio of training data to test data is 2:1, which is in
line with the normal test ratio of an experiment.
In this experiment, the second-order, fourth-order, and fifth-order Gauss Markov ran-
dom field models have different feature dimensions: four-dimensional, ten-dimensional,
and twelve-dimensional. Therefore, the dimensionality of the feature vector extracted
under different block methods and different orders of Gauss Markov random field fea-
tures is different. For example, the image is divided into 9 sub-blocks, each of which
extracts the characteristics of the second-order Gauss Markov random field, which are
stitched together into a 108-dimensional vector; the image is divided into 16 sub-blocks,
each of which extracts the features of the fourth-order Gauss Markov random field,
which are stitched together into a 480-dimensional vector; and the image is divided into
25 sub-blocks to extract the features of the fifth-order Gauss Markov random field,
giving a 900-dimensional feature vector. The experimental results are shown in Fig. 9:

Fig. 9. The recognition rate of different orders and different blocks.
Experiments show that, for blocking, the more blocks, the worse the effect; the reason
is clear: the features obtained from overly many blocks are not distinctive, and the
redundant feature extraction harms the recognition effect. Regarding the order of the
Gauss Markov random field, order 5 has the highest recognition rate.
Since the previous experiments were performed under the condition of grayscale
images, the color channel information was ignored; therefore, the RGB three-channel
experiment was added to the following experiment; that is, the wavelet transform is per-
formed on the block images of each channel, Gauss Markov random field feature
extraction is performed on the resulting coefficients, and then the various features are
combined. Therefore, the dimensionality of the feature vector extracted under
different block methods and different orders of Gauss Markov random field features is
different. For example, the image is divided into 9 sub-blocks, each of which extracts
the characteristics of the second-order Gauss Markov random field, which are stitched
together into a 324-dimensional vector; the image is divided into 16 sub-blocks, each of
which extracts the characteristics of the fourth-order Gauss Markov random field, which
are stitched together into a 1440-dimensional vector; and the image is divided into 25
sub-blocks, each of which extracts the characteristics of the fifth-order Gauss Markov random field,
the feature vector obtained is 2700 dimensions. The experimental results are shown in
Table 2:

Table 2. The recognition rate of different orders under 9 blocks of RGB.

Order of GMRF The recognition rate


2 0.6348
4 0.9540
5 0.9844

Table 2 shows that, with GMRF features of orders 2, 4, and 5, the face recognition rate
of the algorithm in this paper reaches a maximum of 98.44%, which demonstrates the
method's effectiveness.

6 Conclusion
In this paper, the Gauss Markov random field algorithm is used to extract face image
features. The Gauss Markov random field feature can effectively describe the local
texture information and the spatial position relationship of the image, and can effectively
describe the spatial position characteristics of the face. To obtain more detailed
information about the face, this paper also performs block processing on the image,
performs wavelet transformation on the RGB three channels of the different blocks,
extracts GMRF features from the resulting coefficients, and combines them. Finally,
the SVM classifier is used to conduct multi-angle face classification experiments. The
results confirm the validity of the method proposed in this paper.

Acknowledgment. This paper is supported by Heilongjiang Provincial Natural Science Foundation of China (LH2020F008).

References
1. Lu, Z., Jiang, X., Kot, A.: A novel LBP-based color descriptor for face recognition. In:
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
pp. 1857–1861 (2017)
2. Abbas, E.I., Safi, M. E., Rijab, K.S.: Face recognition rate using different classifier methods
based on PCA. In: 2017 International Conference on Current Research in Computer Science
and Information Technology (ICCIT), pp. 37–40 (2017)
3. Tsai, A.C., Ou, Y.Y., Wang, J.F.: Efficient and effective multi-person and multi-angle face
recognition based on deep CNN architecture. In: 2018 International Conference on Orange
Technologies (ICOT), 4 (2018)
4. Yin, X., Liu, X.: Multi-task convolutional neural network for pose-invariant face recognition.
IEEE Trans. Image Process. 27(2), 964–975 (2017)
5. Coşkun, M., Uçar, A., Yildirim, Ö., Demir, Y.: Face recognition based on convolutional neural
network. In: 2017 International Conference on Modern Electrical and Energy Systems (MEES),
pp. 376–379 (2017)
6. Wang, Y., Wang, H.: Application of rank GMRF in textural description and recognition.
Comput. Eng. Appl. 47(25), 202+204 (2011)

7. Zhang, L., Zhang, L., Zhang, L.: Application research of digital media image technology based
on wavelet transform. J. Image Video Proc. 138 (2018)
8. Yang, L., Chang, H.: Face recognition based on the combination method of multiple classifier.
Int. J. Sign. Process. Image Process. Pattern Recogn. 9(4), 151–164 (2016)
Multi-scale Object Detection Algorithm Based
on Faster R-CNN

Xiaodong Su1,2 , Yurong Zhang1,2(B) , Chaoyu Wang3 , Hongyu Liang1,2 ,


and Shizhou Li1,2
1 Harbin University of Commerce, Harbin 150028, China
suxd@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,
Harbin 150028, China
3 No.1 Oil Production Plant of Daqing Oilfield Co., Ltd, 163000 Daqing, China

Abstract. Most object detection methods have low detection accuracy for small
and dense objects and susceptibility to noise interference. In response to these
problems, this paper designs an improved multi-scale object detection algorithm
based on Faster R-CNN, aiming to improve the detection accuracy of small and
dense objects. This paper introduces depthwise separable Convolution to reduce
the number of network parameters and the amount of calculation, uses dilated Con-
volution to improve the feature extraction of small objects and dense objects by
increasing the receptive field, and uses Involution instead of 3 × 3 conventional Con-
volution to solve the problem that features of small objects extracted by conventional
Convolution are easily lost or distorted while the feature extraction of large objects is
overly redundant. The spatial attention mechanism is introduced to effi-
of large objects is too high. The spatial attention mechanism is introduced to effi-
ciently screen features that are more beneficial to object detection and improve
the model’s ability to deal with large object scale differences in multi-scale object
detection. The network model in this paper was trained and verified on the datasets
VOC2007 and VOC2012, and the mean average precision reached 86.3%.

Keywords: Multi-scale object detection · Faster R-CNN · Involution · Spatial


attention mechanism

1 Introduction
Object detection technology is an important research direction in computer vision, widely
used in military, transportation, medical, and other fields. The main task of object detec-
tion is to classify and locate objects. Early object detection technologies mainly include
Haar features [1], Histogram of Gradient [2], Local Binary Pattern [3], and classifiers,
such as Support Vector Machines (SVM) [4] and AdaBoost [5]. The traditional object
detection method uses a sliding window for region selection, which wastes time and
produces many useless features. Secondly, the feature-extraction operators used by
traditional object detection methods are manually designed and fixed. Hence, it is
impossible to perform effective feature extraction on diversified objects or on objects
with occlusion or complex lighting problems.


In summary, the traditional object detection methods are relatively poor in robustness
and real-time performance [6]. Since the birth of AlexNet [7], the use of convolutional
neural networks for object detection has been increasing, and various object detection models
have emerged one after another. The detection performance has also been continuously
improved. However, compared with large-scale objects, small and dense objects have
the characteristics of small size, low resolution, and are easily affected by noise, which
can easily cause missed and incorrect detection. Therefore, improving the detection
capability of multi-scale objects and meeting the requirements for detecting small objects
and dense objects has become the main problem in the object detection field at this stage.
The current object detection technology is mainly divided into two categories. One is
the two-stage object detection algorithm represented by Faster R-CNN [8]: before
classification and positioning, region proposals are first generated, then the generated
region proposals are fused with the feature map, and finally softmax and regression are
used for classification and positioning. Its detection accuracy is therefore high, but the
two-step strategy leads to a substantial increase in the complexity of the model and
reduces its detection speed. The other is the one-stage object detection algorithm
represented by YOLO [9] and SSD [10], which uses a single CNN model to implement
end-to-end object detection, with great speed advantages. Still, if more than one object
falls in the same grid cell, objects will be missed, so the detection accuracy suffers a
certain loss.
At present, there are many existing types of research on the use of Faster R-CNN in
multi-scale object detection. However, the current research results show that small and
dense objects’ detection accuracy is still not very high, and the detection speed is slightly
slow. For example, an improved Faster R-CNN algorithm with a split mechanism, which
uses the difference in classification and positioning performance between convolutional
structure and fully connected structure, and the Region Of Interest (ROI) features are
respectively input to the classification network using the fully connected structure and
the bounding box classification network using the convolution structure. By introducing
the convolution structure, the spatial information in the ROI feature is more fully utilized.
ROI Align is used to replace ROI pooling, which cancels the quantization in ROI pooling
and eliminates the bounding box deviation of the Faster R-CNN model itself. Although
this model improves the detection accuracy, the detection speed decreases due to the
increase in the calculation [11]. There is also an improved Faster R-CNN ship surface
multi-scale object detection algorithm. This algorithm adds a multi-scale feature region
proposal generation network based on the original algorithm feature extraction network.
And perform cluster analysis on the ship surface object data set to generate the anchor size
suitable for the object detection in this article, which improves the algorithm’s accuracy
for the detection of ship surface multi-scale objects, especially small objects. Still, the
low detection rate is also a disadvantage of this algorithm. [12] In summary, to improve
the detection ability and detection rate of small and dense objects, this paper uses Faster
R-CNN as the basic model to carry out work. This paper introduces depth-wise separable
Convolution, dilated Convolution, Involution, and spatial attention mechanisms, which
improve the feature extraction ability of the model and increase the model’s detection
rate by reducing the amount of parameters the amount of calculation.

2 Faster R-CNN Algorithm


2.1 Performance Analysis of Faster R-CNN

In 2016, Ross Girshick and others proposed Faster R-CNN based on Fast R-CNN, inte-
grating feature extraction, candidate region acquisition, regression, and classification
into a deep network. The speed and accuracy are much higher than Fast R-CNN. How-
ever, Faster R-CNN is an object detection algorithm based on a two-step strategy: it
needs to obtain the region proposals first and then classify each proposal, so the amount
of calculation is still relatively large. Moreover, due to the limitation of the number of parameters
and calculation, conventional convolution operations generally use a kernel with a size
of 3 × 3 or 5 × 5, which greatly affects the feature extraction of small objects and dense
objects.

2.2 Model Structure Analysis of Faster R-CNN

The classic Faster R-CNN model comprises the Convolution, Region Proposal Net-
work (RPN), ROI pooling, and classification regression. The convolution layer uses the
VGG16 network model. The main function is to extract the feature maps of the input
image. The convolution layer is composed of convolution, pooling and activation func-
tions. The RPN is the core part of Faster R-CNN, replacing selective search for
obtaining region proposals; using the CNN network to generate region proposals is
quicker and more efficient. First, anchors are generated; the network then uses
a discriminant function to determine whether the anchors are foreground or background
and then uses bounding box regression to make the first anchor adjustment to obtain an
accurate region proposal. Adding the ROI pooling layer mainly solves the problem of
the different sizes of the feature maps that are finally input to the fully connected layer
and obtain a fixed size through upsampling. Classification and regression are used to
determine which class the object belongs to and fine-tune the region proposal’s location
to obtain the final result of object detection.
The Faster R-CNN model can be divided into three steps for object detection. First,
the VGG16 network is used to extract the input image features, and the extracted feature
maps are input to the RPN layer. The RPN layer uses a 3 × 3 sliding window to traverse
each pixel of the feature map and generates nine anchors of different scales around each
pixel on the feature map. The areas of the anchors are set to 128², 256², and 512² pixels,
three different sizes, and each anchor area has three ratios of 1:1, 1:2, and 2:1 [8].
Then 256 region proposals are extracted according to the set IoU value and divided into
128 positive samples and 128 negative samples in a 1:1 ratio by the IoU value; the loss
function is optimized through these 256 region proposals. The 128 positive samples
generated by the RPN are projected onto the feature map to obtain the corresponding
feature matrices, and then each feature matrix is scaled into a 7 × 7 feature map through
the ROI pooling layer, and then the feature map is
flattened, and the prediction result is obtained through a series of fully connected layers.
The model structure of Faster R-CNN [8] is shown in Fig. 1.

Fig. 1. Model structure of classic Faster R-CNN
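The anchor layout just described can be reproduced with a few lines of NumPy; the helper name make_anchors and the ratio convention (ratio = h/w) are our assumptions, not code from the paper:

```python
import numpy as np

def make_anchors(areas=(128**2, 256**2, 512**2), ratios=(1.0, 0.5, 2.0)):
    """Anchor box sizes (w, h) for one feature-map position, as in the
    classic RPN: 3 areas x 3 aspect ratios = 9 anchors; ratio = h / w."""
    anchors = []
    for area in areas:
        for ratio in ratios:
            w = np.sqrt(area / ratio)   # from w * h = area and h = ratio * w
            h = ratio * w
            anchors.append((w, h))
    return np.array(anchors)

print(make_anchors().round(1))  # the 9 anchors of the classic model
# The improved model in this paper adds the two small areas 32^2 and 64^2:
print(len(make_anchors(areas=(32**2, 64**2, 128**2, 256**2, 512**2))))  # 15
```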

3 Establishment of Multi-scale Object Detection Network Structure

3.1 The Overall Structure of the Model


The backbone network of the model in this paper is divided into two parts. The first
part uses ResNet50 [13] to solve the problems of gradient disappearance or gradient
explosion and network degradation caused by the increase of network depth; the second
part is the network designed in this paper, DSI-FPN. The output of ResNet50 is
the input of DSI-FPN. The combination of the two networks forms the backbone of
the multi-scale object detection network. In DSI-FPN, depthwise separable Convolution
[14] is introduced to reduce the number of network parameters and calculations. And
the use of dilated Convolution [15], by increasing the receptive field to improve the
prediction ability of small objects and dense objects while retaining spatial features and
enhancing contextual connections. The spatial attention mechanism [16] is introduced
to efficiently screen out features that are more beneficial to object recognition.
The feature map extracted by the backbone network is input to the region generation
network RPN. In this paper, the original nine anchors of the RPN are expanded to 15,
and two small-scale anchor areas of 32² and 64² are added; each area still has the same
three ratios, which is conducive to the feature extraction of small objects and dense
objects, thereby improving the recognition rate. Involution [17] is used instead of the
original FPN's 3 × 3 Convolution, and feature maps of different sizes are output. The
obtained feature maps are input into the ROI pooling layer to fix the size, and finally use
regression to adjust the region proposal. The model structure is shown in Fig. 2.
Multi-scale Object Detection Algorithm Based on Faster R-CNN 383

Fig. 2. Improved model structure of Faster R-CNN

3.2 Involution Operator


Generate the Kernel. The kernel of Involution is generated based on a single pixel.
Take the feature vector of a pixel in the input feature map as an example to generate the
corresponding kernel. The mathematical expression is shown in formula (1)-(4):

σ(·) = ReLU(BN(·))   (1)

y_reduce = σ(W_0 X_{i,j}),   y_reduce ∈ R^{1×1×C/r}   (2)

y_span = W_1 y_reduce,   y_span ∈ R^{1×1×K×K×G}   (3)

H_{i,j} = φ(X_{i,j}) = y_span   (4)

Among them, W 0 is a linear transformation matrix, realized by a 1 × 1 convolution


operation, reducing the channel dimension to C/r (C is the number of channels of the
input feature map, r is the compression ratio), and the purpose is to reduce the amount of
calculation by reducing the number of parameters. σ indicates that a batch normalization
(BN) operation is performed on W 0 Xi,j first, and then a ReLU activation function is used,
and finally yreduce is obtained, yreduce is the output after dimensionality reduction. W 1 is
also a linear transformation matrix, and the number of channels is expanded through a
1 × 1 convolution operation, and then get yspan , which is the output after expanding the
number of channels. φ represents the above series of operations performed on X i,j , and
Hi,j is the generated kernel.
It should be noted that in the specific implementation, not all input channels share
a kernel, but the input channels are divided into G groups. Channels in the same group
share a kernel, and different groups use different kernels.
Operation Process of Involution. Since each feature point of the feature map corresponds
to K × K elements of the kernel and cannot be directly calculated, so we unfold the
feature map and expand each feature point to its K × K neighborhood first, and then
perform a Multiply-Add operation on the feature map and the kernel to obtain the output
feature, as shown in formulas (5) and (6). A schematic diagram of the operation flow is
shown in Fig. 3 [17].
X_unfold = unfold(X_{i,j}),   X_unfold ∈ R^{1×1×K×K×G×C/G}   (5)

Y = sum(mul(H_{i,j}, X_unfold)),   Y ∈ R^{1×1×C}   (6)

Among them, X_unfold represents the feature patches obtained after unfolding the
feature map, mul represents the element-wise product of the generated kernel and the
unfolded feature map, and sum represents the summation, within each channel, of the
products over the K × K window. C represents the number of channels of the input
feature map, and G represents the number of groups into which the input feature map
is divided.

Fig. 3. Operation process of Involution
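The following PyTorch sketch implements formulas (1)-(6) for stride 1; the module structure, the hyperparameter defaults, and the use of nn.Unfold are our illustrative choices rather than the authors' implementation (kernel_size is assumed odd so that padding preserves the spatial size):

```python
import torch
import torch.nn as nn

class Involution2d(nn.Module):
    """Minimal involution (stride 1), following Eqs. (1)-(6): a K x K kernel
    is generated per pixel from the feature map itself and shared by the
    C/G channels within each of the G groups."""
    def __init__(self, channels, kernel_size=7, groups=16, reduction=4):
        super().__init__()
        self.k, self.g = kernel_size, groups
        self.reduce = nn.Sequential(                  # W0 + BN + ReLU, Eqs. (1)-(2)
            nn.Conv2d(channels, channels // reduction, 1),
            nn.BatchNorm2d(channels // reduction),
            nn.ReLU(inplace=True))
        self.span = nn.Conv2d(channels // reduction,  # W1, Eq. (3)
                              kernel_size * kernel_size * groups, 1)
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, h, w = x.shape
        kernel = self.span(self.reduce(x))            # Eq. (4): B, K*K*G, H, W
        kernel = kernel.view(b, self.g, 1, self.k * self.k, h, w)
        patches = self.unfold(x)                      # Eq. (5): B, C*K*K, H*W
        patches = patches.view(b, self.g, c // self.g, self.k * self.k, h, w)
        out = (kernel * patches).sum(dim=3)           # Eq. (6): multiply-add
        return out.view(b, c, h, w)

y = Involution2d(64)(torch.randn(2, 64, 32, 32))
print(y.shape)  # torch.Size([2, 64, 32, 32])
```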

Comparison with Conventional Convolution


 
(1) The parameter amount of Involution is (C² + C × G × K²)/r, and the amount of
calculation is H × W × K² × C, which is linear in the number of channels C.
However, the parameter amount of Convolution is K² × C², the amount of
calculation is H × W × K² × C², and it has a square relationship with
the number of channels C. In contrast, the number of parameters and calculations
of Involution are significantly less than that of Convolution.
(2) Because the channels of Involution are shared, even if a larger kernel is used, the
amount of calculation will not increase significantly. Using a large-scale kernel
can increase the receptive field, which is more conducive to summarizing context
information. Convolution generally only uses a 3 × 3 or 5 × 5 kernel for operation
due to the limitation of calculation, which largely limits the ability of information
interaction.
(3) Although Involution does not share the kernel parameters in the spatial position, it
uses Convolution when generating the kernel, potentially introducing knowledge
sharing and migration.
(4) Involution’s kernel is dynamically generated by Convolution based on the feature
map, so the kernel corresponding to each feature map in the same batch is different,
which is more conducive to feature extraction.

3.3 Construction of DSI-FPN Network

On the basis of Feature Pyramid Networks (FPN) [18], depthwise separable convolu-
tion, dilated convolution, the spatial attention mechanism (SAM) and the Involution
operator are introduced. The ResNet50 feature extraction network obtains c2, c3, c4 and c5 feature
maps with different scales from bottom to top, as shown in Fig. 4. After inputting these
feature maps into the FPN network, the DDS network module performs a series of con-
volution and pooling operations on the four feature maps, and performs 2x upsampling
on the obtained feature maps through the nearest neighbor interpolation algorithm. The
obtained feature map is input downward and merged with the feature map processed by
the DDS network in the upper layer, and the fused feature map is then output through a
3 × 3 involution operation. The DSI-FPN network structure is shown in Fig. 5.

Fig. 4. ResNet50 network structure

Fig. 5. DSI-FPN network structure

3.4 Construction of DDS Network

Take the feature map c2 output by ResNet as an example to introduce the DDS network
module. Two depthwise separable Convolutions are performed on the c2 feature map
simultaneously: one uses a dilated Convolution with kernel_size = 1, dilation rate = 1
and stride = 1, the other a dilated Convolution with kernel_size = 3, dilation rate = 3
and stride = 1. Channel fusion is performed on the two convolved feature maps,
followed by a 1 × 1 convolution to change the number of channels to 256. Finally, the
two feature maps with 256 channels are subjected to pixel-wise multiplication and
output. The DDS network structure is shown in Fig. 6.

Fig. 6. DDS network structure
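A hedged PyTorch reading of this description is sketched below. Since the text leaves ambiguous exactly which two 256-channel maps are multiplied at the end, the pairing chosen here (the fused map with one branch output) is our assumption, as are all names:

```python
import torch
import torch.nn as nn

def dw_separable(c_in, c_out, k, dilation):
    """Depthwise separable convolution: a depthwise (grouped) dilated
    convolution followed by a 1 x 1 pointwise convolution."""
    pad = dilation * (k - 1) // 2  # keeps the spatial size unchanged
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, k, padding=pad, dilation=dilation, groups=c_in),
        nn.Conv2d(c_in, c_out, 1))

class DDS(nn.Module):
    """Sketch of the DDS block: two parallel depthwise separable dilated
    convolutions (1x1/d=1 and 3x3/d=3, stride 1), channel concatenation,
    a 1x1 convolution down to 256 channels, then pixel multiplication."""
    def __init__(self, c_in, c_out=256):
        super().__init__()
        self.branch1 = dw_separable(c_in, c_out, k=1, dilation=1)
        self.branch2 = dw_separable(c_in, c_out, k=3, dilation=3)
        self.fuse = nn.Conv2d(2 * c_out, c_out, 1)   # channel fusion to 256

    def forward(self, x):
        b1, b2 = self.branch1(x), self.branch2(x)
        fused = self.fuse(torch.cat([b1, b2], dim=1))
        return fused * b1                            # pixel multiplication

print(DDS(256)(torch.randn(1, 256, 64, 64)).shape)   # (1, 256, 64, 64)
```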

4 The Experiment of Multi-scale Object Detection


4.1 Dataset, Evaluation Index and Training

The dataset used in this article is a fusion of the VOC2007 and VOC2012 datasets
officially released by PASCAL, with data enhancement through methods such as
horizontal flipping. The dataset used for object detection has one background category
and 20 object categories. It contains 14,895 training images and 5,823 verification
images, and both the training set and the verification set provide the corresponding
label files.
In this experiment, Mean Average Precision (mAP) is used as a measurement index,
and mAP is a commonly used index to measure the performance in object detection.
The calculation method is formula (7) and (8):
AP = (1/11) Σ_{r∈{0, 0.1, ..., 1}} P_interp(r)   (7)

mAP = sum(AP) / N   (8)

Among them, P_interp(r) is the interpolated precision, AP is the average precision of the
11-point interpolated precision for one category, and the AP values of the 20 categories
are summed and averaged to obtain the mAP value of the experiment.
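Formula (7) can be implemented in a few lines; the function name and the toy precision/recall values below are hypothetical:

```python
import numpy as np

def ap_11_point(recalls, precisions):
    """11-point interpolated AP (Eq. 7): average, over r in {0, 0.1, ..., 1},
    of the maximum precision among all points with recall >= r."""
    ap = 0.0
    for r in np.linspace(0.0, 1.0, 11):
        mask = recalls >= r
        p_interp = precisions[mask].max() if mask.any() else 0.0
        ap += p_interp / 11.0
    return ap

# Toy precision/recall points for one class (hypothetical values).
rec = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
prec = np.array([1.0, 0.9, 0.8, 0.6, 0.4])
print(ap_11_point(rec, prec))
# mAP (Eq. 8) is then the mean of the per-class AP values.
```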
This experiment uses the method of transfer learning to freeze the weights of the
ResNet network, a total of 40 rounds of iterative training, the batch-size is set to 2, the
lr is set to 0.005, the gamma of lr is set to 0.33, the attenuation strategy is set to lr =
lr × gamma, and a new lr is generated every five rounds of attenuation. The model
optimization method is the stochastic gradient descent method.

4.2 Analysis and Comparison of Experimental Results

This experiment was performed on the data sets VOC2007 and VOC2012. The exper-
imental results show that the mAP value reached 86.3%, which is 15.9% higher than
the classic Faster R-CNN algorithm and 8.7% higher than the R-FCN algorithm, which
has the highest accuracy among the classic algorithms. It can be seen from Table 1 that
the model has the highest detection accuracy on the three categories of bus, person and
car, all reaching 93% or above. It can be seen from the comparison with the AP values of the Faster
R-CNN + FPN model in different categories that the AP values of all categories of the
model are higher than the Faster R-CNN + FPN model. Comparing with YOLO v3, it
can be seen that, except for the two categories of train and horse, the AP values of the
experimental model in the other categories are all higher than YOLO v3 [19]. Compared
with YOLO v4 [20], it can be seen that there are 11 categories of AP values in this
experimental model that are higher than YOLO v4. It can be seen from Table 2 that
because the model in this paper introduces the dilated Convolution, involution operator
and spatial attention mechanism on the basis of Faster R-CNN, it improves the model’s
ability to extract image features, so the mAP value of the model in this experiment is
higher than other models.

Fig. 7. Change of parameters, where (a) represents the change of mAP during the experiment,
(b) represents the change of loss and lr during the experiment

4.3 Experimental Environment


In this paper, the framework used for model building is Pytorch, and the program-
ming language is Python3.7. The processor of the experimental equipment is Intel Core
i7-10700KF @ 3.08GHz, the memory is 16GB, the operating system is Windows10
Professional 64-bit, and the graphics card is Nvidia GeForce RTX 2080 Ti.

Fig. 8. P-R curve of the model

Fig. 9. Effect of detection

Table 1. Comparison of AP values in different categories

Class Faster R-CNN + FPN YOLO v3 YOLO v4 Ours


Bus 0.90 0.89 0.91 0.94
Person 0.88 0.91 0.85 0.94
Car 0.86 0.82 0.82 0.93
Aeroplane 0.89 0.89 0.94 0.93
Motorbike 0.87 0.89 0.83 0.92
Bicycle 0.83 0.84 0.68 0.90
Sheep 0.80 0.73 0.94 0.90
Train 0.84 0.91 0.97 0.89
Cat 0.84 0.85 0.98 0.89
Horse 0.87 0.92 0.81 0.88
Tvmonitor 0.80 0.79 0.85 0.87
Cow 0.82 0.73 0.91 0.87
Dog 0.84 0.85 0.96 0.87
Bird 0.82 0.85 0.94 0.86
Bottle 0.69 0.71 0.80 0.83
Chair 0.58 0.70 0.64 0.79
Boat 0.58 0.60 0.81 0.79
Pottedplant 0.54 0.60 0.71 0.78
Sofa 0.66 0.75 0.77 0.76
Diningtable 0.63 0.68 0.62 0.71

Table 2. Comparison with other models

Model mAP/%
R-CNN 53.3
Fast R-CNN 68.4
Faster R-CNN 70.4
YOLO v2 73.4
YOLO v3 74.6
SSD 74.9
R-FCN [21] 77.6
Faster R-CNN + FPN 81.8
Ours 86.3

5 Conclusion
This experimental model is based on the classic Faster R-CNN model and mainly
improves the FPN network. It combines depth-wise separable and dilated Convolu-
tion to replace the 1 × 1 Convolution in FPN, and uses Involution to replace the 3 × 3
Convolution in FPN. At the same time, the spatial attention mechanism is introduced,
which reduces the number of parameters and calculations and improves the feature
extraction ability for small and dense objects. In addition, the anchors generated by the
RPN model at each pixel are expanded from nine to 15, adding two small scales of 32²
and 64², each still with the three ratios of 1:1, 1:2 and 2:1, which helps to improve the

detection ability of small objects and dense objects. The test results of this model on the
dataset VOC2012 show that the detection accuracy of small objects and dense objects
have been significantly improved. Although the model improves the detection speed by
reducing the number of parameters and the amount of calculation, it still cannot real-
ize the function of real-time detection. Therefore, how to further improve the detection
speed of the model is a further research topic.

Acknowledgment. This work is supported by Heilongjiang Provincial Natural Science Foundation of China (No. LH2021F034).

References
1. Panning, A., Al-Hamadi, A.K., Niese, R., Michaelis, B.: Facial expression recognition based
on Haar-like feature detection. Pattern Recogn. Image Anal. 18(3) (2018)
2. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE
Computer Society Conference on Computer Vision & Pattern Recognition. IEEE (2005)
3. Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant
texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7),
971–987 (2002)
4. Burges, CJC., Fayyad, U.: A Tutorial on Support Vector Machines for Pattern Recognition.
Data Mining and Knowledge Discovery (1998)
5. Zhu, J., Arbor, A., Hastie, T.: Multi-class AdaBoost. Stat. Interf. 2(3), 349–360 (2006)
6. Ma, Y., Zhao, Z.H., Yin, Z.Y., Fan, C., Chai, A.Y., Li, C.M.: Segmented deconvolution
improves the object detection algorithm of SSD [J/OL]. Minicomputer System, pp. 1–7 (2021)
7. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional
neural networks. In: Advances in Neural Information Processing Systems (2012)
8. Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with
region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017)
9. Redmon, J., Divvala, S., Girshick, R., et al.: You Only Look Once: Unified, Real-Time Object
Detection. IEEE (2016)
10. Liu, W., Anguelov, D., Erhan, D., et al.: SSD: Single Shot MultiBox Detector. Springer, Cham
(2016)
11. Wang, X.B., Zhu, X.Y., Yao, M.H.: Object detection method based on improved Faster RCNN.
High Technol. Lett. 31(05), 489–499 (2021)
12. Fan, J.L., Tian, S.B., Huang, K., Zhu, X.D.: Multi-scale target detection algorithm for air-
craft carrier surface based on Faster R-CNN [J/OL]. System Engineering and Electronic
Technology, pp. 1–10 (2021)
13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
14. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2017)
15. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In: ICLR (2016)
16. Zhu, X., Cheng, D, Zhang, Z., et al.: An empirical study of spatial attention mechanisms in
deep networks. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
IEEE (2020)
17. Li, D., Hu, J., Wang, C., et al.: Involution: Inverting the Inherence of Convolution for Visual
Recognition (2021)

18. Lin, T.Y., Dollar, P., Girshick, R., et al.: Feature pyramid networks for object detection.
In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE
Computer Society (2017)
19. Redmon, J., Farhadi, A.: YOLOv3: An Incremental Improvement. arXiv e-prints (2018)
20. Bochkovskiy, A., Wang, C.Y., Liao, H.: YOLOv4: optimal speed and accuracy of object
detection (2020)
21. Dai, J., Li, Y., He, K., et al.: R-FCN: object detection via region-based fully convolutional
networks. In: Advances in Neural Information Processing Systems. Curran Associates Inc.
(2016)
Persistent Homology Apply in Digital Images

Sun Huadong1,2(B) , Zhang Yingjing1 , and Zhao Pengfei1


1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. We study digital images using persistent homology and the computation
of the homology groups of filtered complexes. 2D digital images are composed
of pixels and 3D images of voxels, so both carry a natural cubical construction;
cubical complexes are therefore more suitable than simplicial complexes in the
field of digital images. At the same time, the methods for computing with
simplicial complexes also apply to cubical complexes and their persistent
homology. We construct a cubical complex of this space under different
parameters, then calculate the homology of the cubical complex to obtain the
persistence diagram, and use it to get the corresponding geometric structure
information and the image's topological characteristics. Finally, we discuss the
application of persistent homology to image classification with the support
vector machine and random forest methods.

Keywords: Persistent homology · Cubical complex · Persistence diagram ·


Image classification

1 Introduction
Image similarity analysis is the basic research goal of computer vision and a key step
in image processing research. Recently, many scholars have made a lot of progress in
image retrieval, image classification, target recognition, and other research works. At
present, the more traditional image retrieval, classification, and recognition generally
have two research directions: one is the research method based on image features, and
the other is the research method based on gray correlation.
In the past, research methods extracting image features based on image content
mainly used the color feature, texture feature and shape feature of the image and,
in recent years, local features. These features have certain limitations when the
image undergoes rotation transformation, non-linear transformation, or projection
transformation. We study the method of per-
sistent homology to extract the topological features of the image and the geometric
structure features of the image shape and combine the extracted features to perform
image classification in this paper.
There are many methods for topological data analysis (TDA) [1]. Among them, we
use persistent homology (PH) [2] to study the qualitative characteristics of data that
persist across multiple scales. Persistent homology is robust to the perturbation of input

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


A. E. Hassanien et al. (Eds.): BIIT 2021, LNDECT 107, pp. 392–400, 2022.
https://doi.org/10.1007/978-3-030-92632-8_37
Persistent Homology Apply in Digital Images 393

data and offers a concise expression of the qualitative features of the input data.
Persistent homology is very useful for applications. Based on algebraic topology, it
provides a comprehensible theoretical framework for learning the qualitative
characteristics of data with complex structures, it can be calculated by linear algebra,
and it is robust to small perturbations of the input data. This concept was first proposed
by Frosini and Landi in 1999 when they were studying size theory [3]. Later, they applied persistent
homology to analyze various data and achieved good results. Now it has been widely
used in various fields. Carlsson G studies the pixel blocks of 3 × 3 images by selecting
the local structure of natural images [4]. Moo and Kim offer a newfangled framework
that uses computational algebraic topology techniques to characterize signals in images
[5]. Paul and his team used the robustness of persistent homology and employed relevant
tools to study 3D images of plant roots [6].

2 Relate Work
2.1 Homology

Suppose the given data is located in a metric space, for example, a subset of Euclidean
space with the inherited distance function. Under many circumstances, we do not want
to know the accurate geometry of the space; instead, we try to detect the existence of
voids, holes, connected components, and other basic characteristics. Algebraic topology
captures such essential features by associating vector spaces or more complex algebraic
structures to the space and computing them. We want to understand homology, which
associates one vector space Hi(X) to a space X for each number i ∈ {0, 1, 2, …}. The
dimension of H0(X) counts the number of path components in X, the dimension of
H1(X) counts the number of holes, and the dimension of H2(X) counts the number of voids.
These algebraic structures do not change when the underlying space is deformed by
stretching or bending; in professional terms, they are homotopy invariant and robust.
Calculating the homology of arbitrary topological spaces can be difficult. Therefore, we
use a combinatorial structure called a 'simplicial complex' to approximate our spaces,
and then use related algorithms to calculate its homology.

2.2 Persistent Homology

Consider experimental data given in the form of a set S, whose points or vectors denote
the measured values, with some distance function on them given by correlation or
dissimilarity measures. Here, we do not consider whether the set S is a sample of an
underlying topological space. To recover the attributes of that underlying space, we
need a method that is robust to small disturbances in the dataset S. If S is part of
Euclidean space, we can consider 'thickening' S by taking the union of balls of a certain
fixed radius ε around its points, which yields the Čech complex. We attempt to
construct a Čech complex [7] for a selected value ε and compute its simplicial homology
to obtain the qualitative characteristics of the finite metric space S. There is a problem
with this method: the parameter ε that we choose to construct the Čech complex is not

explicitly chosen a priori. Using the knowledge of persistent homology, we may need to
consider several possible values of the parameter ε to be able to extract the information
we want from the data. As the value of ε increases, simplices are gradually added to the
complex. Persistent homology studies how the homology of the complex changes with
the parameter and detects which features 'persist' as the parameter value changes.

2.3 Simplicial Complex and Its Homology

A simplicial complex [8] is a set K of nonempty subsets of a collection K0 such that
{v} ∈ K for all v ∈ K0, and τ ⊂ σ with σ ∈ K implies τ ∈ K. The elements of K are
called simplices, and the elements of K0 are called vertices of K. If a simplex has
cardinality p + 1, we call it a p-simplex or say it has dimension p.
The set of p-simplices is denoted Kp. The k-skeleton of K is the union of the
collections Kp for all p ∈ {0, 1, …, k}. We say τ is a face of σ if τ, σ are simplices and
τ ⊂ σ. If the dimensions of τ and σ differ by k′, we say τ is a face of σ of codimension
k′. The dimension of K is the maximum of the dimensions of its simplices. A map of
simplicial complexes, f : K → L, is induced by a map f : K0 → L0 such that f(σ) ∈ L
for every σ ∈ K.
We now define homology for simplicial complexes. Let F2 denote the field with two
elements. Given a simplicial complex K, let Cp(K) denote the F2-vector space with basis
given by the p-simplices of K. For p ∈ {1, 2, …}, we define the following map:

d_p : C_p(K) → C_{p−1}(K),   σ ↦ Σ_{τ⊂σ, τ∈K_{p−1}} τ

Since the boundary of a boundary is empty, the map dp has the following
property: for p ∈ {0, 1, 2, …}, dp ◦ dp+1 = 0.
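Because homology over F2 reduces to linear algebra, the Betti numbers β_p = dim ker d_p − rank d_{p+1} can be computed from the ranks of the boundary matrices. A minimal sketch, using a hollow triangle as the example complex (all names are ours):

```python
import numpy as np

def rank_f2(mat):
    """Rank of a 0/1 matrix over the two-element field F2 (XOR elimination)."""
    m = np.array(mat, dtype=np.uint8) % 2
    rank = 0
    for col in range(m.shape[1]):
        pivot = next((r for r in range(rank, m.shape[0]) if m[r, col]), None)
        if pivot is None:
            continue
        m[[rank, pivot]] = m[[pivot, rank]]  # move the pivot row up
        for r in range(m.shape[0]):
            if r != rank and m[r, col]:
                m[r] ^= m[rank]              # eliminate with XOR
        rank += 1
    return rank

# Hollow triangle: vertices a, b, c and edges ab, bc, ca (no 2-simplices).
# Columns of d1 are the boundaries of the edges, mod 2.
d1 = np.array([[1, 0, 1],    # a appears in ab and ca
               [1, 1, 0],    # b appears in ab and bc
               [0, 1, 1]])   # c appears in bc and ca
beta0 = 3 - rank_f2(d1)       # dim ker d0 - rank d1 = 3 - 2 = 1
beta1 = (3 - rank_f2(d1)) - 0 # dim ker d1 - rank d2, with rank d2 = 0
print(beta0, beta1)           # 1 connected component, 1 hole
```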

3 Image Representation Based on Persistent Homology

3.1 Barcodes

In the barcode [9], the starting point of a line indicates the time when a feature appears,
and the endpoint indicates the time when it disappears. In each dimension, the lengths
of the lines vary: the longer a line, the more stable the topological feature; the shorter
the line, the shorter-lived the feature, which may also represent random noise.
Furthermore, there is a straight
line with arrows in the figure below. We name it “infinite interval”. It means that this
topological feature is generated from a certain point in time and will not die out in the
end (Fig. 1).
Persistent Homology Apply in Digital Images 395

Fig. 1. H0, H1, H2 respectively denote dimensions 0, 1, 2. ε represents time.

3.2 Persistence Diagram


The persistence diagram [10] is the union of a finite multiset of points in R² with the
multiset of points on the diagonal Δ = {(x, y) ∈ R² | x = y}. The points on the diagonal
represent unstable features and are taken with infinite multiplicity.
Including the diagonal ensures that any two persistence diagrams have the same
cardinality, so that we can compare persistence diagrams through bijections between
their elements
(Fig. 2).

Fig. 2. In the persistence diagram of beer mug, the red part represents 0 dimension, and the blue
part represents 1 dimension.

3.3 Betti Curve


 
The Betti curve [11] B_n : I → R of a barcode D = {(b_j, d_j)}_{j∈I} is the map that
returns, for each step i ∈ I, the number of bars (b_j, d_j) that contain i (Fig. 3):

i ↦ #{(b_j, d_j) : i ∈ (b_j, d_j)}.

Fig. 3. An example of Betti curves.

4 Experiment
4.1 Data

Given an image made up of N voxels or pixels, we treat the digital image as a point in
a C × N dimensional space, where a vector of length C stored at each coordinate stands
for the color of the pixel or voxel. We define a suitable distance function on such a
space, which allows a group of images with N pixels or voxels to be treated as a finite
metric space. Therefore, persistent homology, as a way to study finite metric spaces,
can also be applied to the study of image data sets.
The digital image has a cubical structure. Simply put, a cubical complex is a space made
up of vertices, edges, squares, cubes, and so on. For a two-dimensional digital image,
building a cubical complex involves specifying a vertex for each pixel, then using edges
to connect the vertices corresponding to adjacent pixels, and filling the resulting squares
at the end. A similar technique handles three-dimensional images.
Label each vertex with an integer relating to the pixel’s gray value, and then use the
maximum value of adjacent vertices to mark the corresponding edge.
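A minimal sketch of this grayscale-to-filtered-complex construction, assuming the GUDHI library and its cubical-complex interface are available (the toy image and all names are ours):

```python
import numpy as np
import gudhi  # assumes the GUDHI library is installed

# Grayscale image as a filtered cubical complex: each pixel is a top-
# dimensional cell whose filtration value is its gray level, so cells
# enter the complex as the gray threshold sweeps from dark to bright.
img = np.array([[0.2, 0.8, 0.2],
                [0.8, 0.1, 0.8],   # bright ring around a dark center
                [0.2, 0.8, 0.2]])

cc = gudhi.CubicalComplex(dimensions=list(img.shape),
                          top_dimensional_cells=img.flatten())
diagram = cc.persistence()  # list of (dimension, (birth, death)) pairs
for dim, (birth, death) in diagram:
    print(f'H{dim}: born {birth:.2f}, dies {death:.2f}')
```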

4.2 Cubical Complex

The cubical complex [12] is a structured complex. It is helpful in computational
mathematics and in image processing and analysis.

For n ∈ Z, [n, n+1] is called a non-degenerate interval and [n, n] is called a degenerate
interval. The boundary of an elementary interval is the chain ∂[n, n+1] = [n+1, n+1] −
[n, n] in the non-degenerate case and ∂[n, n] = 0 in the degenerate case. An elementary
cube C is a product of elementary intervals, C = I1 × ... × In. The embedding dimension
of a cube is n, the number of elementary intervals (degenerate or not) in the product.
The dimension of a cube C = I1 × ... × In is the number of non-degenerate elementary
intervals in the product. The boundary of a cube C = I1 × ... × In is the chain obtained
in the following way:

∂C = (∂I1 × ... × In ) + (I1 × ∂I2 × ...In ) + . . . + (I1 × I2 × ... × ∂In )

The boundaries of each cube are in the set. If a cube C in the cubical complex K is not
within the boundary of any other cube in K, then C is maximal. The support of the cube
C is the set of points occupied by C in Rn.
When the cubes are equipped with filtration values, we obtain a filtered cubical
complex. Since the range of the filtration may be as small as a two-element set, every
cubical complex can be regarded as a filtered cubical complex (Figs. 4 and 5).

Fig. 4. A cubical complex for the map f : [−2, 2]² → R, where f(x, y) is the distance from (x, y)
to the unit circle x² + y² = 1.

Fig. 5. The filtration of the cubical complex obtained from a grayscale image, and the related
barcode. Since each new pixel is always connected to the previous pixels, there is only one
connected component, which emerges at the first step and does not disappear till the end.

4.3 Classification Experiment

The MNIST data set is composed of 28 × 28 handwritten digital images. It has a total
of 50,000 images. In this paper, we use topological knowledge to study the data set and
classify the digits in it.
According to the number of rings in a digit, we can divide the digits into three
classes: {1, 2, 3, 5, 7}, {0, 4, 6, 9}, {8}. We use the position and scale information of
topological features that persistent homology provides to classify the MNIST data
set.
In the following experiments, we use two methods to classify the digits according to
their topological features (Tables 1 and 2).

Table 1. MNIST classification using SVM

      0      1      2      3      4      5      6      7      8      9
0          0.966  0.795  0.939  0.924  0.937  0.616  0.961  0.855  0.629
1                 0.749  0.560  0.580  0.593  0.940  0.525  0.991  0.939
2                        0.698  0.677  0.681  0.711  0.726  0.871  0.747
3                               0.535  0.537  0.905  0.533  0.970  0.896
4                                      0.537  0.892  0.548  0.950  0.882
5                                             0.900  0.562  0.966  0.892
6                                                    0.933  0.847  0.565
7                                                           0.988  0.928
8                                                                  0.856
9

Table 2. MNIST classification using random forest

      0      1      2      3      4      5      6      7      8      9
0          0.975  0.795  0.941  0.922  0.930  0.629  0.968  0.869  0.612
1                 0.750  0.584  0.583  0.602  0.950  0.519  0.992  0.950
2                        0.711  0.692  0.698  0.691  0.745  0.856  0.750
3                               0.522  0.516  0.904  0.563  0.965  0.902
4                                      0.515  0.887  0.573  0.941  0.891
5                                             0.899  0.586  0.964  0.894
6                                                    0.939  0.846  0.578
7                                                           0.987  0.945
8                                                                  0.872
9

From the above experiments, we can see that the classification effect of the pair (1, 7)
is the worst, with an accuracy of 51.9%, while the pair (1, 8) is classified best, with an
accuracy of 99.2% (Fig. 6).

Fig. 6. Grayscale the numbers 1, 7, 8 in the MNIST data set, and then we get their persistence
diagrams.

From the persistence diagrams, we can see that the digits 1 and 7 have only
0-dimensional topological features of similar scale, so this pair has poor classification
results. The digit 8 has one-dimensional features, so the pair (1, 8) has the best result.
Similarly, pairs involving the digit 8 achieve the best classification accuracy among the
other groups, with the exception of the pair (0, 8). In this paper, the classification
method of random forest performs better than the method of SVM.
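One hedged way to connect the persistence diagrams to the SVM and random-forest classifiers is to vectorize each diagram as a fixed-length Betti curve, as sketched below; the diagrams, labels, and train/test indices in the commented usage are hypothetical placeholders, not the experiment's actual code:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

def betti_curve(diagram, dim, grid):
    """Fixed-length Betti curve: for each threshold t on the grid, count
    the bars (birth, death) of the given dimension that contain t."""
    bars = [(b, d) for k, (b, d) in diagram if k == dim]
    return np.array([sum(b <= t < d for b, d in bars) for t in grid])

def features(diagram, grid=np.linspace(0.0, 1.0, 32)):
    # Concatenate the 0- and 1-dimensional Betti curves into one vector.
    return np.concatenate([betti_curve(diagram, 0, grid),
                           betti_curve(diagram, 1, grid)])

# diagrams: list of persistence diagrams (e.g. from the cubical-complex
# sketch above); labels: the corresponding digit classes (hypothetical).
# X = np.stack([features(d) for d in diagrams])
# for clf in (SVC(kernel='rbf'), RandomForestClassifier(n_estimators=200)):
#     clf.fit(X[train_idx], labels[train_idx])
#     print(clf.score(X[test_idx], labels[test_idx]))
```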

5 Conclusions

In this paper, we introduced the concept of persistent homology, related properties, and
the tools that use topology to describe: barcode, persistent diagram, betti curve. We
use the persistent homology to construct a series of cubical complexes to approximate
the image space, then obtain the geometric structure and the homology information of
the image. SVM and random forest are used to classify the images. The experiments
indicate that the combination of random forest and TDA technology can improve the
image classification effect in the MNIST data set.
There is much room for improvement in this work. We can use persistent homology
to classify other images, such as commodity images, not just the MNIST data set. TDA
is well known for its robustness to noise, so different types of noise can be studied.
Since the persistence diagram is affected by the image scale, we can also study image
classification at different scales.

Acknowledgments. This paper is supported by the Heilongjiang Provincial Natural Science Foundation of China (LH2020F008).

References
1. Wang, B., Wei, G.-W.: Object-oriented persistent homology. J. Comput. Phys. 305, 276–299
(2016)
2. Zomorodian, A.: Topology for Computing. Cambridge University Press, Cambridge (2018)
3. Frosini, P., Landi, C.: Size theory as a topological tool for computer vision. Pattern Recogn.
Image Anal. 9(4), 596–603 (1999)
4. Carlsson, G., Ishkhanov, T., de Silva, V., Zomorodian, A.: On the local behavior of spaces of
natural images. Int. J. Comput. Vis. 76(1), 1–12 (2008)
5. Chung, M.K., Bubenik, P., Kim, P.T.: Persistence diagrams of cortical surface data. In: Prince,
J.L., Pham, D.L., Myers, K.J. (eds.) Information Processing in Medical Imaging, pp. 386–397.
Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-02498-6_32
6. Bendich, P., Edelsbrunner, H., Kerber, M.: Computing robustness and persistence for images.
IEEE Trans. Vis. Comput. Graph. 16(6), 1251–1260 (2010)
7. Eda, K., Kawamura, K.: The surjectivity of the canonical homomorphism from singular
homology to Cech homology. Proc. Am. Math. Soc. 128(5), 1487–1495 (1999)
8. Duval, A.M., Klivans, C.J., Martin, J.L.: Critical groups of simplicial complexes. Ann. Comb.
17(1), 53–70 (2013)
9. Watanabe, S., Yamana, H.: Topological measurement of deep neural networks using persistent
homology. Ann. Math. Artif. Intell. (2021). https://doi.org/10.1007/s10472-021-09761-3
10. Chung, M.K., Ombao, H.: Lattice paths for persistent diagrams. In: Reyes, M., Hen-
riques Abreu, P., Cardoso, J., Hajij, M., Zamzmi, G., Rahul, P., Thakur, L. (eds.)
IMIMIC/TDA4MedicalData - 2021. LNCS, vol. 12929, pp. 77–86. Springer, Cham (2021).
https://doi.org/10.1007/978-3-030-87444-5_8
11. Curto, C., Paik, J., Rivin, I.: Betti curves of rank one symmetric matrices. In: Nielsen, F.,
Barbaresco, F. (eds.) GSI 2021. LNCS, vol. 12829, pp. 645–655. Springer, Cham (2021).
https://doi.org/10.1007/978-3-030-80209-7_69
12. Fajstrup, L.: Dipaths and dihomotopies in a cubical complex. Adv. Appl. Math. 35(2), 188–206
(2005)
Research on the Style Classification Method
of Clothing Commodity Images

Yanrong Zhang1,2(B) and Rong Song1,2


1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. E-commerce shows a good development trend in commercial digitization, and
purchasing clothing products on e-commerce platforms has become the main channel chosen
by consumers. With the development of the times, the public pays more and more attention to
the individual pursuit of clothing styles. Research on the classification of clothing types is
relatively mature, but the classification of clothing styles still deserves further study. Based on
product style classification, this paper divides 252 suit pictures into two categories: business
style and sports style. First, the random walk algorithm segments the foreground clothing in
the product image from the cluttered background. Then the HOG algorithm extracts features
from the segmented image data. Finally, the extracted feature vectors are fed into a support
vector machine for binary classification. The results show that the accuracy of style
classification after segmentation is better than without segmentation, and classification after
random walk processing also takes less time.

Keywords: Commodity image · Clothing style classification · SVM

1 Introduction
As e-commerce continues to expand and develop, the national e-commerce transaction
volume reached 34.81 trillion yuan in 2019 [1]. Figure 1 shows the market size and growth of
China's apparel e-commerce industry over the past five years. The data shows that although
the growth rate of China's clothing e-commerce has been unstable over this period, the total
scale keeps increasing year by year. Buying clothing products on the Internet has clearly
become a very popular practice, and the contemporary public has a definite pursuit of
personality and style in clothing. Therefore, research on clothing style classification is very
helpful for consumers retrieving their ideal style of clothing on e-commerce platforms.

In the clothing product display pictures of major e-commerce platforms, most clothing is
shown worn on models to better display its effect, often against an exquisite background that
enhances the overall visual impression. As a result, some elements in the image are unrelated
to the clothing, and these elements introduce errors when classifying the style of the clothing
region, which affects the classification accuracy.


Therefore, this article first segments the region of interest in the product image to extract
the target region. Secondly, the HOG features of the ROI region are extracted. Then the
extracted data is fed to the classifier as input. Finally, the product image style classification
is completed. Experimental results show that the model can improve the accuracy of clothing
style classification and provide a reference for applications of clothing style classification.

[Figure: statistics of the market size (100 million yuan) and growth rate (%) of China's
apparel e-commerce industry, 2015–2019]

Fig. 1. A statistical chart of the market size and growth of China’s apparel e-commerce industry
in the past five years

2 Current Status of Related Research


2.1 Research Status of Image Segmentation Based on Random Walk Algorithm
The random walk algorithm has been widely used in many fields, such as image segmentation,
image retrieval, and image filtering. Its complete mathematical foundation makes applications
of the algorithm reliable.
Leo Grady [2] combined the random walk model with graph theory for the image segmentation
problem and proposed a new multi-label interactive segmentation method applicable to any
graph and dimension. Zhang Xin et al. [3] studied how to segment the lung tumor target area
in CT images based on the random walk algorithm; the solid mathematical foundation of the
algorithm gives it clear advantages in image segmentation. Wei et al. [4] combined the random
walk algorithm with a graph cut algorithm for joint segmentation and obtained more accurate
results than before the improvement: their joint distribution map draws on the advantages of
both PET and CT images, transforms the energy functions of the PET and CT sub-images, and
obtains the optimal segmentation by minimizing the energy function. Bagci et al. [5]
established a super-modular formulation and applied the random walk algorithm to
multi-modal images to segment images of different modalities. Dago et al. [6] proposed using
a random walk algorithm for tumor image segmentation when segmenting PET lung tumor
images. When completing the segmentation task, the traditional random walk algorithm relies
on manually selected seed points, so the selection of seed points affects the segmentation
results: the position and total number of seed points change the segmentation effect.
Nevertheless, the traditional random walk method is very simple to implement. When Liu
Guocai et al. [7] studied the random walk algorithm, they improved on the disadvantage of
manually selecting seed points by using region growing to select seed points automatically,
which greatly improved the segmentation speed.

2.2 Research Status Based on HOG Feature Extraction

Compared with other feature descriptors, the HOG descriptor has unique advantages: it is
largely invariant to geometric transformations and optical deformations of the image. It has
therefore been widely used in object detection, pedestrian detection, facial expression
recognition, and other areas, and has also seen a smaller range of applications on clothing
product images.
Firmino et al. [8] proposed a system for detecting and diagnosing nodules in CT images:
characteristic values of pulmonary nodules are extracted by combining the histogram of
oriented gradients with the watershed algorithm and fed into a rule-based classifier and an
SVM classifier to complete the binary classification of benign and malignant nodules.
Xiaofei Chao et al. [9] proposed using local binary patterns combined with HOG to retrieve
clothing images. Sun Shuangchen [10] proposed a retrieval model based on automatic
learning of fashion tags to bridge the gap between user query intention and system retrieval
results; it extracts the underlying features with HOG and matches them with SVM to complete
a clothing fashion retrieval system. Tian Xianxian et al. [10, 11] used Fisher features to
improve the HOG feature structure: image blocks generated by HOG are discarded or retained
according to certain criteria to realize the MultiHOG feature. Sun et al. [12] proposed a
multi-pedestrian tracking method whose core combines an enhanced online tracker, the HOG
feature descriptor, and a particle filter framework, with good results; after the improvement,
the tracked object can be better described. When a tracking target is identified with the above
method, a support vector machine is used, and the dynamically fused outputs are fed into the
particle filter framework to detect similar objects. Pan Bin et al. [13] proposed using HOG
features to realize automatic hop picking, with KSVM used to classify the hop images,
achieving good results. Madhogaria et al. [13, 14] proposed a novel vehicle recognition
method fusing HOG features and an MRF for vehicle detection in aerial digital images.
Alsahwa et al. [15] extracted HOG features and fed them into an SVM classifier in a marine
biometric recognition system to solve the target recognition task in that setting.

2.3 Research Status Based on Support Vector Machine Classifier

Support vector machines have been applied in many fields, such as medicine, vehicle
transportation, text recognition, human body part recognition, and human behavior detection.
They have advantages on data sets with small sample sizes and in nonlinear and
high-dimensional pattern recognition applications. There are also research achievements in
the field of image classification.
Keping Wang et al. [16] improved the feature-weighted support vector machine and proposed
a new image semantic classification method: since the relevance of each feature is related to
its degree of dispersion, larger weights are assigned to relevant features and irrelevant
features are eliminated, and the SVM classifier is then trained with the weighted features to
solve the image classification problem. When Gao Han [17] used the SVM model for image
classification, he proposed replacing the kernel function with a feature distance as the
similarity measure; adopting this new kernel selection method in the SVM model greatly
improved its generalization ability. Ryu et al. [18] researched the classification of handwritten
documents, the core of which is a structured-learning-based SVM: structured learning
methods determine the SVM parameters, and a relaxed structured SVM is constructed and
trained so that the optimal parameters can be estimated. Wang Kuiyang et al. [19] constructed
a braking intention model based on support vector machines, which achieved high accuracy
in braking intention recognition. To address the problem of ship recognition rates, Wu
Yingzheng et al. [20] implemented ship image classification based on HOG and SVM models.
Dong Junjie [21] studied image recognition technology for an online clothing similarity
retrieval system: the results of several model experiments were compared to find the most
suitable one, and a clothing image retrieval system was designed and implemented with a
training framework that combines HOG features and SVM.

Enter a clothing Random walk


product image with to segment out
a background clothing area

HOG feature
extraction

Put into the


SVM classifier

Output classification
results

Fig. 2. Overall flow chart


Research on the Style Classification Method 405

were compared to find a more suitable one. A trainer framework combines HOG features
and SVM—designed and implemented a clothing image retrieval system.
To sum up, this paper first adopts the random walk algorithm to segment apparel merchandise
images with complicated backgrounds: the clothing area is marked with foreground seed
points, the rest of the image with background seed points, and after segmentation the target
clothing region is obtained. Secondly, the HOG algorithm extracts features from both the
unsegmented and the segmented image sets. Finally, the extracted HOG features are put into
a support vector machine classifier to distinguish the two styles, sports and business. The
flowchart is shown in Fig. 2.

3 Introduction to Data and Methods


3.1 Data Sources
Because research on clothing style classification is not extensive and no mature public data
set exists, this experiment uses a self-built data set. The picture data were collected from
apparel product images on the Taobao shopping platform and contain product images of two
styles, business suits and sports suits. All images contain some background unrelated to the
clothing.

3.2 Random Walk Algorithm


Grady proposed applying the random walk algorithm to image segmentation in 2004, which
promoted the further development of graph-theoretic segmentation algorithms [22]. A random
walk, the idealized mathematical model of Brownian motion, can be regarded as a diffusion
process. In this algorithm, image pixels are taken as the vertices of a graph, and the similarity
of attributes and features between adjacent pixels defines the weight of the edge between two
vertices. With edge weights defined this way, the probability of a random walk moving from
one node to another can be obtained. The user manually marks foreground and background
seed points, and the chance that a random walk started from a pixel reaches each kind of seed
is represented by a probability value; composing these probability values into a map displays
the resulting assignment. The random walk segmentation algorithm can segment multiple
unconnected regions of interest in a picture, and only a single computation is required to
achieve this.

The image constitutes an undirected weighted graph G(V, E), where a pixel or pixel block is a
vertex v ∈ V of the graph G. All vertices are divided into two sets VM and VU: marked points
are represented by VM, and unmarked points by VU. Any edge e ∈ E ⊆ V × V represents the
adjacency between two neighboring pixels of the image, and the weight W of an edge
measures the feature similarity or difference between pixels.

d_i = \sum_{j} w_{ij}    (1)

In the formula, di is the degree of vertex vi , which means the sum of the weights
of all edges connected to vertex vi . The Laplace matrix constructed according to the
above-defined quantity is denoted as L. After the vertices in the graph are divided into
sets VM and VU, the Laplace matrix is expressed as

L = \begin{pmatrix} L_M & B \\ B^T & L_U \end{pmatrix}    (2)

The solution process of random walk can be transformed into solving the Dirichlet
problem [23]. The core of the problem is to determine the harmonic function u(x, y)
that meets the boundary conditions, so that the Dirichlet integral in Eq. (3) reaches the
minimum.
D[u] = \frac{1}{2}(Au)^T C(Au) = \frac{1}{2} u^T L u = \frac{1}{2} \sum_{e_{ij} \in E} w_{ij} (u_i - u_j)^2    (3)

Image segmentation is thus transformed into the problem of classifying the unmarked points
according to the maximum probability \max_{s \in [1,K]} x_i^s, where K is the number of
class labels. The probabilities are obtained by solving

L_U x^s = -B^T m^s    (4)

where x_i^s is the probability that vertex v_i first reaches a seed with mark s.
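The segmentation step described above can be sketched in a few lines. The following is a
minimal illustration assuming scikit-image's random_walker implementation (the paper does
not name the library it uses); the input file and seed coordinates are placeholders.

```python
# Hypothetical illustration of the random walk segmentation step using
# scikit-image's random_walker; the file name and seed regions below are
# placeholders, since the paper marks seeds manually.
import numpy as np
from skimage import io, color
from skimage.segmentation import random_walker

image = color.rgb2gray(io.imread("suit.jpg"))  # placeholder input image

# Label map: 0 = unmarked, 1 = foreground (clothing) seeds,
# 2 = background seeds.
labels = np.zeros(image.shape, dtype=np.uint8)
labels[120:140, 100:120] = 1   # foreground seed region (placeholder)
labels[5:15, 5:15] = 2         # background seed region (placeholder)

# Solves the Dirichlet problem of Eqs. (3)-(4) and assigns each unmarked
# pixel the label it reaches first with maximum probability.
seg = random_walker(image, labels, beta=130, mode="bf")
clothing_mask = seg == 1
```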

3.3 HOG Algorithm

The Histogram of Oriented Gradients (HOG) feature has been widely used since 2005. It is a
feature descriptor used in image processing and computer vision; Dalal and Triggs precisely
defined it at the CVPR conference [24]. The main idea of the HOG algorithm is to use the
direction and density distribution of gradients to describe the appearance and shape of local
targets in an image, focusing on local region information. The HOG method captures the edge
information of the input image: the image is divided into multiple interconnected cell units,
each cell generates a histogram of oriented gradients, and all histograms are combined and
normalized to describe the edge characteristics of the image.
The steps of the HOG method can be summarized as follows:

(1) Confirm the detection window or detection target;
(2) Use the gamma model to normalize the target image;
(3) Calculate the gradient magnitude and gradient direction of each pixel;
(4) Generate a histogram of gradient directions for each cell unit;
(5) Combine the histograms of all cell units in a block;
(6) Construct the overall HOG feature.

The mathematical model of the HOG algorithm first normalizes and standardizes the image
with gamma equalization. The specific operation is to raise the image values to the power
gamma, equalizing each color channel, as shown in formula (5):

P(x, y) = P(x, y)^{gamma}    (5)

This way, when the processed image suffers from uneven illumination during recognition or
detection, the image can be enhanced under weak illumination so that robustness is improved,
shadows are reduced to a certain extent, and some noise interference is effectively avoided.

The first-order differential template gradient operator is used to obtain the gradient value of
each pixel:

G_m(x, y) = f(x + 1, y) - f(x - 1, y)    (6)

G_n(x, y) = f(x, y + 1) - f(x, y - 1)    (7)

Formula (6) is the gradient component of the image in the horizontal direction, normally
obtained by convolving the image with the gradient operator [-1, 0, 1]. Formula (7) is the
gradient component in the vertical direction, usually obtained by convolving the image P with
the gradient operator [1, 0, -1]^T.
Formula (8) gives the gradient magnitude at a pixel of the image:

M(x, y) = \sqrt{G_m(x, y)^2 + G_n(x, y)^2} \approx |G_m(x, y)| + |G_n(x, y)|    (8)

Formula (9) gives the gradient direction at a pixel of the image:

\alpha(x, y) = \arctan\left( \frac{G_m(x, y)}{G_n(x, y)} \right)    (9)

Here M(x, y) in (8) is the gradient magnitude, and α(x, y) in (9) is the gradient direction.
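As a worked illustration of formulas (6)–(9), the per-pixel gradients can be computed with
simple array shifts. This is a sketch assuming a grayscale image stored as a 2-D NumPy
array, not the paper's actual code; boundary pixels are simply left at zero.

```python
# Sketch of the per-pixel gradient computation of Eqs. (6)-(9) on a
# grayscale image f (2-D float array); borders are left zero.
import numpy as np

def gradient_magnitude_direction(f):
    g_m = np.zeros_like(f)  # horizontal component, Eq. (6)
    g_n = np.zeros_like(f)  # vertical component, Eq. (7)
    g_m[:, 1:-1] = f[:, 2:] - f[:, :-2]   # convolution with [-1, 0, 1]
    g_n[1:-1, :] = f[2:, :] - f[:-2, :]   # convolution with [1, 0, -1]^T
    # Eq. (8): magnitude (the |Gm| + |Gn| approximation is also common)
    m = np.sqrt(g_m ** 2 + g_n ** 2)
    # Eq. (9): direction, following the paper's arctan(Gm / Gn) convention,
    # with arctan2 handling the quadrants and Gn = 0
    alpha = np.arctan2(g_m, g_n)
    return m, alpha
```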

3.4 SVM

Support vector machines were proposed by Corinna Cortes and Vapnik in 1995 [25]. SVM
supports data analysis and pattern recognition in machine learning, and data classification
and regression analysis are regarded as its important strengths. At its core, an SVM is a linear
classifier with the largest margin in the feature space, which distinguishes it from the
perceptron. SVMs can be divided into linearly separable support vector machines, nonlinearly
separable support vector machines [26], and other categories. This article focuses on linearly
separable support vector machines [27], as shown in Fig. 3.

[Figure 3: linearly separable samples with candidate separating lines L1, L2, L3]

Fig. 3. Linearly separable case

The ultimate goal of SVM is to find the optimal hyperplane, that is, the one that maximizes
the distance between the sample points and the hyperplane; it is therefore also called the
maximum-margin hyperplane. Suppose there are N sample points in the training set and the
data set is linearly separable. The feature vector of each point is denoted x_i, and every
sample belongs to one of two classes denoted s_i: class one (denoted s1) and class two
(denoted s2). Formula (10) describes any hyperplane:

g(x) = w^T x + w_0 = 0    (10)

Here w = [w_1, w_2, \ldots, w_n]^T is the weight vector, and w_0 is the threshold or
displacement term, which determines the distance between the hyperplane and the origin.
This linear equation shows that hyperplanes dividing the training samples into two classes
exist but are not unique. Figure 4 illustrates possible hyperplanes.

[Figure 4: candidate hyperplanes, with L3: g(x) = +1, L2: g(x) = 0, L1: g(x) = −1]

Fig. 4. Hyperplane

The distance from a sample point to the hyperplane is given by formula (11):

d = \frac{|g(x)|}{\|w\|}    (11)
Use t_i as the class label. When w and w_0 are normalized, the value of g(x) at the nearest
point of class s1 is t_1 = 1, and at the nearest point of class s2 it is t_2 = -1. Subject to the
condition in formula (12), the minimization is as shown in formula (13):

t_i (w^T x_i + w_0) \ge 1, \quad i = 1, 2, \ldots, N    (12)

C(w) \equiv \frac{1}{2} \|w\|^2    (13)
This yields a convex quadratic programming problem with inequality constraints. The
Lagrangian objective function (14) is constructed, and the Lagrange multiplier method is
used to solve it:

L(w, w_0, \lambda) = \frac{1}{2} w^T w - \sum_{i=1}^{N} \lambda_i \left[ t_i (w^T x_i + w_0) - 1 \right]    (14)
Here the Lagrange multipliers are λ_i, with λ_i ≥ 0 (i = 1, 2, ..., N). Let
θ(w) = \max_{\lambda_i \ge 0} L(w, w_0, \lambda). When a sample point with g(x) < 1 lies
outside the feasible region, it violates the constraints; λ_i can then grow without bound, so
θ(w) is infinite. When g(x) ≥ 1 the point is in the feasible region, and θ(w) equals the
original objective. Combining the two cases gives the new objective function (15):

\theta(w) = \begin{cases} C(w), & x \in \text{feasible region} \\ +\infty, & x \in \text{infeasible region} \end{cases}    (15)
The problem becomes (16):

\min_{w, w_0} \theta(w) = \min_{w, w_0} \max_{\lambda_i \ge 0} L(w, w_0, \lambda)    (16)

By the duality of the Lagrangian, this reduces to (17):

\max_{\lambda_i \ge 0} \min_{w, w_0} L(w, w_0, \lambda)    (17)
The problem is thus transformed into a dual problem containing the inner minimization.
Solving it, and summing over all non-zero support vectors, gives formula (18):

w = \sum_{i=1}^{N} \lambda_i t_i x_i    (18)

Once w is known, w_0 can be obtained, and hence the hyperplane. The support vectors lie on
one of the two hyperplanes in formula (19):

w^T x + w_0 = \pm 1    (19)

4 Experimental Results and Analysis


This experiment uses Python 3.7 in Visual Studio Code to implement product image feature
extraction and style classification. The classical hold-out method is used to divide the data
set: after partitioning, the training set S and test set T satisfy S ∪ T = D and S ∩ T = ∅, that is,
S and T are mutually exclusive [28]. Stratified sampling is adopted so that the proportion of
each category remains unchanged and the data distribution stays consistent: 70% of the data
is assigned to the training set S and 30% to the test set T. There are 176 images in the training
set and 76 in the test set, with each category equally represented.
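A split with these proportions could be reproduced as in the following sketch, assuming
scikit-learn (which the paper does not name); the feature and label arrays are placeholders.

```python
# Hypothetical reproduction of the hold-out split: 70% training, 30% test,
# stratified so both styles keep the same proportion. `features` and
# `labels` stand in for the HOG vectors and style labels of the 252 images.
import numpy as np
from sklearn.model_selection import train_test_split

features = np.random.rand(252, 100)   # placeholder HOG feature vectors
labels = np.repeat([1, 2], 126)       # 1 = sport style, 2 = business style

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, stratify=labels, random_state=0
)
print(len(X_train), len(X_test))  # 176 and 76, matching the paper
```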

4.1 ROI Region Extraction Based on RW Algorithm


To better display clothing products, the clothes are generally shown worn on models, and
depending on the style, product images are also shot in specific scenes. This makes the
background complex and introduces objects or content unrelated to the clothes themselves.
The random walk segmentation algorithm has the advantage of segmenting multiple
unconnected ROI regions in a picture, and a single random walk suffices to achieve a good
segmentation: it can effectively remove background that would otherwise degrade
classification and segment an accurate target area. Therefore, this paper uses the random
walk method to preprocess the clothing product images, removing the background and
keeping only the clothing, so that the classification results can later be compared with those
of unsegmented images. The processing result is shown in Fig. 5.

[Figure 5: business-style and sport-style example images, unprocessed and after random
walk background removal]

Fig. 5. The image result after removing the background

4.2 HOG Feature Extraction


Since HOG operates on cell units of the image, it is not easily affected when the image is
deformed by geometric or optical factors, as these two types of deformation mostly appear
only over larger regions. The HOG method describes the structural features of the image's
edge gradients, so the resulting edge features provide local shape information. Geometric
changes such as translation and rotation are partially suppressed by quantizing position and
direction, and combining the histograms of local areas reduces the impact of illumination. In
addition, dividing the image into blocks and cell units helps describe the relationship
between local pixels, so that the edge features of objects are better extracted. In this
experiment, the pictures are resized in advance to 256 * 256 pixels, the number of histogram
bins in each cell is 12, with the 180° range divided into 12 parts. HOG first takes an 8 * 8
area as a cell unit and then groups 4 * 4 cell units to form a block. Since each pixel
contributes both the magnitude and direction of the
gradient, a small cell contains 128 values (8 * 8 * 2 = 128). Each block is obtained
by HOG through a sliding window, and the feature descriptor of the entire image is obtained
by combining all blocks. The result of HOG feature extraction is shown in Fig. 6.

Fig. 6. The result of HOG feature extraction
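With scikit-image's hog function, the configuration described above might look as follows;
whether the paper used this library is not stated, and the block normalization scheme is an
assumption.

```python
# Sketch of the HOG configuration described above using scikit-image:
# 12 orientation bins over the unsigned 0-180° range, 8x8-pixel cells,
# and 4x4 cells per block on a 256x256 image. The image is a placeholder.
import numpy as np
from skimage.feature import hog

img = np.random.rand(256, 256)  # placeholder grayscale clothing image
features = hog(
    img,
    orientations=12,
    pixels_per_cell=(8, 8),
    cells_per_block=(4, 4),
    block_norm="L2-Hys",  # assumed; the paper does not specify
)
```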

4.3 Classification Based on SVM Product Image Style

SVM has clear advantages on small data sets. The weight of each feature shows its
importance, making the features interpretable. Another advantage of SVM is that it does not
require a large number of tuning
parameters. Many experiments on clothing image retrieval by Dong Junjie [21] show
that the method of HOG+SVM is better than the method of LBP, GIST, and SIFT+BOW.
Therefore, this experiment uses the combination of HOG features and support vector
machines to solve the problem of clothing product image style classification.
This paper mainly studies a binary classification problem: distinguishing business style from
sports style. Following the method above, two sets of feature values, from the original
images and from the segmented images, are obtained as input to the SVM classifier. First,
the feature values of business-style and sports-style clothing before segmentation are used
as the experimental input. The specific procedure is as follows. Input: the gradient magnitude
and gradient direction features of the image pixels extracted by HOG. Output: the label
obtained after image classification, where sports style is marked "1" and business style is
marked "2". When the segmented image features are used as input, the steps are the same
as above.
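The classification step could be sketched as follows with scikit-learn's SVC; the kernel is not
specified in the paper, so a linear kernel is assumed, and the data arrays come from the split
sketched earlier.

```python
# Sketch of the SVM classification and evaluation step; the kernel choice
# is an assumption, and X_train/X_test/y_train/y_test come from the
# hold-out split sketched in Sect. 4. Labels: 1 = sport, 2 = business.
import time
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

clf = SVC(kernel="linear")
start = time.time()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
elapsed = time.time() - start

print("Accuracy:", accuracy_score(y_test, y_pred))  # evaluation metric (1)
print("Time consumed:", elapsed)                    # evaluation metric (2)
```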
After the classification result label is obtained, this experiment uses the following
two experimental evaluation indicators for evaluation:

(1) Accuracy = number of correct style classification marks/total.


(2) Time-consuming.

The experimental results are shown in Table 1.

Table 1. Comparison of experimental results

                 Original picture   Picture after RW processing
Accuracy         57.5%              70.0%
Time consumed    232.2              162.1

From the data shown in the table, the classification accuracy of pictures without random walk
background separation is 57.5%, while the accuracy after separating the background reaches
70%. The segmentation evidently removes the interfering background factors, improves the
accuracy of style classification for the main part of the picture, and reduces the time cost.

5 Conclusion
From the experimental results it can be concluded that the accuracy of clothing style
classification after removing the background is better than that of unprocessed product
images. Therefore, when categorizing clothing product styles, using appropriate methods to
remove the influence of irrelevant background factors helps achieve more accurate
classification. This article makes a contribution to the style classification of clothing in
e-commerce. Possible improvements include training the model on a suitable public data set
and transferring it to the self-built data set. This article only extracts HOG edge features;
fusing texture and other features could be considered. In addition, the amount of data is
limited, so data augmentation could be used to enlarge the data set and improve the
experimental results. Finally, more accurate segmentation and a convenient automatic
seed-selection mode could be applied.

Acknowledgments. Fund Project: Philosophy and Social Science Research Planning Project of
Heilongjiang Province (20GLE393).

References
1. 2019 China E-Commerce Report. http://dzsws.mofcom.gov.cn/article/ztxx/ndbg/202007/20200702979478.shtml
2. Grady, L.: Random walks for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell.
28(11), 1768–1783 (2006)
3. Zhang, X., Wang, J., Kong, H., et al.: CT image lung tumor segmentation based on random
walk algorithm. J. Hebei Univ. (Nat. Sci. Edn.) 39(3), 311–322 (2019)
4. Wei, J., Xiang, D., Zhang, B., Wang, L., Kopriva, I., Chen, X.: Random walk and graph cut
for co-segmentation of lung tumor on PET-CT images. IEEE Trans. Image Proces. 24(12),
5854–5867 (2015)
5. Bagci, U., et al.: Joint segmentation of anatomical and functional images: applications in
quantification of lesions from PET, PET-CT, MRI-PET, and MRI-PET-CT images. Med.
Image Anal. 17(8), 929–945 (2013)
6. Dago, P.O., Ruan, S., Gardin, I., et al.: 3D random walk based segmentation for lung tumor
delineation in PET imaging. In: IEEE International Symposium on Biomedical Imaging, New
York (2012)
7. Liu, G., Hu, Z., Zhu, S., et al.: Random walk method for segmentation of head and neck tumor
PET images. J. Hunan Univ. Nat. Sci. Edn. 43(2), 141–149 (2016)
8. Firmino, M., Angelo, G., Morais, H.: Computer-aided detection (CADe) and diagnosis
(CADx) system for lung cancer with likelihood of malignancy. Biomed. Eng. Online 15(1), 2
(2016)
9. Chao, X., Huiskes, M.J., Gritti, T., Ciuhu, C.: A framework for robust feature selection for
real-time fashion style recommendation. In: Proceedings of the 1st International Workshop
on Interactive Multimedia for Consumer Electronics, pp. 35–42. ACM (2009)

10. Sun, S.: Design and implementation of a visual fashion product search engine based on hot
labeling. Sun Yat-sen University (2012)
11. Tian, X., Bao, H., Xu, C.: A pedestrian detection algorithm with improved HOG features.
Comput. Sci. 9, 320–324 (2014)
12. Sun, L., Liu, G., Liu, Y.: Multiple pedestrians tracking algorithm by incorporating histogram
of oriented gradient detections. IET Image Process. 7(7), 653–659 (2013)
13. Pan, B., Chen, W., Yao, Y.: Research on image classification algorithm in hop image
classification. Autom. Instrum. 2, 186–191 (2021)
14. Madhogaria, S., Baggenstoss, P., Schikora, M., Koch, W., Cremers, D.: Car detection by
fusion of HOG and causal MRF. IEEE Trans. Aerosp. Electron. Syst. 51(1), 575–590 (2015)
15. Alsahwa, B., Maussang, F., Garello, R.: Marine life airborne observation using HOG and
SVM classifier. In: Proceedings of OCEANS 2016 MTS/IEEE Monterey, pp. 1–5 (2016)
16. Wang, K., Wang, X., Zhong, Y.: A weighted feature support vector machines method for
semantic image classification. In: International Conference on Measuring Technology and
Mechatronics Automation, pp. 377–380 (2010)
17. Gao, H.: Image classification algorithm based on SVM. Jilin University (2019)
18. Ryu, J., Koo, H.I., Cho, N.I.: Word segmentation method for handwritten documents based
on structured learning. IEEE Sig. Process. Lett. 22(8), 1161–1165 (2015)
19. Wang, K., He, R.: Recognition method of braking intention based on support vector machine.
J. Jilin Univ. (Eng. Technol. Edn.). https://doi.org/10.13229/j.cnki.jdxbgxb20210187
20. Wu, Y., Yang, L.: Ship Image classification algorithm based on HOG and SVM. J. Shanghai
Inst. Ship Transp. Sci. 42(1), 58–64 (2019)
21. Dong, J.: Design and implementation of clothing image retrieval system based on HOG and
SVM. Sun Yat-sen University (2014)
22. Grady, L., Funka-Lea, G.: Multi-label image segmentation for medical applications based
on graph-theoretic electrical potentials. In: Sonka, M., Kakadiaris, I.A., Kybic, J. (eds.)
CVAMIA/MMBIA-2004. LNCS, vol. 3117, pp. 230–245. Springer, Heidelberg (2004).
https://doi.org/10.1007/978-3-540-27816-0_20
23. Jin, H., Bian, K.: Detection of moisture content in wheat flour by near infrared spectroscopy.
J. Chin. Cereals Oils Assoc. 25(8), 109–112 (2010)
24. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings
of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR),
CA, USA, pp. 886–893 (2005)
25. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
26. Luo, K.: Research on CT image feature extraction and SVM classification of lung nodules.
Xihua University (2012)
27. Fan, X.: Research and application of support vector machine algorithm. Zhejiang University
(2003)
28. Zhang, X.: Research on facial expression recognition method based on deep learning.
Shanghai University of Engineering Technology (2020)
Topological Feature Analysis of RGB Image
Based on Persistent Homology

Jian Ma1,2 , Lizhi Zhang1,2 , Huadong Sun1,2 , and Zhijie Zhao1,2(B)


1 Harbin University of Commerce, Harbin 150028, China
zhaozj@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. In recent years, big data technology and its related research have become a hotspot
in various fields, and applying persistent homology to analyze all kinds of big data is one of
the research methods attracting much attention. This paper studies the application of
persistent homology to the qualitative analysis of RGB three-channel images: the topological
invariant features of an image are extracted and written into the persistence diagram. The
differences and similarities between the persistence diagrams of different images are
measured by the Wasserstein distance: similar images have a smaller distance, while different
images have a larger one. This method can better distinguish images with similar topologies.

Keywords: Persistent homology · Topological features · RGB · Wasserstein


distance · Similarity measure

1 Introduction
Topological Data Analysis (TDA) [1] draws on data analysis, algebraic topology,
computational geometry, computer science, statistics, and other related fields. The main goal
of TDA is to study the qualitative characteristics of data using ideas from geometry and
topology. The persistent homology theory within TDA is an algebraic method for studying the
topological properties of a space; Frosini and Landi [2] proposed this concept when studying
size theory in 1999. Nowadays, persistent homology is widely used in computer science,
medicine, social science, civil engineering, and other fields, and it has made great progress in
image processing and target recognition. Zhang and Ju [3] used persistent homology group
computations on simplicial complexes to classify and recognize images: by computing the
homology of a simplicial complex, the corresponding barcode is obtained, and from the
barcode the topological features and corresponding geometric structure of the image are
derived. This paper further processes the color information on the basis of that method, and
the result is good for distinguishing more complex and colorful images.


2 The Basic Principle of Persistent Homology


2.1 Simplicial Complex

Persistent homology approximates the original data linearly with a series of simplicial
complexes, recording the local differences during the approximation. TDA uses the concept
of the simplex to build simplicial complexes; simplices generalize the triangle to different
dimensions. A 0-simplex is a point, a 1-simplex is a line segment, a 2-simplex is a filled
triangle, and a 3-simplex is a tetrahedron. For a finite set of vertices in which the largest set
of independent vectors between vertices has rank n, the n-dimensional simplex V can be
expressed as:

V = \{v_i,\ i = 0, 1, \ldots, n\}    (1)

2.2 Filtration

Persistent homology is an important research direction in topological data analysis. It focuses
on the topological invariants of point sets in multi-dimensional data structures and studies
their qualitative characteristics at multiple scales. This paper uses persistent homology to
extract feature descriptors of the five-dimensional model, and an important step is to
construct a complex filtration for it. Taking a two-dimensional model as an example, the main
process is as follows: each point of the two-dimensional point cloud is surrounded by a
sphere of diameter d, which gradually increases from 0; whenever the distance between two
points becomes smaller than the current diameter d, the two points are connected, forming a
dynamic network structure that changes over time. In this process, simplices fill the whole
changing structure, finally forming a series of nested simplicial complexes called a complex
filtration [4], as shown in Fig. 1.

Fig. 1. Complex filtration



While building the complex filtration, one can observe that the two-dimensional point cloud
is continuously filled in; at the same time, many holes bounded by simplices appear, and
these holes are created and destroyed as the diameter of the spheres changes. In this paper,
the change of the sphere's diameter is taken as the measurement scale: the diameter at which
each hole appears and disappears is recorded and used as the topological structure of the
two-dimensional model over a multi-scale range. As the complex filtration is constructed, the
two-dimensional point cloud exhibits obvious topological structures. Some are filled in after
a short time and are regarded as noise and disturbance, while others persist for a long time
and are regarded as stable topological features. These stable features describe the intrinsic
characteristics of the two-dimensional model and serve as its feature descriptors. This is the
whole process of obtaining persistent homology [5] by constructing a complex filtration.

2.3 Introduction to Vietoris-Rips Complex


In persistent homology, commonly constructed complex filtrations include the Vietoris-Rips
complex, the Witness complex, the Čech complex, and the Alpha complex. In this paper, the
Vietoris-Rips complex is used to construct the five-dimensional RGB point cloud topology.

The Vietoris-Rips complex is the most common complex among the various filtration types.
Its construction is as follows: from a point set P in high-dimensional space, the complex at
filtration scale λ contains every simplex whose vertices are pairwise within distance λ [4].
Formally, the Vietoris-Rips complex is given by formula (2):

V_\lambda(P) = \{\sigma \mid d(\mu, \nu) \le \lambda,\ \forall \mu \ne \nu \in \sigma\}    (2)

Write the Vietoris-Rips complex as VR(X, r) for a point set X = \{x_1, x_2, \ldots, x_n\}
with n = |X|, where r is the filtration value at a given moment. The construction process [6] is
as follows:

(1) Add points: for every point x ∈ X, x ∈ VR_0(X, 0).
(2) Add the one-dimensional skeleton: the 1-simplex [x_i, x_j] exists in VR_1(X, r) if and
only if d(x_i, x_j) ≤ r.
(3) Define VR(X, r) as the largest simplicial complex containing VR_1(X, r); in other words,
a simplex [x_0, x_1, \ldots, x_k] belongs to VR(X, r) if and only if all of its edges exist in
VR_1(X, r).
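As one concrete way to realize this construction, the sketch below uses the GUDHI library;
the paper itself mentions javaPlex and Ripser, so this is only an illustration, and the point
array is a placeholder.

```python
# Illustrative Vietoris-Rips filtration with the GUDHI library (not the
# paper's own toolchain); `points` is a placeholder (n, d) array of
# sampled five-dimensional pixels.
import numpy as np
import gudhi

points = np.random.rand(50, 5)  # placeholder point cloud

rips = gudhi.RipsComplex(points=points, max_edge_length=2.0)
# Build simplices up to dimension 2 so one-dimensional holes can appear.
simplex_tree = rips.create_simplex_tree(max_dimension=2)
diagram = simplex_tree.persistence()  # list of (dim, (birth, death)) pairs
```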

3 Image Feature Extraction


3.1 Extraction of the Five-Dimensional Information Space

For an RGB color image, each pixel is represented by the three primary colors red (R), green
(G), and blue (B) [7]. Combining each pixel's horizontal and vertical coordinates with its RGB
components forms a five-dimensional Euclidean subspace, and we use the Euclidean distance
as the distance measure for persistent homology:

d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}    (3)

Therefore, the analysis of a color image becomes the analysis of a five-dimensional Euclidean
subspace. We divide it into the following steps:

(1) Extract the image information;
(2) Sample points;
(3) Construct a simplicial complex;
(4) Compute persistent homology, obtain the persistence diagram, calculate the distance
between two images with the Wasserstein distance algorithm, and determine in turn whether
they belong to the same category.

Traditional persistent homology is mainly applied to two-dimensional point cloud data,
whose coordinates x and y represent position information, so adjacent points are at distance 1.
After adding the three RGB components, each ranging from 0 to 255, the continuously
growing radius ε is strongly affected by the RGB components during the persistence
computation, especially when the image is small and adjacent points are close together. For
example, in Fig. 2, select two pixels that are close in position but very different in color.
Their five-dimensional data are (609, 303, 222, 223, 218) and (631, 303, 40, 39, 44).
Applying the Euclidean distance formula directly, the distance between the two points is
312.634, whereas considering only the horizontal and vertical coordinates, the distance is 22.

Fig. 2. Two-pixel data of scissors

Clearly, when the color difference between two pixels is large, the RGB terms dominate and
strongly affect the persistence computation. Therefore, we weight the three RGB components,
with the weight based on the image size. The weight formula is (4):

min(length, width)/2550 (4)



Here min(length, width) denotes the smaller of the image's length and width. Since the
maximum RGB value is 255, the RGB weight becomes 1 when the smaller image dimension
reaches 2550, which is ten times 255. The image in Fig. 2 has size 972 * 442, the two
weighted five-dimensional points are (609, 303, 43.61, 43.80, 42.82) and (631, 303, 7.85,
7.66, 8.64), and their distance is 65.09, greatly reduced compared with the unweighted
distance. Although this is still much larger than the positional distance alone, two points
close to each other will no longer both be chosen as landmark points. The algorithm thus
retains some color information while giving priority to distance information.
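A sketch of assembling the weighted five-dimensional point cloud from an RGB image,
following Eq. (4); the array layout is an assumption.

```python
# Sketch of building the weighted five-dimensional point cloud
# (x, y, w*R, w*G, w*B) from an RGB image, with the weight of Eq. (4).
import numpy as np

def weighted_point_cloud(img):
    """img: (H, W, 3) uint8 RGB array -> (H*W, 5) float array."""
    h, w = img.shape[:2]
    weight = min(h, w) / 2550.0  # Eq. (4)
    ys, xs = np.mgrid[0:h, 0:w]
    rgb = img.reshape(-1, 3).astype(float) * weight
    return np.column_stack([xs.ravel(), ys.ravel(), rgb])
```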

3.2 Color Complexity


Because the image data set is large, if every data point were taken as a vertex, the filtration
would soon contain too many simplices for effective computation. Selecting a limited number
of data points that preserve the original topological features as much as possible while
simplifying the computation is therefore an important issue. In this paper, sampling adopts
the sequential max-min landmark selection method [8] in javaPlex. The number of selected
points is also important for the computation, that is, how many data points suit what kind of
image. Compare the two pictures in Fig. 3.

(a) flower (b) Wrench


Fig. 3. Wrench and flower

Image complexity, regional complexity, and object complexity are three scales of description.
Picture (a) has richer color information, while picture (b) lacks color information; in other
words, picture (a) is more complicated and picture (b) is simpler. For more complicated image
information, more data points are needed to represent the original image. Therefore, we
introduce image complexity [9] as a measure: it can be described from the overall
perspective, the regional perspective, or the target perspective. This article mainly studies the
overall description of image complexity.

Image complexity was originally defined for grayscale images. It comprehensively considers
which gray levels appear, the number of pixels at each gray level, and the spatial distribution
of pixels, together with the appearance of the target object. The gray levels present and the
number of pixels at each level can be described by information entropy [10].

The information entropy H is defined as:

H = -\sum_{i=1}^{k} \frac{n_i}{N} \log \frac{n_i}{N}    (5)

Here k is the number of gray levels, n_i is the number of pixels at gray level i, and N is the
total number of pixels. We mainly consider color complexity, obtaining the color image
information entropy by analogy with the grayscale case: the gray levels are simply replaced
by color levels. However, grayscale images have 256 levels in total, while color images have
256^3 color levels; this magnitude is too large for computation, an image almost never
contains all color levels, and in many cases the color levels need not be divided so finely. As
shown in Fig. 2, although the black parts of the scissors have slightly different RGB
components, this does not disproportionately affect the overall topology. Similar colors can
therefore be treated as one color level. In this paper, 64 adjacent colors (four values per RGB
channel combined) are regarded as one color level, and the number of selected points for
each image is H × 10.
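The color-level entropy of Eq. (5) could be computed as in the sketch below; the exact
quantization step (four values per channel per level) and the logarithm base are my reading
of the text, not stated explicitly by the paper, and the image is a placeholder.

```python
# Sketch of the color-level entropy of Eq. (5): each RGB channel is
# quantized by 4 so that 64 adjacent colors form one color level (an
# assumed reading of the text); the landmark count is then H * 10.
import numpy as np

def color_entropy(img):
    """img: (H, W, 3) uint8 RGB array -> entropy over color levels."""
    q = (img // 4).reshape(-1, 3).astype(np.int64)  # 64 bins per channel
    codes = (q[:, 0] * 64 + q[:, 1]) * 64 + q[:, 2]
    counts = np.bincount(codes)
    p = counts[counts > 0] / codes.size  # probability of each level present
    return float(-(p * np.log(p)).sum())  # natural log assumed

img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)  # placeholder
num_landmarks = int(color_entropy(img) * 10)
```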

4 Improved Feature Descriptor


4.1 Persistence Diagram

We choose the persistence diagram [11] to represent topological features. As shown in Fig. 4,
the abscissa corresponds to the starting position of a line in the barcode diagram, indicating
when a hole is created, and the ordinate corresponds to the end position, indicating when the
hole disappears. Points far from the diagonal indicate stable topological features; points close
to the diagonal indicate features that exist only briefly and are unstable.
This paper mainly considers one-dimensional topological features, because the
one-dimensional Betti number better describes the internal topological characteristics of the
data. When substituting the information of the persistence diagram into the Wasserstein
distance computation, it must be normalized, chiefly to eliminate the influence of image size
on the distance, since topological features are insensitive to scale transformations. We use
the Ripser software package to extract topological features. When Ripser draws a persistence
diagram, a maximum abscissa X_max is generated according to the features' death points;
for example, X_max in Fig. 4 is 15. The feature descriptor is then (birth, death)/X_max.

Fig. 4. Persistence diagram

Figure 5 is a representation of the unnormalized and normalized feature descriptors


of Fig. 4.

(a) unnormalized (b) normalized


Fig. 5. The feature descriptor
422 J. Ma et al.

4.2 Persistence Diagram Information Improvement

When calculating the Wasserstein distance from persistence diagrams, the non-negativity of
distances means that the number of birth-death points in each diagram strongly influences the
result. In general, a pair of images with more feature descriptors yields a larger distance than
a pair with fewer descriptors, but a larger distance caused merely by more descriptors does
not imply a larger topological gap. Conversely, when the numbers of feature descriptors of
two images differ greatly, their structures do differ greatly. Therefore, we multiply the
computed distance by a coefficient max(i, j)/min(i, j), where i and j are the numbers of
feature descriptors of the two diagrams. When the two numbers are equal, the coefficient is 1;
the greater the difference between them, the greater the coefficient and the greater the
distance.

5 Numerical Experiment
5.1 Wasserstein Distance Introduction

Wasserstein distance [12] was first used to compare two histograms and later to describe the
distance between two probability distributions. It can be understood as the minimum
transportation cost of moving mass from one place to another. It is defined in formula (6),
where X and Y are random variables in d-dimensional space following the distributions μ and
ν, p denotes the L_p norm, and inf denotes the infimum. In the method of this paper, p is
taken to be 1.

W_p(\mu, \nu) = \inf_{X \sim \mu,\ Y \sim \nu} \left( \mathbb{E}\, \|X - Y\|_p^p \right)^{1/p}, \quad p \ge 1    (6)
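Putting Sects. 4 and 5.1 together, the distance computation might be sketched as follows
with the ripser and persim Python packages (the paper states that Ripser is used for the
diagrams; persim is an assumption for the Wasserstein step). The normalization by X_max
and the max/min coefficient follow Sects. 4.1 and 4.2, and the point clouds are placeholders.

```python
# Sketch of the distance pipeline of Sects. 4-5 with ripser and persim;
# `cloud_a` and `cloud_b` stand for two weighted 5-D point clouds.
import numpy as np
from ripser import ripser
from persim import wasserstein

def normalized_h1_diagram(cloud):
    dgm = ripser(cloud, maxdim=1)["dgms"][1]  # 1-dimensional features
    return dgm / dgm[:, 1].max()              # (birth, death) / X_max

cloud_a = np.random.rand(60, 5)  # placeholder sampled pixel clouds
cloud_b = np.random.rand(60, 5)

d_a, d_b = normalized_h1_diagram(cloud_a), normalized_h1_diagram(cloud_b)
i, j = len(d_a), len(d_b)
# Sect. 4.2 coefficient max(i, j)/min(i, j) compensates for differing
# numbers of feature descriptors.
distance = wasserstein(d_a, d_b) * (max(i, j) / min(i, j))
```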

5.2 Experimental Results

First, the sampling points of each image are selected according to its complexity, and
persistent homology is then computed. The persistence diagram is obtained from the
persistence information, and finally the distance between persistence diagrams is calculated
from the diagram information. Figures 6 and 7 show two images and the resulting persistence
diagrams:

(a) Butterfly (b) Butterfly persistence diagram


Fig. 6. Butterfly image and its persistence map

(a) Scissor (b) Scissor persistence diagram


Fig. 7. Scissor image and its persistence map

The image data used in this experiment all come from the Internet, and images with pure
white backgrounds are used. Figure 8 shows 16 randomly selected pictures of different sizes,
including butterflies, scissors, rings, and dragonflies.

Tables 1, 2, 3, and 4 give the data of the comparison experiments on the images in Fig. 8.
Table 1 shows the Wasserstein distance between the 4 butterfly pictures and the 4 scissors
pictures in Fig. 8 after introducing color complexity. Table 2 shows the Wasserstein distance
between butterflies and scissors without introducing color complexity, with 100 sampling
points selected for each image. Displaying all four categories would require a 16×16 table,
which is inconvenient here; Table 3 and Table 4 therefore show the average distances
between the four categories of images with and without complexity, respectively.

Fig. 8. Experimental images

Table 1. Butterfly-scissors distances with color complexity introduced.

       S(1)   S(2)   S(3)   S(4)   B(1)   B(2)   B(3)   B(4)
S(1)   –      2.78   4.91   3.42   18.06  4.85   6.74   10.84
S(2)   2.78   –      6.90   3.92   21.55  8.23   9.68   11.82
S(3)   4.91   6.90   –      7.94   31.8   18.28  17.53  20.18
S(4)   3.42   3.92   7.94   –      18.07  6.49   9.02   12.12
B(1)   18.06  21.55  31.8   18.07  –      3.79   4.05   3.17
B(2)   4.85   8.23   18.28  6.49   3.79   –      2.04   5.39
B(3)   6.74   9.68   17.53  9.02   4.05   2.04   –      2.57
B(4)   10.84  11.84  20.18  12.12  3.17   5.39   2.57   –

As can be seen from Tables 1 and 3, after introducing the weighted color space and color
complexity, the distance between similar images is smaller and the distance between different
images is larger, which works well for comparing the differences and similarities between
images. However, the data in Tables 2 and 4 are

Table 2. Butterfly-scissors distances without color complexity introduced.

       S(1)   S(2)   S(3)   S(4)   B(1)   B(2)   B(3)   B(4)
S(1)   –      3.82   3.57   3.43   1.84   4.72   3.92   4.34
S(2)   3.82   –      4.66   3.93   6.10   8.51   6.65   7.58
S(3)   3.57   4.66   –      7.00   4.18   2.36   1.55   2.09
S(4)   3.43   3.93   7.00   –      3.63   6.27   5.88   5.65
B(1)   1.84   6.10   4.18   3.63   –      3.22   3.29   3.01
B(2)   4.72   8.51   2.36   6.27   3.22   –      1.65   1.42
B(3)   3.92   6.65   1.55   5.88   3.29   1.65   –      1.38
B(4)   4.34   7.58   2.09   5.65   3.01   1.42   1.38   –

Table 3. Average distance between the four types of images with complexity.

            Butterfly   Dragonfly   Ring     Scissor
Butterfly   3.298       6.738       11.981   11.058
Dragonfly   6.738       2.748       5.406    4.593
Ring        11.981      5.406       2.788    5.011
Scissor     11.058      4.593       5.011    4.978

Table 4. Average distance between the four types of images without complexity.

            Butterfly   Dragonfly   Ring    Scissor
Butterfly   2.328       4.316       7.258   4.704
Dragonfly   4.316       4.402       6.149   2.945
Ring        7.258       6.149       2.658   5.192
Scissor     4.704       2.945       5.192   4.402

not so distinct between same-category and different-category images, and the effect is poor.
The improved feature descriptor better addresses this problem in the Wasserstein distance
calculation.

6 Conclusion
This paper processes the RGB space and introduces color complexity to measure the color
richness of an image, defining how much information the image contains: the more complex
the image, the richer its information. The color information is then combined with the
topological information, and similarity and difference are measured by calculating the
Wasserstein distance between two images, so the distance depends on both the topological
features and the color information. Compared with traditional topological feature extraction
methods, this measurement better magnifies the differences between images of different
types.

Acknowledgments. This paper is supported by the Heilongjiang Provincial Natural Science
Foundation of China (LH2020F008).

References
1. Edelsbrunner, H., Harer, J.: Computational Topology: An Introduction, pp. 23–47. American
Mathematical Society (2010)
2. Frosini, P., Landi, C.: Size theory as a topological tool for computer vision. Pattern Recogn.
Image Anal. 9(4), 596–603 (1999)
3. Zhang, J., Ju, X.: Application of persistent homology in image classification and recognition.
J. Appl. Comput. Math. 31(04), 494–508 (2017)
4. Atienza, N., Gonzalez-Diaz, R., Rucco, M.: Persistent entropy for separating topological
features from noise in Vietoris-Rips complexes. J. Intel. Inf. Syst. 52(3), 637–655 (2019)
5. Carrière, M., Oudot, S.Y., Ovsjanikov, M.: Stable topological signatures for points on 3D
shapes. Comput. Graph. Forum 34(5), 1–12 (2015)
6. Zomorodian, A.: Fast construction of the Vietoris-Rips complex. Comput. Graph. 34(3),
263–271 (2010)
7. Liu, S., Chen, Y., Wang, H.: An improved hybrid encryption algorithm for RGB images. Int.
J. Adv. Sci. Technol. 95, 37–44 (2016)
8. De Silva, V., Carlsson, G.: Topological estimation using witness complexes. In: Proceedings
of the 1st Eurographics Conference on Point-Based Graphics, pp. 157–166. Eurographics
Association, Aire-la-Ville, Switzerland (2004)
9. Mario, I., Chacon, M., Alma, D., et al.: Image complexity measure: a human criterion free
approach. In: Proceedings of IEEE Annual Meeting of the North American Fuzzy Information
Processing Society, Detroit, MI, USA, pp. 241–246 (2005)
10. Gao, T., Li, F.: Using persistent homology to represent online social network graphs. In:
2017 IEEE 14th International Conference on Mobile Ad Hoc and Sensor Systems (MASS),
pp. 555–559. IEEE Computer Society (2017)
11. Adams, H., Chepushtanova, S., Emerson, T., et al.: Persistence images: a stable vector
representation of persistent homology. J. Mach. Learn. Res. 18, 1–35 (2017)
12. Panaretos, V., Zemel, Y.: Statistical aspects of Wasserstein distances. Ann. Rev. Stat. Appl.
6(1), 405–431 (2019)
Machine Learning and Deep Learning
and their Applications
A Novel Deep Image Matting Approach Based
on DIM Model

Guilin Yao1,2(B) and Zhiwei Ma1,2


1 Harbin University of Commerce, Harbin 150028, China
glyao@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. Digital image matting is an important research field of computer vision, and deep image matting is a new and efficient automatic matting approach. To address the loss of edge details in the decoder's feature maps in the deep image matting task, layer-skipping connections are introduced to concatenate feature maps of the same size, along the channel dimension, between the encoder and the decoder; this also fuses shallow detailed information with deep semantic information. To obtain deeper semantic information and a wider receptive field, the encoder uses a VGG19 network obtained by transfer learning, and the decoder correspondingly uses larger 9 × 9 convolutional kernels. In addition, to address slow convergence and insufficient refinement ability in the refinement network, four convolutional layers with a residual structure are added to it. Experimental results show that the improved network achieves higher accuracy and richer shape information, and the model also generalizes better.

Keywords: Deep image matting · Layer-skipping connection · VGG19 · Residual structure

1 Introduction

The automatic matting task is an important direction in the fields of image processing and computer vision. It must extract foreground objects from images accurately, and it pays particular attention to the edge details of those objects. Unlike image segmentation [1], which focuses on partitioning image regions, image matting starts from the user's rough foreground and background information and effectively extracts the edge details of foreground objects. This technology is widely used in image synthesis, film post-production, and other practical work.
Generally, in the matting work, the pixel color in an image is regarded as a linear
combination of foreground color and background color. The specific formula is:

$I_i = \alpha_i F_i + (1 - \alpha_i) B_i, \quad \alpha_i \in [0, 1]$ (1)


Here, i is the pixel index of the image, $\alpha_i$ is the foreground opacity of the image at pixel i, and $I_i$ is known data.
The matting task is equivalent to solving the above equation, that is, solving for α, which contains the foreground opacity of each pixel. However, an image and its corresponding foreground and background images each have three color channels (R, G, and B), so the above equation yields three scalar equations. These three equations contain three known quantities (the observed values on the three channels) and seven unknown quantities (the foreground and background values on the three channels, plus α). The problem therefore cannot be solved from the constraint equations alone, so users must provide a trimap marking known foreground, known background, and unknown areas as auxiliary information for matting.

2 Related Work

At present, the common methods of image matting [2] can be roughly divided into
three categories: 1) sampling-based method; 2) the method based on propagation; 3) the
method based on deep learning.
The sampling-based matting method [3] generally selects, for each unknown pixel, candidate sampling points from the known regions of the trimap, then chooses suitable foreground-background sampling pairs among the candidates, and computes the α values of the unknown pixels from these pairs. The main algorithms include Robust Matting [4], Shared Matting [5], and Global Matting [6].
In the method based on propagation, the weighted sum of α values for similar pixels
in known areas is calculated in a small local range of pixels in unknown areas, and then
the α values of pixels in unknown areas are estimated. The main algorithms include
Poisson Matting [7], Closed-Form Matting [8], and KNN Matting [9].
The matting method based on deep learning regards the matting process as a super-
vised learning process. A linear or nonlinear model between α parameter and image
color is constructed through training and learning, and then the α value of unknown
area pixels is estimated. The main algorithms include DCNN Matting [10], Deep Image
Matting [11], Semantic Human Matting [12] and Deep Automatic Portrait Matting [13],
etc. The model in this paper is based on the Deep Image Matting algorithm (as shown
in Fig. 1), and then we have improved it.

Fig. 1. Network diagram of deep image matting algorithm



The main work of this paper includes: 1) we use layer-skipping connections in the auto-encoder part of the DIM model to fuse information at different depths; 2) we use the VGG19 network as the backbone to enhance the extraction of deep semantic features; 3) in the refinement part of the model, a residual convolutional structure is adopted at the corresponding positions, which enhances the refinement ability of the network and ensures the robustness of the model.

3 Our Model and Algorithm


3.1 Dataset
In this paper, we use the synthetic dataset provided by Xu, the author of the DIM algorithm. It includes 49,300 training images and 1,000 test images. This dataset is widely used in matting, and many well-known matting algorithms, such as AlphaGAN [14], use it for training and testing. Its training images are composed of 493 foreground pictures combined with 100 background pictures from the MSCOCO dataset. Its test images are composed of 50 foreground pictures combined with 20 background pictures from the Pascal VOC dataset. The trimaps are all created by randomly dilating the ground-truth α. To enhance the robustness of the model, the synthesized dataset is also subjected to data augmentation. Specifically, the original images and their corresponding trimaps are cropped to different sizes (such as 480 × 480 and 640 × 640), resized to a uniform size (320 × 320), and randomly flipped. In addition, after each epoch of training, the data for the next epoch are randomly regenerated.
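The augmentation pipeline described above could be sketched as follows; this is a non-authoritative illustration, and the function name and the fixed crop size are our assumptions (the paper samples several crop sizes):

```python
import tensorflow as tf

# Jointly random-crop the image, trimap and ground-truth alpha, resize
# to 320x320, and randomly flip, as described in Sect. 3.1.
def augment(image, trimap, alpha, crop_size=480, out_size=320):
    stacked = tf.concat([image, trimap, alpha], axis=-1)          # (H, W, 3+1+1)
    stacked = tf.image.random_crop(stacked, (crop_size, crop_size, 5))
    stacked = tf.image.resize(stacked, (out_size, out_size))
    stacked = tf.image.random_flip_left_right(stacked)
    return stacked[..., :3], stacked[..., 3:4], stacked[..., 4:5]
```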

3.2 Model of Network


The model of our algorithm (as shown in Fig. 2) is mainly composed of two parts. The first part is the auto-encoder network, based on VGG19 with layer-skipping connections, which is responsible for the preliminary prediction of α. The second part is an 8-layer convolutional network with a residual structure, which is responsible for the final refinement of the α prediction. The input of the first part is the original image together with its trimap, and its output is the preliminary α prediction; the second part takes the original image and the preliminary α from the first part as input and outputs the final α prediction.

Fig. 2. Schematic diagram of our novel deep image matting approach



VGG19. In the first part of the model, the encoder is based on a VGG19 network pretrained on the ImageNet dataset. The original fully connected layers are removed, and in the first convolutional layer, which takes four-channel input, the parameters of the additional (trimap) channel are initialized to 0. The VGG19 network is used as the backbone of the encoder for two reasons. On the one hand, existing experimental results show that the VGG family generalizes better than other mainstream deep learning models and is usually the first choice for network adaptation; on the other hand, within the VGG family, VGG19 has stronger extraction ability than the other VGG networks and a larger receptive field on the feature maps.
In VGG19, a convolutional block composed of four convolutional layers with 3 × 3 kernels can be viewed as a single convolutional layer with a 9 × 9 kernel, so VGG19 attains a larger receptive field and obtains more accurate semantic information. Accordingly, in the decoder, we use convolutional layers with 9 × 9 kernels at the corresponding positions. The experimental results show that the improved VGG19 network extracts more detailed information from images.
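Assuming a TensorFlow/Keras setup like the one used in the experiments (Sect. 4.1), adapting pretrained VGG19 to the four-channel RGB + trimap input might look like the following sketch; the layer and variable names are ours:

```python
import numpy as np
import tensorflow as tf

# Extend pretrained VGG19 to a 4-channel (RGB + trimap) input: the trimap
# channel of the first kernel is zero-initialized, as described above.
base = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
w, b = base.get_layer("block1_conv1").get_weights()          # w: (3, 3, 3, 64)
w4 = np.concatenate([w, np.zeros_like(w[:, :, :1, :])], axis=2)

inputs = tf.keras.Input(shape=(320, 320, 4))
first = tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu")
x = first(inputs)                 # builds the layer so weights can be assigned
first.set_weights([w4, b])        # ImageNet weights reused, trimap channel = 0
```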

Layer-Skipping Connection. The traditional auto-encoder structure [15] is composed of an encoder and a decoder (as shown in Fig. 3): the encoder first increases the dimension of the input images to extract the hidden features; the decoder then reduces the dimension of the feature maps to reconstruct the images. The increase in dimension is usually realized by increasing the convolution stride or by pooling. However, these operations typically lose detailed information from the feature maps, and this information cannot be recovered by deconvolution [16] or unpooling.

Fig. 3. Traditional auto-encoder network

Therefore, our model uses layer-skipping connections in the auto-encoder part (as shown in Fig. 4). Concretely, the output of each convolutional block in the encoder is concatenated with the input of the convolutional block at the corresponding position in the decoder, and the concatenated feature maps serve as the input of the next convolutional stage in the decoder. Experimental results show that the layer-skipping connections preserve the shape of the original image and fuse shallow detailed information with high-level semantic information.
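A minimal sketch of one decoder stage with a layer-skipping connection, assuming Keras and our own function name, is:

```python
import tensorflow as tf

# The encoder feature map of matching spatial size is concatenated on the
# channel axis, then convolved with the 9x9 decoder kernel mentioned above.
def decoder_stage(x, encoder_feat, filters):
    x = tf.keras.layers.UpSampling2D(2)(x)
    x = tf.keras.layers.Concatenate(axis=-1)([x, encoder_feat])
    return tf.keras.layers.Conv2D(filters, 9, padding="same",
                                  activation="relu")(x)
```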

Fig. 4. Auto-encoder network with layer-skipping connection

Residual Structure. A shallow network, because of its limited capacity, easily suffers from insufficient feature extraction and insufficient expressive ability on large datasets. However, blindly increasing the network's depth may, on the one hand, cause network degradation [17] and gradient dispersion [18], and, on the other hand, cause over-fitting when a massive dataset is unavailable.
Therefore, in the refinement part of our model we use a residual structure (as shown in Fig. 5). It is realized as follows: seven convolutional layers with 3 × 3 kernels are used to refine α, and short-circuit (skip) paths are added around the second and fifth convolutional layers [18]; that is, the input and the output of each short-circuited convolutional block are added directly. When the expressive ability of the model is insufficient, the shortcut contributes little and the short-circuited convolutional layers add capacity to the model; when the network degrades, the shortcut paths take effect and the corresponding convolutional layers are bypassed, which accelerates convergence and reduces the risk of over-fitting.
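A short sketch of one such short-circuited (residual) convolutional block, again with our own naming, is:

```python
import tensorflow as tf

# Two stacked 3x3 convolutions whose output is added to the block input;
# x must already have `filters` channels for the addition to be valid.
def residual_block(x, filters=64):
    y = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(y)
    return tf.keras.layers.ReLU()(tf.keras.layers.Add()([x, y]))
```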

Fig. 5. Refined network of residual structure

3.3 Loss Function

The total loss function in this paper is composed of two parts: the loss function of α prediction and the loss function of α synthesis.

The loss function of α prediction represents the error between the predicted values
of α and the true values of α, the specific formula is:

$L_\alpha = \sum_i \sqrt{\left(\alpha_p^i - \alpha_g^i\right)^2 + \varepsilon^2}, \quad \alpha_p^i, \alpha_g^i \in [0, 1]$ (2)

Here, $\alpha_g^i$ is the value of pixel i on the ground-truth α, $\alpha_p^i$ is the value at the corresponding position on the predicted α, and ε is a small constant, set to $10^{-6}$. The loss function of α synthesis represents the error between the real RGB images and the predicted RGB images, where the predicted RGB images are composited from the predicted α values, the foreground images, and the background images; the specific formula is:

$L_c = \sum_i \sqrt{\left(c_p^i - c_g^i\right)^2 + \varepsilon^2}$ (3)

Here, $c_p^i$ is the value of pixel i on the predicted RGB image and $c_g^i$ is the value at the corresponding position on the real RGB image. The total loss function in this paper is the weighted sum of these two loss functions; the specific formula is:

$L = w L_\alpha + (1 - w) L_c$ (4)

Here, $L_\alpha$ is the loss function of α prediction, $L_c$ is the loss function of α synthesis, and w is a weight, set to 0.5. The total loss function is used when training the auto-encoder network; when training the refinement network and the whole network, only the loss function of α prediction is used.
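A sketch of Eqs. (2)-(4) in TensorFlow follows; the function names are ours, and the paper's ε = 10⁻⁶ and w = 0.5 are used as defaults:

```python
import tensorflow as tf

EPS = 1e-6  # the small constant epsilon of Eqs. (2)-(3)

def alpha_loss(alpha_g, alpha_p):                       # Eq. (2)
    return tf.reduce_sum(tf.sqrt(tf.square(alpha_p - alpha_g) + EPS ** 2))

def comp_loss(c_g, c_p):                                # Eq. (3)
    return tf.reduce_sum(tf.sqrt(tf.square(c_p - c_g) + EPS ** 2))

def total_loss(alpha_g, alpha_p, c_g, c_p, w=0.5):      # Eq. (4)
    return w * alpha_loss(alpha_g, alpha_p) + (1 - w) * comp_loss(c_g, c_p)
```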

4 Experiment
4.1 Experimental Environment
The experiments in this paper are performed on Windows 10, and training and testing are built on TensorFlow 2.4.1 and Keras 2.4.1. The CPU is an Intel(R) i7-10700KF, and the GPU is an NVIDIA RTX 2080 Ti. This work is supported by the Youth Innovation Talent Support Program of the Harbin University of Commerce under Grant 2020CX39.
By default, we use the dataset from Xu's paper. The batch size is set to 8, and the input images in the training dataset are sized 320 × 320. In the training stage, the Adam optimizer is used for the auto-encoder network, the refinement network, and the whole network, with learning rates of 1.0 × 10⁻⁵, 1.0 × 10⁻⁴, and 1.0 × 10⁻⁵, respectively. The number of training epochs is 20 by default, and the loss function is the mean square error. In the test stage, two common measurement indexes are used: the mean square error (MSE) and the gradient error (Gradient). They are used in many matting algorithms, such as Fast Automatic Portrait Matting Based on Multitask Deep Learning [19] and Automatic Matting Algorithm for Human Foreground [20]. Their formulas are as follows:
formulas are listed as follows:
$\mathrm{MSE}(x, y) = \sum_{i=0}^{n} \left(x_i - y_i\right)^2$ (5)

$\mathrm{Gradient}(x, y) = \sum_{i=0}^{n} \left| \nabla x_i - \nabla y_i \right|$ (6)

Here, $x_i$ is the predicted α of the image, $y_i$ is the real α, and ∇ denotes the gradient of the image.
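The two metrics can be sketched as follows; np.gradient is one possible discretization of ∇ and is our assumption, not the paper's exact implementation:

```python
import numpy as np

def mse(x, y):                                   # Eq. (5), summed as written
    return np.sum((x - y) ** 2)

def gradient_error(x, y):                        # Eq. (6)
    # np.gradient returns one array per axis; compare them axis by axis.
    return sum(np.sum(np.abs(gx - gy))
               for gx, gy in zip(np.gradient(x), np.gradient(y)))
```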

4.2 Experimental Setup

The experiments are divided into two parts. The first part compares each improved algorithm with the original DIM algorithm; the second part compares the algorithm with all improvements added against other classical matting algorithms. In the first part, we compare the results obtained after training different structures, namely with or without the VGG19 network, with or without the layer-skipping connections, and with or without the residual structure. The test dataset is consistently the Composition-1k dataset. During the comparison, the batch size, the loss function, the learning rate, and the other experimental factors are kept consistent, all at the default values of the experimental environment above. The second part compares the improved algorithm with the KNN Matting algorithm and the DCNN Matting algorithm. The size of the pictures in the test dataset is reduced to 0.3 times the original (in consideration of the matting speed of the KNN algorithm). At the same time, the other experimental factors, such as the batch size, the loss function, and the learning rate of the DCNN algorithm (which is also a deep learning algorithm), are kept consistent with our algorithm at the default values. The MSE and the Gradient are used as the performance indexes of these experiments.

4.3 Experimental Results and Analysis

The first part of the experiment compares each improved algorithm with the original DIM algorithm. In the auto-encoder part of the network (as shown in Table 1), the original algorithm uses VGG16 as the backbone; after changing to VGG19, the MSE drops noticeably, mainly because the deeper convolutional network can extract higher-dimensional features and the larger convolutional kernels capture a larger receptive field, while the Gradient shows no obvious change. After adding the layer-skipping connections, the Gradient begins to drop significantly: the layer-skipping connections fuse the shallow shape features with the deep segmentation features and keep the output information rich and complete. The original model comprises four plain convolutional layers in the refinement part of the network (as shown in Table 2); after adding the convolutional layers with the residual structure, the extracted information becomes richer and the risk of over-fitting is avoided, so both performance indexes decrease markedly. Finally, compared with the original model, the MSE decreases by 0.0020 and the Gradient by 0.0017, which fully demonstrates the effectiveness of our improvements over the original algorithm (as shown in Fig. 6 and Table 3).

Table 1. Comparison of auto-encoder networks’ performance

DIM   +VGG19   +Skip   MSE      Gradient
 ✓                     0.0259   0.0226
 ✓      ✓              0.0257   0.0226
 ✓      ✓       ✓      0.0218   0.0190

Table 2. Comparison of refined networks’ performance

DIM   +Residual   MSE      Gradient
 ✓                0.0244   0.0217
 ✓       ✓        0.0232   0.0208

Table 3. Comparison of overall networks’ performance

Methods MSE Gradient


DIM 0.0206 0.0183
Ours 0.0186 0.0166

The second part is the comparative experiment between our improved algorithm and other classical matting algorithms. Because the KNN algorithm takes a long time to compute, we reduce the image size of the Composition-1k dataset to 0.3 times the original as our test dataset. According to the test (as shown in Table 4), the KNN algorithm performs well on smaller images, but the DIM algorithm and the improved DIM algorithm perform better, with the improved DIM algorithm performing best, which further shows the feasibility and effectiveness of these improvements.
The α results of several classical automatic matting methods on Composition-1k (reduced to 0.3 times) are shown in Fig. 7. The KNN matting method obtains a good α but does not perform well on details such as human hair and animal fur. The DCNN matting method does not obtain an ideal result, and a large misjudged area appears on the output image, which may be related to the insufficient dataset and the small size of the test images. The DIM algorithm achieves a better α, but some details are still missing. Our improved DIM algorithm obtains the most satisfactory output, especially in detail processing: it preserves rich shape information, such as the corners of clothes and human hair, and the nose and fur of the animal, all of which are visible in our α. The experimental results show that our improved algorithm obtains more accurate α predictions, and the generated α is of higher quality.

(Figure panels, left to right: Image, Trimap, DIM, Ours, GT α)
Fig. 6. Experimental results of DIM and Ours

Table 4. Comparison of classical matting algorithms

Methods MSE Gradient


DCNN 0.2589 0.1883
KNN 0.0977 0.0849
DIM 0.0743 0.0603
Ours 0.0692 0.0562

(Figure panels, left to right: Image, Trimap, KNN, DCNN, DIM, Ours, GT α)
Fig. 7. Experimental results of automatic matting

5 Conclusion
By analyzing the existing problems of the deep image matting network, this paper proposes using the VGG19 network as the backbone of the encoder to enhance feature extraction; the auto-encoder uses layer-skipping connections to obtain more detailed and richer shape information; and in the refinement network, a deeper convolutional network with a residual structure improves the refinement ability and the convergence speed. Experiments show that these improvements significantly improve the accuracy on the synthetic dataset and enhance the ability of the generated α to capture details.
Although the improved model shows clear improvement in α prediction and in extracting shape information, it still has shortcomings in real-time performance and portability, mainly because the model relies heavily on hardware acceleration and the amount of data is relatively insufficient; these are also common problems that urgently need to be solved in deep learning and matting. Our

future research will focus on reducing the parameters [21] of the model without affecting overall performance and on obtaining more matting datasets with many unique foreground pictures.

Acknowledgments. This work is supported by Heilongjiang Provincial Natural Science Foundation of China (No. LH2021F034), and the Youth Innovation Talent Support Program of Harbin University of Commerce (No. 2020CX39).

References
1. Huang, P., Zheng, Q., Liang, C.: Overview of image segmentation methods. J. Wuhan Univ.
(Sci. Edn.) 66(06), 519–531 (2020)
2. Liang, Y., Huang, H., Cai, Z., Hao, Z., Feng, F.: Summary of natural image matting technology.
Comput. Appl. Res. 38(05), 1294–1301 (2021)
3. Tang, J., Aksoy, Y., Oztireli, C., et al.: Learning-based sampling for natural image matting.
In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3055–3063 (2019)
4. Wang, J., Cohen, M.: Optimized color sampling for robust matting. In: IEEE Conference on
Computer Vision & Pattern Recognition, pp. 1–8. IEEE Computer Society (2007)
5. Gastal, E., Oliveira, M.: Shared sampling for real-time alpha matting. Comput. Graph. Forum
29(2), 575–584 (2010)
6. He, K., Rhemann, C., Rother, C., et al.: A global sampling method for alpha matting. In: IEEE
Conference on Computer Vision & Pattern Recognition, pp. 2049–2056. IEEE (2011)
7. Sun, J., Jia, J., Tang, C., et al.: Poisson matting. ACM Trans. Graph. 23(3), 315–321 (2004)
8. Chen, Q., Li, D., Tang, C.: KNN matting. IEEE Trans. Pattern Anal. Mach. Intell. 35(9),
2175–2188 (2013)
9. Levin, A.: A closed form solution to natural image matting. IEEE Comput. Soc. 30(2), 228–
242 (2008)
10. Cho, D., Tai, Y.-W., Kweon, I.: Natural image matting using deep convolutional neural net-
works. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906,
pp. 626–643. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_39
11. Xu, N., Price, B., Cohen, S., Huang, T.: Deep image matting. In: IEEE Conference on
Computer Vision and Pattern Recognition, pp. 2970–2979 (2017)
12. Chen, Q., Ge, T., Xu, Y., Zhang, Z., Yang, X., Gai, K.: Semantic human matting. In:
Proceedings of the 26th ACM international conference on Multimedia (2018)
13. Shen, X., Tao, X., Gao, H., Zhou, C., Jia, J.: Deep automatic portrait matting. In: Leibe, B.,
Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 92–107. Springer,
Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_6
14. Lutz, S., Amplianitis, K., Smolic, A.: AlphaGAN: generative adversarial networks for natural
image matting. In: British Machine Vision Conference (2018)
15. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder
architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
16. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation.
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431–3440
(2015)
17. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reduc-
ing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456
(2015)
440 G. Yao and Z. Ma

18. He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. In: IEEE
Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
19. Xu, Z., Yang, Y.: Fast automatic portrait matting based on multitask deep learning. J. Wuhan
Univ. (Eng. Edn.) 53(08), 740–745+752 (2020)
20. Ran, Q., Feng, J.: Automatic matting algorithm for human foreground. J. Comput. Aided Des.
Graph. 32(02), 277–286 (2020)
21. Wang, R., Xu, S., Huang, J.: Image matting technology based on deep learning. J. Shanghai
Univ. (Nat. Sci. Edn.) 41(05), 1–9 (2021)
Application of Bloch Spherical Quantum
Genetic Algorithm in Fire-Fighting Route
Planning

Wei Zhao1,2 , Xuena Han1(B) , Hui Li1,2 , Zeming Li1 , and Xue Tan1
1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. Aiming at the fire-fighting path planning problem, the Bloch Quantum Genetic Algorithm (BQGA) is studied and applied. In this algorithm, Bloch spherical coordinates are first used to encode the quantum chromosomes; an update strategy for the quantum chromosomes is then constructed, the rotation angle size and direction of the quantum rotation gate are established, and finally the phase formula of the mutation operation is constructed. In this article, three-dimensional (3D) point data obtained from the floors of a building are converted into a 3D raster map. Taking the height differences of different areas into account, the algorithm generates a feasible shortest path that traces a clear movement trajectory in the raster map, ensuring people's safety during evacuation. The experimental results show that the algorithm needs no prior knowledge of the map, has strong reliability and practicability, and achieves few evolutionary generations, fast convergence, and high optimization efficiency.

Keywords: Bloch sphere · Quantum genetic algorithm · Fire-fighting path planning · Quantum computing

1 Introduction

The quantum genetic algorithm combines the advantages of quantum computing and genetic algorithms. It is a brand-new evolutionary algorithm: based on quantum principles, it uses qubit encoding and quantum gates to update the population in search of the global optimum. Compared with traditional evolutionary algorithms, quantum evolutionary algorithms have a small population size, fast computation, and strong global optimization ability. The quantum genetic algorithm therefore has great advantages and strong vitality, and it has high theoretical value and promising application prospects [1].
Lewandowski et al. [14] used the Bloch sphere to describe a behavioral modeling system in quantum reversible logic design. Since then, Bloch spherical coordinates have often been used in quantum logic operations and quantum algorithms [1]. A quantum genetic algorithm based on Bloch spherical coordinates was first proposed in China by Li Panchi

et al. [1]. The algorithm directly uses the Bloch spherical coordinates of qubits to encode quantum chromosomes and uses quantum rotation gates to update the qubits. A simple and practical method is proposed to determine the direction of the rotation angle, and a new mutation operator based on the qubit Bloch spherical coordinates is proposed [1]. At the same time, the three Bloch spherical coordinates of each qubit are all regarded as gene positions, so each chromosome has three gene chains. Produced under this guiding ideology for solving complex nonlinear problems, the quantum genetic algorithm inherits the advantages of the genetic algorithm itself and remedies some of its original shortcomings, such as an overly large population and an overly long convergence time, thereby achieving better optimization effects than traditional algorithms.
The quantum genetic algorithm has been applied in different disciplines to solve a variety of practical problems. Practice has shown that its results are often much better than those of traditional calculation methods, so it is widely used in different optimization problems [1]. Simulation results for the extreme-value optimization of the fire path planning function show that the optimization performance of BQGA is better than that of the ordinary ant colony algorithm [2].
This article focuses on fire-fighting path planning for large-space building fires. Building on the existing literature, it adopts an algorithm based on Bloch spherical coordinates. A quantum chromosome update strategy is constructed; the population size can be adjusted freely, the convergence speed is fast, and the algorithm has strong global optimization ability and rich population diversity and randomness [2].

2 Bloch Quantum Genetic Algorithm

2.1 Triple-Stranded Gene Coding Scheme of Quantum Chromosome

On the Bloch sphere, a point p can be determined by two angles θ and ϕ, as shown in
the figure below:

Fig. 1. Bloch spherical representation of qubits



As shown in Fig. 1, on the 3D Bloch sphere, any qubit corresponds to a point on the sphere. The qubit can then be expressed in Bloch spherical coordinates as:

$|\varphi\rangle = [\cos\varphi\sin\theta, \ \sin\varphi\sin\theta, \ \cos\theta]^T$ (1)

In BQGA, the Bloch spherical coordinate coding of qubits is used directly. Let $P_i$ be the i-th chromosome in the population; the BQGA coding method is:

$P_i = \begin{bmatrix} \cos\varphi_{i1}\sin\theta_{i1} \\ \sin\varphi_{i1}\sin\theta_{i1} \\ \cos\theta_{i1} \end{bmatrix} \cdots \begin{bmatrix} \cos\varphi_{in}\sin\theta_{in} \\ \sin\varphi_{in}\sin\theta_{in} \\ \cos\theta_{in} \end{bmatrix}$ (2)

where $0 \le \theta \le \pi$, $0 \le \varphi \le 2\pi$.
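A small sketch of this triple-chain encoding (the function name is ours):

```python
import numpy as np

# Map each qubit's angles (theta in [0, pi], phi in [0, 2*pi]) to its three
# Bloch coordinates, Eq. (2): a chromosome of n qubits carries three gene
# chains of length n.
def encode_chromosome(theta, phi):
    return np.stack([np.cos(phi) * np.sin(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(theta)])              # shape (3, n)
```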

2.2 Solution Space Transformation


Each chromosome in the population contains the 3n Bloch spherical coordinates of its n qubits. Using a linear transformation, these 3n coordinates are mapped into the solution space of the optimization problem, each coordinate corresponding to one optimization variable. For the j-th qubit on the i-th chromosome $p_i$, the solution space transformation is:

$X_{ix}^j = \frac{1}{2}\left[b_j\left(1 + x_{ij}\right) + a_j\left(1 - x_{ij}\right)\right]$
$X_{iy}^j = \frac{1}{2}\left[b_j\left(1 + y_{ij}\right) + a_j\left(1 - y_{ij}\right)\right]$
$X_{iz}^j = \frac{1}{2}\left[b_j\left(1 + z_{ij}\right) + a_j\left(1 - z_{ij}\right)\right]$ (3)

where $i = 1, 2, \cdots, m$ and $j = 1, 2, \cdots, n$; m is the population size, n is the number of qubits, and $[a_j, b_j]$ is the range of the j-th optimization variable.
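Equation (3) amounts to the following one-line mapping (our naming; a and b are the variable bounds):

```python
import numpy as np

# A Bloch coordinate in [-1, 1] is mapped linearly into [a, b]:
# coord = -1 gives a, coord = +1 gives b.
def to_solution_space(coord, a, b):
    return 0.5 * (b * (1.0 + coord) + a * (1.0 - coord))
```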

2.3 Quantum Chromosome Update


In BQGA, the renewal of the quantum chromosome is generally achieved by changing
the quantum phase through the revolving gate. The purpose of quantum phase bit rotation
is to use each chromosome in the current population to approximate the contemporary
optimal chromosome. In this approach, it is possible to produce better contemporary
optimal chromosomes so that the population continues to evolve [4]. Therefore, a new
quantum revolving door is proposed, the form of which is:
⎡ ⎤
cos ϕ cos θ − sin ϕ cos θ sin θ cos(ϕ + ϕ)
U = ⎣ sin ϕ cos θ cos ϕ cos θ sin θ sin(ϕ + ϕ) ⎦ (4)
ϕ 
− sin θ − tan 2 sin θ cos θ
Regarding the direction of the rotation angle, the literature [1] gives the corresponding rules, which can be judged from algebraic operations on the three qubit coordinates of the current chromosome and the optimal chromosome.
444 W. Zhao et al.

2.4 Variation of Quantum Chromosomes


The purpose of the mutation operation is to maintain population diversity and break through premature convergence. In the BQGA algorithm, a quantum rotation gate is used to operate on θ and ϕ. According to the mutation probability $P_m$, several qubits of a chromosome are selected at random, mutation operations are performed on them, and the selection formula for the mutation angles is constructed as

$\Delta\varphi = 0.5\pi - 2\varphi, \qquad \Delta\theta = 0.5\pi - 2\theta$ (5)

This increases the diversity of the population and helps avoid premature convergence.

2.5 Selection of the Best Chromosome


The choice of the fitness function directly affects the convergence speed of the quantum genetic algorithm and its ability to find the optimal solution, because the genetic algorithm uses no external information during the evolutionary search and relies only on the fitness of each individual in the population [5]. Since evaluating the fitness function is the main component of the computational complexity of the quantum genetic algorithm, the fitness function should be designed as simply as possible to minimize the time complexity. By establishing a mapping between the objective function of the optimization problem and the fitness of an individual, the objective function can be optimized in the course of population evolution [6].
Each chromosome has three gene chains, corresponding to three objective function values, and the best gene chain should represent the chromosome. The fitness function is:

$\mathrm{Fit}(i) = \sqrt{\left(B_1(1) - B_2(1)\right)^2 + \left(B_1(2) - B_2(2)\right)^2 + \left(B_1(3) - B_2(3)\right)^2}$ (6)

The chromosome with the smallest fitness value in the population is selected as the contemporary optimal chromosome, and among its three gene chains the one with the smallest objective function value is the contemporary optimal chain. Here $B_1$ and $B_2$ denote the positions of two points, and $B(k)$, k = 1, 2, 3, their coordinates. Following the shortest-distance formula for three-dimensional space, the shortest distance serves as the fitness function.

3 Algorithm Flow
In the idealized algorithm, a person is regarded as a point with no size. The grid method [7] is used to simplify a building as a whole into an equal-sized 21 × 21 grid, and a value representing the height is assigned to each grid cell; BQGA is then used for the simulation. The procedure is as follows:

Step 1 Initialize the population. Set the population size to popsize, the number of iterations to maxgen, the rotation angle step to shiftstep, and the mutation probability to Pm.

Step 2 Solution space transformation. For the specific optimization problem, use Eq. (3) to transform into the solution space, calculate the fitness of each chromosome according to Eq. (6), and take the optimal chromosome and optimal gene chain as the historical optimum.
Step 3 Chromosome update and mutation. Use the quantum rotation gate of Eq. (4) to update the chromosomes and obtain a new population.
Step 4 Selection by the fitness function. If Fit(i) < Fit(i − 1), update the current optimal solution; otherwise, keep the previous solution unchanged.
Step 5 Iterate. Keep the optimal solution of the algorithm and judge whether the number of iterations has reached the set value; if so, the iteration stops, otherwise go to Step 3. (A code skeleton of this loop is sketched below.)
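A hypothetical Python skeleton of Steps 1-5 follows; the rotation step is a simple placeholder rather than the full rotation gate of Eq. (4), and all names are ours:

```python
import numpy as np

def bqga(fitness, a, b, popsize=100, maxgen=100, pm=0.05,
         shiftstep=0.01 * np.pi):
    n = len(a)
    theta = np.random.uniform(0, np.pi, (popsize, n))        # Step 1
    phi = np.random.uniform(0, 2 * np.pi, (popsize, n))
    best_x, best_f = None, np.inf
    for _ in range(maxgen):                                  # Step 5
        chains = np.stack([np.cos(phi) * np.sin(theta),      # Step 2, Eq. (2)
                           np.sin(phi) * np.sin(theta),
                           np.cos(theta)])                   # (3, pop, n)
        xs = 0.5 * (b * (1 + chains) + a * (1 - chains))     # Eq. (3)
        for chain in xs:                                     # three gene chains
            for x in chain:
                f = fitness(x)
                if f < best_f:                               # Step 4
                    best_f, best_x = f, x.copy()
        # Step 3: rotate the angles (placeholder step in place of Eq. (4)),
        # then mutate randomly chosen qubits according to Eq. (5).
        theta += shiftstep * np.sign(np.random.randn(popsize, n))
        mask = np.random.rand(popsize, n) < pm
        phi[mask] += 0.5 * np.pi - 2 * phi[mask]
        theta[mask] += 0.5 * np.pi - 2 * theta[mask]
        theta = np.clip(theta, 0.0, np.pi)                   # keep theta valid
    return best_x, best_f
```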

The algorithm flow chart is shown in Fig. 2.

Fig. 2. Algorithm flow chart



4 Simulation Analysis of Fire-Fighting Path Planning Examples


In order to verify the effectiveness of the Bloch spherical quantum genetic algorithm, this section simulates the 3D fire-fighting path planning problem [10]. The raster method converts the 3D point data obtained from the building floors into a 21 × 21 3D raster map. Each grid cell represents an evacuation point; all cells are unit squares of the same size, and each cell is assigned a value equal to the height of that point, so that the height differences of different areas are taken into account and a grid map is formed. It is assumed that the starting point and the endpoint are the same in every run. With 100 evolutionary generations, a population of 100, and 19 evacuation nodes, we verify the effectiveness of the algorithm.
The results are shown in Figs. 3 and 4. Figure 3 shows the planned fire-fighting route; both the horizontal and vertical coordinates indicate distance, in km. The red dashed line is the best route found by the ACA search, with a running time of 1.362073 s, while the solid black line is the best route found by the BQGA search, taking 1.26061 s. Figure 4 is the 3D path planning space diagram.

Fig. 3. Fire route planning route

Figures 5 and 6 show the trend of the best individual fitness under the two algorithms; the abscissa is the number of iterations and the ordinate the corresponding fitness value. Figure 5 shows the best individual fitness of the BQGA algorithm, which drops from about 10000 initially to nearly 8100, a decrease of about 1900. Figure 6 shows the best individual fitness of the ACA algorithm, which drops from 137 initially to nearly 115, a decrease of about 22. By comparison, the optimization efficiency of the BQGA algorithm is clearly higher than that of ACA.

Fig. 4. Three-dimensional path planning space

Fig. 5. The change trend of the best individual fitness of the BQGA model

Fig. 6. The changing trend of the best individual fitness of the ACA model

5 Conclusion
This paper uses a qubit-based Bloch spherical quantum genetic algorithm to solve the three-dimensional fire path planning problem. Its coding method has three advantages. First, it avoids the randomness introduced by measuring qubits to produce binary codes. Second, it avoids frequent decoding of binary numbers. Third, it expands the number of globally optimal solutions and increases the probability of obtaining them. The method uses grid division to process the 3D building environment and then computes the shortest path traveled by people. A typical test example shows that the algorithm can always achieve a smooth and safe path in a randomly distributed working environment with various shapes, while performing well in convergence speed, iteration steps, and execution time.

Acknowledgments. This work is supported by the Natural Science Foundation of Heilongjiang Province of China (No. LH2020F007).

References
1. Li, S., Li, P.: Quantum Computation and Quantum Optimization Algorithms. Harbin Institute
of Technology Press, Harbin (2009)
2. Wang, J., Wang, H., Feng, L.: Building fire evacuation path planning based on quantum ant
colony algorithm. Comput. Meas. Control 28(07), 167–172 (2020)
3. Li, J., Han, K., Bao, T.: An improved Bloch spherical quantum genetic algorithm and its
application. J. Railway Sci. Eng. 13(11), 2262–2269 (2016)

4. Yan, F., Iliyasu, A.M., Liu, Z.-T., Salama, A.S., Dong, F., Hirota, K.: Bloch sphere-based
representation for quantum emotion space. J. Adv. Comput. Intel. Intel. Inf. 19(1), 134–142
(2015)
5. Yang, S., Xu, Y., Li, P.: Quantum-inspired artificial fish swarm algorithm based on the bloch
sphere search algorithm. Inf. Control 43(06), 647–653 (2014)
6. Chen, Y., Li, A., Huang, Y.: Quantum particle swarm optimization based on bloch coordinates
of qubits. J. Comput. Appl. 33(02), 316–318+322 (2013)
7. Yi, Z., He, R., Hou, K.: Quantum artificial bee colony optimization algorithm based on bloch
coordinates of quantum bit. J. Comput. Appl. 32(07), 1935–1938 (2012)
8. Yi, Z., Hou, K., He, R.: Adaptive quantum genetic algorithm based on bloch sphere. Comput.
Eng. Appl. 48(35), 57–61 (2012)
9. Li, P.: Quantum computation with applications to intelligent optimization and control. Harbin
Institute of Technology (2009)
10. Jafarizadeh, M.A., Karimi, N., Zahir, H.: Quantum discord for generalized bloch sphere states.
Eur. Phys. J. D 68(5), 1–9 (2014)
11. Yang, L., Zhang, A., Zhang, J., Song, J.: Real-time path planning of velocity potential for
robot in grid map environment. Comput. Eng. Appl., 1–8 (2021). http://kns.cnki.net/kcms/
detail/11.2127.tp.20210514.1007.008.html
12. Guo, L., Xiao, S., Wang, Z.: Geometric phase of two-level mixed state and bloch sphere
structure. Int. J. Theor. Phys. 52(9), 3132–3140 (2013)
13. Bartkiewicz, K., Miranowicz, A.: Optimal cloning of arbitrary mirror-symmetric distributions
on the bloch sphere: a proposal for practical photonic realization. Phys. Scr. T147, 014003
(2012)
14. Lewandowski, M., Ranganathan, N., Morrison, M.: Behavioral model of integrated qubit gates
for quantum reversible logic design. In: IEEE Computer Society International Symposium on
VLSI (2013). https://doi.org/10.1109/ISVLSI.2013.6654658
Chaotic Neural Network with Legendre
Function Self-feedback and Applications

Yu Zhang1,3 , SiFan Wei1 , Yu Wang1 , and Yaoqun Xu1,2,3,4(B)


1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, China
xuyq@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,
Harbin 150028, China
3 Heilongjiang Cultural Big Data Theory Application Research Center, Harbin 150028, China
4 Institute of System Engineering, Harbin University of Commerce, Harbin 150028, China

Abstract. A new transient chaotic neuron model that introduces the Legendre function into the self-feedback term is constructed. The dynamic characteristics of the single neuron are analyzed through time evolution diagrams and the inverted bifurcation diagram of the largest Lyapunov exponent. The network parameters are set for solving combinatorial optimization problems. The effectiveness of the model is verified by simulation results on nonlinear function optimization and the traveling salesman problem (TSP).

Keywords: Chaotic neural network · Legendre function self-feedback · Lyapunov exponents · Combinatorial optimization problem

1 Introduction
Transiently chaotic neural networks (TCNN) have been extensively researched [1–5] and have solved the TSP successfully and efficiently thanks to their complex nonlinear dynamical behavior. Chen and Aihara proposed a chaotic neural network with chaotic simulated annealing (CSA) [1] by introducing a linear self-feedback term and decreasing its weight exponentially, which makes the chaotic search transient and ensures that the network eventually converges to a stable point. Different self-feedback functions generate different chaotic traversal searches and different chaotic dynamics: Xu Nan, Xu Yaoqun, Sun Ming, and Ye Yonggang introduced self-feedback terms built from trigonometric functions, nonlinear wavelet functions, and Bessel functions into their models [6–10], which is why those models have more complex chaotic search capabilities.
In this study, we put forward a chaotic neural network with Legendre function self-feedback; unlike the linear self-feedback term of Chen and Aihara's network, this term is nonlinear. With suitable parameters, the presented chaotic neural network has a stronger ability to escape local minima and better optimization performance. These properties are shown by simulations on two problems, continuous function optimization and the 10-city TSP, a complicated and classical practical problem [11].


2 Chaotic Neural Network with Legendre Function Self-feedback


2.1 Chaotic Neural Network Model
The Legendre function, a nonlinear basis function, is introduced into the neural network as a particular self-feedback term. The new chaotic neural network model is described as follows:

$x_i(t) = 1/\left(1 + \exp\left(-y_i(t)/\varepsilon\right)\right)$ (1)

$y_i(t + 1) = k y_i(t) + \alpha\left[\sum_{j=1, j \ne i}^{n} w_{ij} x_j + I_i\right] - z_i(t)\, g\left(x_i(t) - I_0\right)$ (2)

$g(x) = \sum_{i=1}^{n} \lambda_i P_{2i+1}(x)$ (3)

$z_i(t + 1) = (1 - \beta) z_i(t)$ (4)

$P_n(x) = \frac{1}{2^n \cdot n!} \cdot \frac{d^n}{dx^n}\left(x^2 - 1\right)^n, \quad n = 0, 1, 2, \cdots$ (5)

The term $z_i(t)\,g(x_i(t) - I_0)$ is the nonlinear self-feedback term, and $z_i(t)$ corresponds to the temperature in the usual stochastic annealing process. The chaotic mechanism enters the network through this term: the self-feedback connection weight decreases step by step as $z_i(t)$ decays, and the resulting transient chaotic behavior performs a global search. Equation (4) is an exponential cooling schedule that anneals the network. The meanings of the symbols are given in Table 1.

Table 1. Symbol meaning

Symbol Meaning
xi The output of neuron i
yi The inside of the state of neuron i
zi The self-feedback connection weight
Ii Input bias of neuron i
λi The parameter of the nonlinear function (0 ≤ λi ≤ 1)
g(x) The nonlinear function
wij The link weight, wij = wji
β The damping factor
ε Steepness parameter of the activation function (ε > 0)
k Damping factor of nerve membrane (0 < k < 1)
Pn Legendre function
I0 Positive parameter

2.2 Chaotic Dynamics of the Single Neuron


The characteristics of the chaotic neural network are illustrated by analyzing a single neural unit. The single neuron model is described as follows:

$x(t) = 1/\left(1 + \exp\left(-y(t)/\varepsilon\right)\right)$ (6)

$y(t + 1) = k y(t) - z(t)\, g\left(x(t) - I_0\right)$ (7)

$g(x) = \sum_{i=1}^{n} \lambda_i P_{2i+1}(x)$ (8)

$z(t + 1) = (1 - \beta) z(t)$ (9)

Next, the transient chaotic dynamics of the neuron are described by the inverted bifurcation diagram and the time evolution diagram of the maximum Lyapunov exponent. Because chaotic motion is closely tied to initial conditions, the orbits generated from two similar initial values separate exponentially as time goes on, and the Lyapunov exponent quantifies the strength of this phenomenon. The calculation formula of the Lyapunov exponent is as follows:
exponent is as follows:
$\lambda_L = \lim_{n \to \infty} \frac{1}{n} \sum_{t=0}^{n-1} \ln\left| \frac{dy(t+1)}{dy(t)} \right|$ (10)

For the model with i = 1 and 2, i.e. $g(x) = \lambda_1 P_3(x) + \lambda_2 P_5(x)$, the chaotic neuron model yields the following equations:

$\frac{dy(t+1)}{dy(t)} = k - z(t) \cdot \frac{dg(x)}{dx} \cdot \frac{dx}{dy(t)}$ (11)

$\frac{dx}{dy(t)} = \frac{dx(t)}{dy(t)} = \frac{1}{\varepsilon} x(t)\left(1 - x(t)\right)$ (12)

$\frac{dg(x)}{dx} = \frac{\lambda_1}{2}\left(15x^2 - 3\right) + \frac{\lambda_2}{8}\left(315x^4 - 210x^2 + 15\right)$ (13)
The Lyapunov exponent reflects the chaotic power of the model and hence the strength of the network's global optimization ability. In a one-dimensional mapping, $\lambda_L > 0$ indicates that the model is in a chaotic state, and the larger the value, the stronger the chaos; $\lambda_L = 0$ means the model is at a stability boundary; $\lambda_L < 0$ indicates stable motion without chaos.
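To illustrate Eqs. (6)-(13), a minimal Python simulation of the single neuron with the paper's parameter values is sketched below; the accumulated logarithm estimates the Lyapunov exponent of Eq. (10):

```python
import numpy as np

eps, k, I0, beta = 0.018, 0.1, 0.85, 0.001
lam1, lam2 = 1 / 3, 1 / 100

P3 = lambda x: (5 * x**3 - 3 * x) / 2                    # Legendre P3
P5 = lambda x: (63 * x**5 - 70 * x**3 + 15 * x) / 8      # Legendre P5
g = lambda x: lam1 * P3(x) + lam2 * P5(x)                # Eq. (8)
dg = lambda x: (lam1 * (15 * x**2 - 3) / 2
                + lam2 * (315 * x**4 - 210 * x**2 + 15) / 8)  # Eq. (13)

y, z, acc, N = 0.1, 0.98, 0.0, 3000
for _ in range(N):
    x = 1.0 / (1.0 + np.exp(-y / eps))                   # Eq. (6)
    dxdy = x * (1 - x) / eps                             # Eq. (12)
    acc += np.log(abs(k - z * dg(x - I0) * dxdy) + 1e-300)  # Eqs. (10)-(11)
    y = k * y - z * g(x - I0)                            # Eq. (7)
    z *= 1 - beta                                        # Eq. (9)
print("estimated Lyapunov exponent:", acc / N)
```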
The self-feedback function in the model is a combination of the Legendre functions with $\lambda_1 = 1/3$ (n = 3) and $\lambda_2 = 1/100$ (n = 5), as shown in Fig. 1.

Fig. 1. The self-feedback function

We then ran simulations in Matlab with the following parameters: ε = 0.018, λ1 = 1/3, λ2 = 1/100, y(1) = 0.1, z(1) = 0.98, k = 0.1, I0 = 0.85, β = 0.001. The time evolution diagram and the inverted bifurcation diagram of the maximum Lyapunov exponent of the neuron are shown in Fig. 2.

Fig. 2. State bifurcation figure and the maximal Lyapunov exponents of the single neuron when
β = 0.001, λ1 = 1/3 and λ2 = 1/100

When ε = 0.018, y(1) = 0.1, z(1) = 0.98, k = 0.1, I0 = 0.85, β = 0.001,


λ1 = 1/3, λ2 = 5/100, the time evolution diagram and the inverted bifurcation diagram
of the maximum Lyapunov exponent of the neuron are shown in Fig. 3.
The time evolution diagram and the inverted bifurcation diagram of the maximum
Lyapunov exponent of the neuron, when the selected parameters:
ε = 0.018, y(1) = 0.1, z(1) = 0.98, k = 0.1, I0 = 0.85, β = 0.001, λ1 = 1/3,
λ2 = 1/10, are shown in Fig. 4.
From the above state bifurcation figures, it can be seen that this neuron model exhibits transient chaotic dynamics: with λ1 fixed, the value of λ2 affects the chaotic state of the network, and the larger it is, the more easily the network exits the chaotic state.

Fig. 3. State bifurcation figure and the maximal Lyapunov exponents of the single neuron when
β = 0.001, λ2 = 5/100 and λ1 = 1/3

Fig. 4. State bifurcation figure and the maximal Lyapunov exponents of the single neuron when
β = 0.001, λ2 = 1/10 and λ1 = 1/3

As the value of z(t) decreases, the neurons exhibit transient chaotic dynamics, and the inverted bifurcation converges to a stable equilibrium state.
Once the stable equilibrium state begins, the dynamic behavior of the single neural unit is governed by the gradient descent method [12]. The length of the inverted bifurcation [13] is affected by the simulated annealing parameter β: the smaller β is, the longer the inverted bifurcation lasts. In this regime the single neural unit behaves like the Hopfield network, and the network tends toward a stable equilibrium point as its final state.

3 Application

Currently, chaotic neural networks are used in the following areas: continuous function
optimization problems, TSP problems, image encryption and communication. In this
paper, we focus on two aspects of continuous function optimization and the 10-city TSP
problem.

3.1 Continuous Function Optimization Problems

We study the function [14] described as follows:
$f(x_1, x_2) = (x_1 - 0.7)^2\left[(x_2 + 0.6)^2 + 0.1\right] + (x_2 - 0.5)^2\left[(x_1 + 0.4)^2 + 0.15\right]$ (14)

The minimum value of the function is 0, attained at the point (0.7, 0.5); local minima lie at (0.6, 0.4) and (0.6, 0.5). We set the relevant parameters as follows: ε = 0.018, k = 1, y1(1) = y2(1) = 0.1, z1(1) = z2(1) = 0.98, β = 0.001, α = 0.05, I0 = 0.85, λ1 = 1/3, λ2 = 1/100.
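As a quick numerical check of Eq. (14) (reading the second bracket as (x1 + 0.4)^2, which makes (0.7, 0.5) the global minimum):

```python
f = lambda x1, x2: ((x1 - 0.7) ** 2 * ((x2 + 0.6) ** 2 + 0.1)
                    + (x2 - 0.5) ** 2 * ((x1 + 0.4) ** 2 + 0.15))
print(f(0.7, 0.5))   # 0.0 at the global minimum
print(f(0.6, 0.4))   # 0.0225, a small positive value at a reported local minimum
```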
The transiently chaotic neural network model with Legendre function self-feedback is then applied to this optimization function. The time evolution of the energy function and the optimal solution are shown in Fig. 5.

Fig. 5. Time evolution figure of energy function and the optimal solution of X1 X2

Setting λ2 = 3/100 with the other parameters unchanged, we study the influence of λ2 on the network; the time evolution of the energy function and the optimal solution is shown in Fig. 6.

Fig. 6. Time evolution figure of energy function and the optimal solution of X1 X2

The above simulation experiments show that as the parameter λ2 increases, the neurons exit the chaotic search sooner; λ2 thus shapes the chaotic search process of the neuron, and by choosing a suitable λ2 the optimization ability of the network can be improved.

3.2 Application to 10-City TSP


The proposed neural network with nonlinear Legendre function self-feedback is applied to the TSP [15], a typical NP-hard problem in combinatorial optimization that is difficult to solve in practical engineering, to demonstrate its performance. Finding effective ways to solve this problem has been a goal of many researchers over the years. The TSP asks the following question: given a set of cities and the distances between every pair of them, what is the shortest route that visits each city exactly once?
$E = \frac{A}{2}\sum_{i=1}^{n}\left(\sum_{j=1}^{n} x_{ij} - 1\right)^2 + \frac{B}{2}\sum_{j=1}^{n}\left(\sum_{i=1}^{n} x_{ij} - 1\right)^2 + \frac{D}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\left(x_{k,j+1} + x_{k,j-1}\right) x_{ij} d_{ik}$ (15)

Here $x_{ij} = 1$ means city i is visited in order j, parameter A = B, and $d_{ik}$ is the distance between city i and city k; the shortest valid path corresponds to the global minimum of the energy. (A minimal sketch of this energy computation follows Table 2.) We select the normalized coordinates of 10 cities, as listed in Table 2.

Table 2. City coordinates

i 1 2 3 4 5
X (i) 0.4 0.2439 0.1707 0.2293 0.5171
Y (i) 0.4439 0.1463 0.2293 0.716 0.9414
i 6 7 8 9 10
X (i) 0.8732 0.6878 0.8488 0.6683 0.6195
Y (i) 0.6536 0.5219 0.3609 0.2536 0.2634
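As promised above, a minimal sketch of the energy of Eq. (15); the function name is ours, and the step index is treated cyclically:

```python
import numpy as np

# x is an (n, n) 0/1 matrix with x[i, j] = 1 when city i is visited at step j;
# d is the distance matrix between cities.
def tsp_energy(x, d, A=1.0, B=1.0, D=3.6):
    rows = A / 2 * np.sum((x.sum(axis=1) - 1) ** 2)       # each city once
    cols = B / 2 * np.sum((x.sum(axis=0) - 1) ** 2)       # each step one city
    nxt = np.roll(x, -1, axis=1) + np.roll(x, 1, axis=1)  # x[k,j+1] + x[k,j-1]
    tour = D / 2 * np.einsum('ij,kj,ik->', x, nxt, d)     # path length term
    return rows + cols + tour
```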

In the literature, the shortest distance for the 10 cities is 2.6776. Figure 7 shows the optimal path of the 10-city tour.

Fig. 7. The optimal path of 10-city

The settings of the parameters and energy coefficients significantly affect the validity of the neural network.

We initialize the parameters as follows: A = B = 1, D = 3.6, k = 0.9, I0 = 0.5, β = 0.002, z1(1) = 0.8, ε = 0.04, λ1 = 1/3, α = 0.1, λ2 = 1/100. The program is run 5 times in total, with 200 different initial conditions generated at random in the interval [−0.1, 0.1] each time. The results are collected in Table 3.

Table 3. Results of model solving TSP problem in 10 cities

Legal path   Optimal path   Legal ratio (%)   Optimal ratio (%)
194 188 97.0 94.0
198 196 99.0 98.0
195 194 97.5 97.0
194 191 97.0 95.5
197 193 98.5 96.5

As can be seen from Table 3, when the network solves the 10-city TSP with the above parameters, the highest proportion of legal paths is 99%, the average proportion of legal paths is 97.9%, the average proportion of optimal paths is 96.2%, and the maximum success rate of the optimal solution is 98%. On average, the difference between the optimal ratio and the legal ratio is small, indicating the validity of the model. Table 3 is obtained by varying the self-feedback function with suitable parameters, and it can be seen that the nature of the self-feedback function influences the optimal and legal ratios to a certain degree.
In Chen's chaotic neural network, the parameters are set as follows: A = B = 1, D = 2, α = 0.1, k = 1, I0 = 0.5, ε = 0.04, z(1) = 0.8, β = 0.002. The program is run 5 times in total, with 200 different initial conditions generated at random in the interval [−0.1, 0.1] each time. The results are collected in Table 4.

Table 4. Results of Chen’s model solving TSP problem in 10 cities

Legal path   Optimal path   Legal ratio (%)   Optimal ratio (%)
172 170 86.0 85.0
169 162 84.5 81.0
170 168 85.0 84.0
180 170 90.0 85.0
179 173 89.5 86.5

The simulation experiments show that, when solving the 10-city TSP, the neural network model with Legendre function self-feedback outperforms Chen's neural network model. Next, keeping the other parameters unchanged, we study the influence of different λ values on the TSP optimization ability. The simulation results for various λ1 and λ2, over 100 different initial conditions, are gathered in Table 5.

Table 5. Simulation results for λ

λ1     λ2      Legal ratio (%)   Optimal ratio (%)   λ1      λ2     Legal ratio (%)   Optimal ratio (%)
1/3 0.007 89 86.7 0.25 0.01 96.4 33.8
1/3 0.009 96.2 94.5 0.3 0.01 98.8 94.4
1/3 0.01 97.2 95.9 0.32 0.01 98.7 97
1/3 0.011 98.3 96.9 0.322 0.01 99.1 98.4
1/3 0.013 98.6 97.7 0.323 0.01 98.6 97.3
1/3 0.014 98.3 97.6 0.325 0.01 98 97
1/3 0.015 98.6 97.6 0.33 0.01 97.1 96.3
1/3 0.016 99.2 98.1 0.333 0.01 97.8 96
1/3 0.017 98.6 97.2 1/3 0.01 97.2 95.9
1/3 0.02 98.6 96.4 0.34 0.01 96.4 95.7
1/3 0.03 97.4 86.3 0.38 0.01 74.2 70.9

Under each condition in the table, the program is run 20 times and the simulation results are averaged. As seen from Table 5:

(1) The parameters λ1 and λ2 strongly affect the simulation results. With λ1 = 1/3, λ2 = 0.016, or with λ2 = 0.01, λ1 = 0.322, the chaotic neural network with Legendre function self-feedback achieves its best performance on the TSP.
(2) When λ1 = 1/3 with 0.01 ≤ λ2 ≤ 0.02, or λ2 = 0.01 with 0.32 ≤ λ1 ≤ 0.33, the performance is also good. As the parameters move farther away from these ranges, the optimization ability of the network clearly declines.

4 Conclusion
In this study, a higher-order polynomial of Legendre basis functions is introduced into the self-feedback term of Chen's transient chaotic neural network. A linear combination of higher-order Legendre basis polynomials forms a new self-feedback function, constructing a new transient chaotic neural network. This model is employed to solve function optimization and TSP problems, and the influence of parameters on the network is studied. The simulation results show that this model has strong optimization ability.

Deep Reinforcement Learning for Resource
Allocation in Multi-cell Cellular Networks

Ming Sun1(B) , Liangjin Hu1 , Yanchun Wang1 , Hui Zhang1 , and Jiaqiu Wang2
1 College of Computer and Control Engineering, Qiqihar University, Qiqihar 161006, China
2 Beijing China-Power Information Technology Co., Ltd., Beijing, China

Abstract. Resource allocation for full-frequency multiplexing cellular networks tends to involve two aspects: channel allocation and power allocation. In order to maximize the energy efficiency of full-frequency multiplexing cellular networks, this paper proposes a deep reinforcement learning algorithm that constructs a deep Q network (DQN) with multiple hidden layers to output the channel allocation scheme and power control scheme. In the proposed DQN, the transition state is the power allocated for each channel, while the transition action consists of the channel allocation scheme, the power allocated for each channel, and the corresponding power adjustment amount. The reward of the transition of the DQN is the total energy efficiency. Simulation results show that the proposed DQN ensures low computation delay and that the energy efficiency obtained by the proposed method exceeds that of other methods at large numbers of channels.

Keywords: Deep reinforcement learning · Resource allocation · Multi-cell cellular networks

1 Introduction
As a scarce mobile communications resource, the wireless spectrum has always been
an important research aspect in the mobile communication field [1, 2]. With the rise of
new concept networks, the new generation of wireless communication networks puts
forward higher and higher requirements for high rate, low energy consumption, and low
delay [3–9]. Therefore, it is important to manage spectrum resources efficiently in the
new generation of wireless communications.
At present, traditional methods, including heuristic algorithms, have been used for the resource allocation of cellular networks. These methods degrade the real-time performance of the communication system and increase its computational burden [2], so they cannot be applied to the new generation of wireless communication networks.
Deep learning has the advantages of low computational complexity and low delay [10–15] and has been used to solve resource allocation in wireless mobile communications. Some researchers have proposed a supervised deep learning (DL) resource allocation scheme [2], where the training data are generated by heuristic algorithms such as the genetic algorithm, ant colony algorithm, and simulated annealing algorithm. However, the generation of training data is often expensive and time-consuming, so the supervised DL approach is not suitable for large network systems. Some researchers also use the reinforcement learning (RL) method [8, 12–15] to obtain the optimal solution of the power control problem by interacting with the environment. However, strategies in traditional RL are often stored in a table, which lacks flexibility and is not feasible for large action and state spaces.
Deep reinforcement learning (DRL) has become a popular technology for solving complex control problems for three reasons. Firstly, DRL combines the perceptual ability of deep learning with the decision-making ability of reinforcement learning. Secondly, DRL interacts with the environment continuously by trial and error. Thirdly, DRL obtains the best strategy by maximizing the cumulative reward. In [15], the authors proposed a new online computation offloading method using DRL to solve the mobile edge computing problem.
Inspired by the above, a deep Q network (DQN) is presented in this paper to solve resource allocation for full-frequency multiplexing cellular networks. In the proposed DQN, the two aspects of the resource allocation problem, i.e., channel allocation and power control, are realized simultaneously to improve the energy efficiency of communication systems. The transition state is the power allocated for each channel, while the transition action consists of the channel allocation scheme, the power allocated for each channel, and the corresponding power adjustment amount. The reward of the transition is the total energy efficiency. The simulation results show that, compared with the traditional artificial bee colony algorithm, the greedy allocation algorithm, and the random allocation algorithm, the proposed DQN ensures low computation delay and obtains larger energy efficiency as the number of channels increases.

2 System Model

As shown in Fig. 1, the downlink system of a cellular network with full-frequency multiplexing is composed of several OFDM cells. Assume that the cellular system contains M cells, N orthogonal channel resources, and K users. Each cell has a base station at its center, and users are randomly distributed in the cells. The channel resources are reused with full frequency across the cells, and each channel is occupied by only a single user per cell. In addition, B and N0 denote the bandwidth of each channel and the noise spectral density, respectively. The channel gain of the system is modeled as Eq. (1) [2]:
$$H_{m,k}^{n} = 10^{-(PL_{m,k} + X_{\alpha})/10} \left| h_{m,k}^{n} \right|^{2} \quad (1)$$
where $H_{m,k}^{n}$ is the channel gain when base station m uses channel n to communicate with user k; $PL_{m,k}$ is the path-loss model between base station m and user k; $X_{\alpha}$ represents shadow fading, a normal random variable with mean zero and standard deviation α; and $h_{m,k}^{n}$ is the fast fading when base station m uses channel n to communicate with user k, a complex Gaussian random variable with mean zero and variance one.
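To make Eq. (1) concrete, a minimal NumPy sketch of sampling one link's gain is given below. The log-distance path-loss form and the defaults (`pl_exp`, `d0`, `shadow_std`, taken from the simulation settings in Table 1 later in the paper) are illustrative assumptions; the paper itself only cites [2] for the path-loss model.

```python
import numpy as np

def channel_gain(d, pl_exp=3.2, d0=100.0, shadow_std=8.0, rng=None):
    """Sample the linear-scale gain H of Eq. (1) for one link at distance d (metres)."""
    rng = rng or np.random.default_rng()
    pl_db = 10.0 * pl_exp * np.log10(d / d0)               # log-distance path loss PL (dB), assumed form
    x_alpha = rng.normal(0.0, shadow_std)                  # shadow fading X_alpha ~ N(0, alpha^2), in dB
    h = (rng.normal() + 1j * rng.normal()) / np.sqrt(2.0)  # fast fading h ~ CN(0, 1)
    return 10.0 ** (-(pl_db + x_alpha) / 10.0) * abs(h) ** 2
```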

Fig. 1. Model of a multi-cell system with full-frequency reuse (the figure marks desired signals, interference signals, base stations, and users)

The total transmission rate and the energy efficiency of the system can be expressed as [8]:

$$R = \sum_{m=1}^{M} \sum_{n=1}^{N} \sum_{k=1}^{K} B \log_{2}\!\left(1 + \frac{D_{m,k}^{n}\, p_{m,k}^{n}\, H_{m,k}^{n}}{\left(N_{0}B + I_{m,k}^{n}\right)\Gamma}\right) \quad (2)$$

$$E = \sum_{m=1}^{M} \sum_{n=1}^{N} \sum_{k=1}^{K} \frac{B \log_{2}\!\left(1 + \frac{D_{m,k}^{n}\, p_{m,k}^{n}\, H_{m,k}^{n}}{\left(N_{0}B + I_{m,k}^{n}\right)\Gamma}\right)}{10^{6} \cdot p_{m,k}^{n}} \quad (3)$$

where R is the transmission rate in bit/(s·Hz); E is the energy efficiency of the system in Mbit/(s·W); $D_{m,k}^{n} = 1$ denotes that base station m allocates channel n to user k belonging to cell m, and otherwise $D_{m,k}^{n} = 0$; $p_{m,k}^{n}$ is the power when base station m uses channel n to communicate with user k; $\Gamma = -\ln(5\,\mathrm{BER})/1.6$, where BER is the bit error rate; and $I_{m,k}^{n}$ is the interference when base station m uses channel n to communicate with user k, which can be expressed as:


$$I_{m,k}^{n} = \sum_{\substack{i=1 \\ i \neq m}}^{M} \sum_{j=1}^{K} D_{i,j}^{n}\, p_{i,j}^{n}\, H_{i,k}^{n} \quad (4)$$
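Read literally, Eqs. (2)–(4) can be evaluated as below for arrays of shape (M, N, K). This is a hedged sketch of one possible reading (variable names and the −170 dBm/Hz default for N0 are assumptions), not code from the paper.

```python
import numpy as np

def rate_and_energy_efficiency(D, p, H, B=180e3, N0=1e-20, ber=1e-3):
    """Evaluate Eqs. (2)-(4) for arrays of shape (M, N, K).

    D[m, n, k] in {0, 1}: channel n of cell m assigned to user k
    p[m, n, k]: transmit power (W); H[m, n, k]: gain from station m to user k on channel n
    N0 = 1e-20 W/Hz corresponds to -170 dBm/Hz (Table 1).
    """
    gamma = -np.log(5.0 * ber) / 1.6                 # Gamma = -ln(5 BER) / 1.6
    tx = (D * p).sum(axis=2)                         # p_m^n: power of station m on channel n
    rx_all = tx[:, :, None] * H                      # power from every station reaching user k
    I = rx_all.sum(axis=0, keepdims=True) - rx_all   # Eq. (4): drop the serving-cell term
    sinr = (D * p * H) / ((N0 * B + I) * gamma)
    R = (B * np.log2(1.0 + sinr)).sum()                                  # Eq. (2)
    E = (B * np.log2(1.0 + sinr) / (1e6 * np.maximum(p, 1e-12))).sum()   # Eq. (3)
    return R, E
```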

Based on both the total transmission power constraint of a base station and the minimum power constraint required for channel transmission, this paper models the resource allocation problem of the above multi-cell cellular network with full-frequency reuse as the constrained optimization problem P1:

$$\begin{aligned}
(\text{P1}):\quad & \max_{\{D,\,p\}} \; E \\
\text{s.t.}\quad & C_1: \sum_{k=1}^{K} D_{m,k}^{n} = 1,\ D_{m,k}^{n} \in \{0, 1\} \\
& C_2: p_{m,k}^{n} \geq 0 \\
& C_3: p_{m}^{n} \geq p_{\min} \\
& C_4: N \cdot p_{\min} \leq p_{tot,m} \leq p_{\max}
\end{aligned} \quad (5)$$

where $D = [D_{m,k}^{n}]$, $p = [p_{m,k}^{n}]$, $H = [H_{m,k}^{n}]$ $(1 \leq m \leq M,\ 1 \leq n \leq N,\ 1 \leq k \leq K)$; C1 denotes that base station m allocates each channel to only one user k in the cell; C2 indicates that the transmission power of base station m using channel n to communicate with user k is non-negative; C3 represents the minimum power constraint required for channel transmission, where $p_{m}^{n}$ is the transmission power of base station m on channel n, satisfying $p_{m}^{n} = \sum_{k=1}^{K} D_{m,k}^{n} p_{m,k}^{n}$; and C4 denotes the total transmission power constraint of a base station, where $p_{tot,m}$ is the total transmission power of base station m, satisfying $p_{tot,m} = \sum_{n=1}^{N} p_{m}^{n}$.
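Constraints C2–C4 reappear in Step 2 of the algorithm in Sect. 3.3, where states must be adjusted to satisfy them. One simple projection heuristic, sketched below under the assumption of per-channel clipping followed by proportional rescaling (the paper does not specify its mechanism), is:

```python
import numpy as np

def project_powers(p, p_min, p_max):
    """Project a per-cell power vector p (length N) onto constraints C2-C4.

    Illustrative heuristic, not a method from the paper: clip each channel
    power to at least p_min (C2/C3), then rescale so the total stays within
    [N * p_min, p_max] (C4).
    """
    p = np.maximum(p, p_min)                 # non-negative and >= p_min per channel
    if p.sum() > p_max:                      # C4 upper bound: shrink the excess above p_min
        excess = p - p_min
        budget = p_max - p.size * p_min
        p = p_min + excess * (budget / excess.sum())
    return p
```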
The above problem is NP-hard, which means it is usually difficult to solve with traditional model-based methods due to the large computational amount and high complexity. In addition, these traditional approaches cannot adapt to future multi-channel service requirements with full-frequency multiplexing and a dynamically changing environment. Therefore, a data-driven deep reinforcement learning algorithm is applied to solve this optimization problem in the next section.

3 Resource Allocation Method Based on DQN


Deep learning tends to be superior in perception rather than decision-making. On the contrary, reinforcement learning has strong decision-making ability but is not good at solving perceptual problems. DRL combines the perceptual ability of deep learning with the decision-making ability of reinforcement learning, aiming to find an optimal strategy that maximizes the cumulative reward through continuous interaction with the environment [13]. Because the agent in DRL interacts with the environment continuously, system performance can be improved through the reward signals the agent obtains from the environment. Therefore, this paper uses a deep Q network (DQN) to solve the downlink resource allocation problem in a full-frequency multiplexing cellular network system.

3.1 Problem Mapping

The basic model of DRL consists of the action space $A = \{a_1, a_2, ..., a_n\}$, the state space $S = \{s_1, s_2, ..., s_m\}$, the reward signal R, and the strategy π: S → A.
State space S: The state space is $S = \{s_0, s_1, ..., s_T\}$, where $s_t = \{p_1^1, p_1^2, ..., p_M^N\}$ represents the set of powers of each channel in each cell; $p_m^n$ is the channel power when cell m uses channel n to communicate with its user, and the dimension of $s_t$ is N × M.
Action space A: The action space is $A = \{a_0, a_1, ..., a_T\}$, where $a_t$ is the t-th action and $a_t = \{d_k^n, p_m^n, \Delta p_m\}$; here $d_k^n$, of dimension N × K (cf. Fig. 2), is the channel allocation scheme assigning the n-th channel to user k; $p_m^n$, of dimension M × N, is the power allocated on each channel; and $\Delta p_m$, of dimension M × 5, represents the power adjustment amount of base station m.

Reward signal: The energy efficiency $r_t$ is used as the reward and punishment value, shown below.

$$r_t = \sum_{m=1}^{M} \sum_{n=1}^{N} \sum_{k=1}^{K} \frac{\log_{2}\!\left(1 + \frac{D_{m,k}^{n}\, p_{m,k}^{n}\, H_{m,k}^{n}}{\left(N_{0}B + I_{m,k}^{n}\right)\Gamma}\right)}{10^{6}\, p_{m,k}^{n}} \quad (6)$$

The weights of the DQN are updated as follows.

$$\theta_{t+1} = \theta_{t} + \eta \left[ r_{t} + \gamma \max_{a'} Q(s', a'; \theta') - Q(s, a; \theta) \right] \nabla Q(s, a; \theta) \quad (7)$$

where θ and θ′ are the weights of the main network and the target network in the DQN, respectively; η is the learning rate; and γ is the discount parameter.

3.2 DQN Model

The structure of the proposed DQN model is shown in Fig. 2.

Fig. 2. Structure of the proposed DQN (input: the state of dimension (Batch, M×N); several fully connected ReLU hidden layers with weights W and biases b; output of dimension (Batch, N×K + M×N + M×5), split into the channel allocation $d_k^n$ (N×K), the power allocation $p_m^n$ (M×N), and the power adjustment $\Delta p_m$ (M×5))

As shown in Fig. 2, the input of the proposed DQN is the state $s_t = \{p_1^1, p_1^2, ..., p_M^N\}$. After the hidden layers, the output layer $a_t = \{d_k^n, p_m^n, \Delta p_m\}$ includes the following three parts: the channel allocation $d_k^n$, the power allocation $p_m^n$, and the power adjustment amount $\Delta p_m$ of the base station. According to the neuron outputs, the maximum output value in each base station is selected as the channel allocation scheme $d_k^n$. In each base station, $p_m^n$ is the power selected by the DQN that needs to be adjusted when entering the next state, and the adjustment amount $\Delta p_m$ of the base station is used to adjust the selected power $p_m^n$.
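A minimal PyTorch sketch of the network in Fig. 2 follows; the layer widths (180–120–80 with FC-BN-ReLU) match Table 2 in Sect. 4, while the class name and forward details are assumptions.

```python
import torch
import torch.nn as nn

class ResourceDQN(nn.Module):
    """M*N input (per-channel powers) -> 180-120-80 FC-BN-ReLU -> N*K + M*N + M*5 output."""

    def __init__(self, M, N, K):
        super().__init__()
        self.M, self.N, self.K = M, N, K
        def block(i, o):
            return nn.Sequential(nn.Linear(i, o), nn.BatchNorm1d(o), nn.ReLU())
        self.body = nn.Sequential(block(M * N, 180), block(180, 120), block(120, 80))
        self.head = nn.Linear(80, N * K + M * N + M * 5)

    def forward(self, state):                  # state: (batch, M*N)
        q = self.head(self.body(state))
        # Split the output into the three action components of Sect. 3.1
        d, p, dp = torch.split(q, [self.N * self.K, self.M * self.N, self.M * 5], dim=1)
        return d, p, dp
```

Splitting the single output vector mirrors the three action components, so each part can be post-processed (arg-max per base station for $d_k^n$, level selection for $\Delta p_m$) exactly as the text describes.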

3.3 Steps of Algorithm


The algorithm includes the following 3 steps.

Step 1. Initialize parameters. Set the initial parameters of the DQN randomly. Initialize the discount factor γ and the learning rate η, set the power adjustment amount $\Delta p_m$ to 5 levels, i.e., (−0.002, −0.001, 0, 0.001, 0.002), set the size of the experience pool, and set the maximum number of episodes and the number of iterations in each episode. Then, initialize the power $p_m^n$ of the initial state $s_0$, which should satisfy the constraints $p_m^n \geq p_{\min}$ and $N \cdot p_{\min} \leq p_{tot,m} \leq p_{\max}$.

Step 2. Store the state transition information in the replay memory. Based on the strategy of exploration and exploitation, this paper uses the ε-greedy policy to generate the action $a_t$: the action $a_t$ is either selected randomly or selected according to the output of the deep neural network, i.e., $a_t = \arg\max_{a} Q(s_t, a; \theta)$. The reward $r_t$ is calculated according to Eq. (6). The initial state information $s_t$ is generated randomly, satisfying C2 and C3 of Eq. (5). The next state $s_{t+1}$ is output by the DQN and is adjusted to satisfy C2 and C3 of Eq. (5). Finally, the state transition tuple $(s_j, a_j, r_j, s_{j+1})$ is stored in the replay memory.
Step 3. Train the DQN. When the preset capacity of the replay memory is reached, sample a small batch of state transition information randomly from the replay memory. Then perform gradient descent on the parameters θ of the DQN using the loss function $[y_j - Q(s_j, a_j; \theta)]^2$, where

$$y_j = \begin{cases} r_j, & \text{if the episode ends at step } j+1 \\ r_j + \gamma \max\limits_{a'} Q(s_{j+1}, a'; \theta'), & \text{otherwise} \end{cases} \quad (8)$$
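Step 3 is the standard DQN regression onto the target of Eq. (8). A hedged sketch of one training step is given below; for brevity it treats the three output heads as one concatenated Q-vector and assumes `batch` holds tensors (with `a` as long-typed action indices and `done` as a 0/1 float mask), whereas the paper's action is structured per base station.

```python
import torch
import torch.nn.functional as F

def train_step(q_net, target_net, optimizer, batch, gamma=0.9):
    """One gradient step on the loss [y_j - Q(s_j, a_j; theta)]^2, with y_j from Eq. (8)."""
    s, a, r, s_next, done = batch
    q = torch.cat(q_net(s), dim=1).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = torch.cat(target_net(s_next), dim=1).max(dim=1).values
        y = r + gamma * q_next * (1.0 - done)   # y_j = r_j at terminal steps
    loss = F.mse_loss(q, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```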

4 Simulation and Analysis

In this section, the proposed DQN is evaluated by simulations. The simulation parameters
for the wireless network are shown in Table 1.

Table 1. Simulation parameters for the wireless network.

Parameter (units)                                          Value
Number of base stations                                    3
Radius of cell (m)                                         200
Maximum transmitting power of the base station (dBm)       38
Minimum transmitted power of the channel (dBm)             20
Carrier frequency (GHz)                                    2.0
Channel bandwidth (kHz)                                    180
Noise spectral density (dBm/Hz)                            −170
Total channel resources                                    4, 6, 8, 10, 12, 14, 16
Number of mobile subscribers                               10
Path loss index                                            3.2
Reference range (m)                                        100
Shadow fading standard deviation (dB)                      8
BER                                                        10⁻³

In this paper, the proposed algorithm is implemented in PyTorch. According to the number of base stations (M), the number of mobile users (K), and the number of channel resources (N) in Table 1, the structure of the DQN is shown in Table 2, where Input represents the input layer; FC represents a fully connected layer; BN stands for a batch normalization layer; ReLU and ELU stand for activation functions; and Output stands for the output layer.

Table 2. Structure of the DQN.

Number of base stations, mobile users and channel resources   Network layers and neurons
M, N, K                                                       M*N (Input) – 180 (FC-BN-ReLU) – 120 (FC-BN-ReLU) – 80 (FC-BN-ReLU) – N*K+M*N+M*5 (Output)

During the training of the DQN, the replay memory size is set to 64. A total of 50 episodes of training are used in the simulations, and the number of iterations per episode is set to 90. Figure 3 shows the energy efficiency in each episode when the number of sub-channels is 16. During training, the state transition information in the replay memory is randomly sampled in each episode, and the environment variable of the channel gain H is initialized randomly.

Fig. 3. Energy efficiency obtained when the number of channels is 16.

Figure 4 compares the resource allocation scheme based on the proposed DQN with the traditional artificial bee colony algorithm, greedy channel allocation with WMMSE power control, and random channel allocation with WMMSE power control. In this figure, 50 groups of unseen channel gains H are selected as external environment variables for prediction, and the energy efficiency is averaged over the 50 groups. The results show that when the number of channels exceeds 12, the proposed DQN algorithm begins to outperform the artificial bee colony algorithm. As the number of channels increases, the advantage of the proposed DQN becomes more and more obvious. Meanwhile, it can be seen from Figs. 3 and 4 that the energy efficiency predicted by the trained DQN with 14 channels is obviously superior to the other algorithms.

Fig. 4. Energy efficiency obtained by different algorithms under different number of channels.

Fig. 5. Computational delay of different algorithms.

As shown in Fig. 5, the proposed DQN has a lower computational delay than the other algorithms. This indicates that the proposed DQN acts more quickly in resource allocation and can obtain higher energy efficiency at larger numbers of channels.

5 Conclusion
In order to improve the energy efficiency of next-generation cellular networks and reduce the computing delay, this paper proposes a DQN to solve the resource allocation problem in cellular networks with full-frequency multiplexing; the DQN maximizes the energy efficiency through joint channel allocation and power control. Simulation results show that the proposed resource allocation method based on the DQN achieves larger energy efficiency than the other algorithms under the premise of low computational delay.

Acknowledgments. This work was supported in part by the Joint guiding project of Natural Sci-
ence Foundation of Heilongjiang Province under Grant LH2019F038, in part by the Young Inno-
vative Talents Program of Basic Business Special Project of Heilongjiang Provincial Education
Department under Grant 135309340.

References
1. He, X., Wang, K., Huang, H., et al.: Green resource allocation based on deep reinforcement
learning in content-centric IoT. IEEE Trans. Emerg. Top. Comput. 8(3), 781–796 (2020)
2. Ahmed, K., Tabassum, H., Hossain, E.: Deep learning for radio resource allocation in multi-
cell networks. IEEE Netw. 33(6), 188–195 (2019)
3. Liang, F., Shen, C., Yu, W., Wu, F.: Power control for interference management via ensembling
deep neural networks. In: 2019 IEEE/CIC International Conference on Communications in
China (ICCC), Changchun, pp. 237–242. IEEE (2019)
4. Lee, W., Kim, M., Cho, D.: Deep power control: transmit power control scheme based on
convolutional neural network. IEEE Commun. Lett. 22(6), 1276–1279 (2018)
5. Lee, W.: Resource allocation for multi-channel underlay cognitive radio network based on
deep neural network. IEEE Commun. Lett. 22(9), 1942–1945 (2018)
6. Lee, W., Kim, M., Cho, D.: Transmit power control using deep neural network for underlay
device-to-device communication. IEEE Commun. Lett. 8(1), 141–144 (2019)
7. Liu, D., Sun, C., Yang, C., Hanzo, L.: Optimizing wireless systems using unsupervised and
reinforced-unsupervised deep learning. IEEE Network 34, 270–277 (2020)
8. Liao, X., Yan, S., Shi, J., Tan, Z., Zhao, Z., Li, Z.: Deep reinforcement learning based resource
allocation algorithm in cellular networks. J. Commun. 40(2), 11–18 (2019)
9. Sun, M., Cao, W., Li, D., Ma, Z.: Strategy of maximizing capacity of OFDMA system for
ensuring fairness. Control Decis. 35(5), 1175–1182 (2020)
10. Challita, U., Dong, L., Saad, W.: Proactive resource management for LTE in unlicensed
spectrum: a deep learning perspective. IEEE Trans. Wirel. Commun. 17(7), 4674–4689 (2018)
11. Li, X., Fang, J., Cheng, W., et al.: Intelligent power control for spectrum sharing in cognitive
radios: a deep reinforcement learning approach. IEEE Access 6, 25463–25473 (2018)
12. Ye, H., Li, G., Juang, B.: Deep reinforcement learning based resource allocation for V2V
communications. arXiv:1805.07222 (2018)

13. Abishi, C., Shital, A.R., Husnu, S.N.: DA-DRLS: drift adaptive deep reinforcement learning
based scheduling for IoT resource management. J. Netw. Comput. Appl. 138, 51–65 (2019)
14. Tabrizi, H., Farhadi, G., Cioffi, J.: Dynamic handoff decision in heterogeneous wireless systems:
Q-learning approach. In: Proceedings of 2012 IEEE International Conference on
Communications (ICC), Ottawa, pp. 3217–3222. IEEE (2012)
15. Xiaoyu, Q., Luobin, L., Wuhui, C.: Online deep reinforcement learning for computation
offloading in block chain-empowered mobile edge computing. IEEE Trans. Veh. Technol.
68(8), 8050–8062 (2019)
Gauss Nonlinear Self-feedback Chaotic Neural
Network and Its Application

Nan Xu(B) , Bin Zhou, and Yamin Wang

College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University,


Daqing 163319, China

Abstract. A new network model with Gauss nonlinear self-feedback is constructed. Appropriate initial values of the parameters are selected, making the network exhibit chaotic dynamic characteristics. A comparison of simulation figures illustrates the feasibility of applying the model and also reflects the network's sensitivity to the chaotic parameters. By comparing the Gauss function with the inverse multiquadric function, it is shown that increasing or decreasing the width coefficient influences the output differently in different function structures. The new model is applied to a combinatorial optimization problem. The experimental data show that, if appropriate parameters are adopted, the new network can use its chaotic characteristics to effectively escape from local minima and converge well to a stationary point. We also study the influence of the network's main parameters on its optimization ability.

Keywords: Nonlinear · Self-feedback · Chaotic neural network · Gauss function

1 Introduction
At present, nonlinear chaotic neural networks have been studied extensively. For example, Xu Yaoqun et al. constructed a new chaotic neural network (CNN) with a trigonometric function as the self-feedback connection term [1], and Yang Xueling et al. constructed a class of CNN with an anti-trigonometric function as self-feedback [2]. The feasibility of CNNs with nonlinear self-feedback has been established [3]. All of this lays a theoretical foundation for further research. This paper continues this direction of research based on Chen's CNN [4], using a kind of radial basis function: the Gauss function constitutes a nonlinear self-feedback connection item, showing a new characteristic different from linear self-feedback. The output graph and Lyapunov exponent of the new model are analyzed. The classic traveling salesman problem (TSP) of finding the shortest path is a kind of Non-deterministic Polynomial (NP) problem. It is solved using the new network, and the ability to find the shortest path is studied.

2 Transient Chaotic Neuron Model with Gauss Self-feedback


2.1 The Single Neuron Model of the New Network
Chen’s CNN is used as the basic model; considering the low optimization efficiency
of its linear self-feedback process, a new CNN is constructed by adding a nonlinear


self-feedback function. The Gauss function is a kind of radial basis function with strong function approximation ability. It is adopted as the self-feedback term of the new network, making the feedback term show a nonlinear trend. The chaotic behavior of a single neuron reveals the chaotic ergodic ability of the new neural network, so the single-neuron model is studied first [5]. The new Gauss nonlinear self-feedback single-neuron model is constructed as follows:

x(t) = 1/(1 + exp(−y(t)/ε0 )) (1)

y(t + 1) = ky(t) − z(t)f (x(t) − I0 ) (2)

z(t + 1) = (1 − β)z(t) (3)

$$f(u) = \frac{1}{c}\, e^{-u^{2}/\delta^{2}} \quad (4)$$
In this model, x(t) is the output; y(t) is the internal state used to examine the trend of the independent variable over time; z(t) is the self-feedback term; f(u) is the Gauss radial basis function, which is multiplied by z(t) to form the nonlinear self-feedback connection [6]; k is the damping factor, with a value between 0 and 1; δ is the extension (width) constant of the Gauss function; and c is another parameter of the Gauss function.
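A minimal Python transcription of the single-neuron iteration in Eqs. (1)–(4) is sketched below; the paper's own experiments run in Matlab 7.0, so this translation and its defaults (taken from the parameter settings quoted in this section) are illustrative only.

```python
import numpy as np

def simulate_neuron(steps=4000, k=0.1, beta=0.0025, eps0=1.25,
                    I0=0.85, delta=0.25, c=1e-4, y0=0.283, z0=0.5):
    """Iterate Eqs. (1)-(4); returns the output sequence x(t)."""
    f = lambda u: np.exp(-u ** 2 / delta ** 2) / c   # Gauss self-feedback, Eq. (4)
    y, z, xs = y0, z0, []
    for _ in range(steps):
        x = 1.0 / (1.0 + np.exp(-y / eps0))          # Eq. (1): sigmoid output
        y = k * y - z * f(x - I0)                    # Eq. (2): internal state update
        z = (1.0 - beta) * z                         # Eq. (3): annealing of self-feedback
        xs.append(x)
    return np.array(xs)
```

Plotting the returned x(t) against t should qualitatively reproduce the inverted-bifurcation pictures discussed below.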
The following simulation experiments were run on a Lenovo G450 notebook PC (Windows XP, Intel Core 2 Duo T6600 CPU, 320 GB hard disk, NVIDIA GeForce G210M graphics chip, 2 GB memory) with Matlab 7.0 as the program running environment.
The setting of initial values is very important for the emergence of chaos, because only a small part of a model's parameter interval meets the requirements for chaos, and the chaotic trajectory changes with time. The Lyapunov exponent quantitatively describes this variation and is often used to measure the chaotic characteristics of systems. When initial values are very close, adjacent orbits appear in the state space; whether the orbits converge toward or diverge from each other over time, and how fast, is directly reflected by the Lyapunov exponent. If the maximum Lyapunov exponent is greater than zero over a time period, the system has entered a chaotic state. A positive Lyapunov exponent indicates that even very similar initial values will separate over time at an exponential rate, so the trajectory cannot be predicted; this is chaos. "Bifurcation" in dynamics refers to a sudden, sharp change in system behavior caused by the continuous change of a system parameter; beyond a certain point the system suddenly changes from regularity to irregularity, which is the state of stepping into chaos.
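For reference, the largest Lyapunov exponent of a one-dimensional map x(t + 1) = F(x(t)) is commonly estimated by the standard textbook formula (not spelled out in the paper):

$$\lambda = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} \ln \left| F'\big(x(t)\big) \right|,$$

and λ > 0 over a time window is precisely the chaos criterion used to read the trend charts below.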
Therefore, the dynamic properties of the model are analyzed using the bifurcation diagram of the neuron and the trend chart of the maximum Lyapunov exponent over time. The parameters ε0 = 1.25, y(1) = 0.283, z(1) = 0.5, k = 0.1, I0 = 0.85, δ = 0.25, c = 0.0001 are held fixed. The bifurcation diagram of the neuron and the trend chart of the maximum Lyapunov exponent for β = 0.0025 and for β = 0.003 are shown in Fig. 1, Fig. 2, Fig. 3, and Fig. 4.
Fig. 1. The bifurcation diagram of the neuron when β = 0.0025

Fig. 2. The trend chart of maximum Lyapunov exponent when β = 0.0025

Fig. 3. The bifurcation diagram of the neuron when β = 0.003

Fig. 4. The trend chart of maximum Lyapunov exponent when β = 0.003

Conclusions can be drawn from the above four charts: when the parameters are properly selected, the neuron model has transient chaotic dynamic characteristics, and the chaotic behavior fully reflects the sensitivity to the simulated annealing parameter β. As β increases, the convergence speed accelerates; as β decreases, the chaotic search process lasts longer. Numerically, the β values in the two simulation tests differ by only 0.0005, yet the equilibrium points are far apart: one at about t = 2750 s and the other at about t = 2250 s, a difference of 500 s. This shows that a small change in β leads to a large difference in the chaotic search process.
From another point of view, formula (3) shows that the self-feedback connection z(t) decays continuously, at a speed determined by β. The larger β is, the faster z(t) changes and the more rapidly the simulated annealing temperature falls, so the network quickly reaches a stable equilibrium point. However, β should not be too large, otherwise the chaotic behavior cannot be fully expressed and it is not easy to traverse the global space and obtain the optimal solution. The smaller β is, the more slowly z(t) changes and the longer the network keeps a high simulated annealing temperature; although the network then exhibits rich chaotic dynamic behavior, the convergence and optimization speed are slow, so β should not be too small either.
After confirming that the network has transient chaotic characteristics, we study the influence of the Gauss function parameters on the model. First, consider the Gauss function curves shown in Fig. 5: for the same u, the larger the width δ, the larger the output f(u). In contrast, for the inverse multiquadric radial basis function shown in Fig. 6, for the same u, the larger the width δ, the smaller the output f(u). This proves that the width parameter influences the output differently in different radial basis function structures.

Fig. 5. The figure of Gauss function when δ = 1 and δ = 2

Next, the influence of the width parameter on the network is studied. The parameters ε0 = 1.25, y(1) = 0.283, z(1) = 0.5, k = 0.1, I0 = 0.85, β = 0.0025, c = 0.0001 are held fixed. The bifurcation diagram of the neuron and the trend chart of the maximum Lyapunov exponent when δ = 0.2 are shown in Fig. 7 and Fig. 8.
Comparing Fig. 7 and Fig. 8 with Fig. 1 and Fig. 2 shows that, with the other parameters fixed, the size of the width parameter affects the network's chaotic search process. The convergence point is located at t = 2450 s when δ = 0.2, and at t = 2750 s when δ = 0.25. The smaller the δ, the shorter the chaotic search process and the easier it is to fall into local minimum points. The larger the δ, the more fully the chaotic behavior is expressed; the search can escape local minima and thus find the global optimal solution.

Fig. 6. The figure of Contrary Multiquadric function when δ = 0.3 and δ = 0.9

Fig. 7. The bifurcation diagram of the neuron when δ = 0.2

Fig. 8. The trend chart of maximum Lyapunov exponent when δ = 0.2

In the radial basis nonlinear self-feedback chaotic neural network whose self-feedback connection is composed of the inverse multiquadric function, the smaller the width, the more fully the chaotic search proceeds. Compared with the Gauss nonlinear self-feedback CNN, the influence of the width on the network is just the opposite. This is not contradictory; it is closely related to the specific structures of the two functions, as clearly illustrated by Fig. 5 and Fig. 6.
Next, the influence of the Gauss function parameter c on the dynamic behavior of the neuron is investigated. The parameters ε0 = 1.25, y(1) = 0.283, z(1) = 0.5, k = 0.1, I0 = 0.85, β = 0.0025, δ = 0.25 are held fixed. The bifurcation diagram of the neuron and the trend chart of the maximum Lyapunov exponent when c = 0.0003 are shown in Fig. 9 and Fig. 10.

Fig. 9. The bifurcation diagram of the neuron when c = 0.0003

Fig. 10. The trend chart of maximum Lyapunov exponent when c = 0.0003

Comparing Fig. 9 and Fig. 10 with Fig. 1 and Fig. 2 shows that the single neuron's chaotic search ability is very sensitive to the parameter c. The convergence point is located at t = 2750 s when c = 0.0001, and at t = 2300 s when c = 0.0003. The difference between the c values is only 0.0002, but the convergence points are far apart. Therefore, the value of c should be chosen appropriately to ensure that the network has chaotic characteristics and a strong ability to find the optimum.
Next, the influence of the parameter I0 on the dynamic behavior of the neuron is investigated. The parameters ε0 = 1.25, y(1) = 0.283, z(1) = 0.5, k = 0.1, β = 0.0025, δ = 0.25, c = 0.0003 are held fixed. The bifurcation diagram of the neuron and the trend chart of the maximum Lyapunov exponent when I0 = 0.75 are shown in Fig. 11 and Fig. 12.

Fig. 11. The bifurcation diagram of the neuron when I0 = 0.75

Fig. 12. The trend chart of maximum Lyapunov exponent when I0 = 0.75.

Comparing Fig. 11 and Fig. 12 with Fig. 9 and Fig. 10 shows that the chaotic search is also very sensitive to the parameter I0. From formula (2) and formula (4), in the nonlinear self-feedback model, I0 greatly affects the output of the Gauss function, so it is very important for the convergence of the network (corresponding to the inverted bifurcation and the equilibrium point in the figures). The numerical results show that a difference of only 0.1 in I0 changes the convergence time of the network by 300 s. The larger the value of I0, the shorter the chaotic search; if I0 is too large, chaos is not easily generated. A smaller I0 lets the search fully traverse the chaotic region, which is conducive to finding the global optimal solution; however, if I0 is too small, the search takes too long and does not meet practical requirements. Therefore, the range of I0 should be adjusted according to the situation.

2.2 The Gauss Nonlinear Self-feedback Chaotic Neural Network


Based on the Gauss nonlinear self-feedback neuron model, the energy function is introduced into the internal state y(t) [7], yielding the network equations below. The Gauss nonlinear self-feedback CNN is constructed from a number of such neurons.

xi (t) = 1/(1 + exp(−yi (t)/ε0 )) (5)


$$y_{i}(t + 1) = k y_{i}(t) + \gamma \left[ \sum_{\substack{j=1 \\ j \neq i}}^{n} w_{ij}\, x_{j}(t) + I_{i} \right] - z_{i}(t)\, f\!\left(x_{i}(t) - I_{0}\right) \quad (6)$$

zi (t + 1) = (1 − β)zi (t) (7)

$$f(u) = \frac{1}{c}\, e^{-u^{2}/\delta^{2}} \quad (8)$$
In the formula, γ is the positive scale parameter for the input; $I_i$ is the input deviation of neuron i; and $w_{ij}$ is the connection weight from neuron i to neuron j, with $w_{ij} = w_{ji}$ and $w_{ii} = 0$.
The dynamic characteristics of the CNN model are very sensitive to the value of γ, which represents the influence of the energy function on the change of the internal state [8]. If γ is too large, the energy function is too strong and the network converges too fast: the network cannot exhibit transient chaos, easily lingers in a local range, and cannot traverse globally. If γ is too small, the energy function is too weak: although transient chaos can be obtained, the network may not converge to the optimal solution within a given time range.
478 N. Xu et al.

The chaotic search process consists of two stages:

1. Rough search phase: The control parameters are selected so that the system enters a large-range chaotic dynamic at the beginning. Using the randomness of the chaotic search and the ergodicity of the orbits, the whole space is searched for the best results. In the initial stage, the network is chaotically ergodic. As the self-feedback decays exponentially, the output of the neuron gradually transitions to a stable state through a process of inverted bifurcation. As the chaotic search is carried out, the control parameter decreases; when it drops to a certain degree, the chaotic state gradually disappears due to the self-organizing ability of the system, and the network tends to stabilize after several inverted bifurcation processes. From then on, the search is limited to a number of periodic solutions; the control parameters can be reduced further, shrinking the search area. When the inverted bifurcation state gradually disappears over time, the algorithm obtains a solution in a neighborhood from which the globally optimal solution is easy to reach. The rough search then ends, and the process transfers to the neural network's gradient search.
2. Fine search phase: The initial value for this phase is the solution obtained at the end of the coarse search. The optimization proceeds according to the strength of the energy function and the direction in which its value declines. The network can converge quickly to a realistic result over the whole range. The gradual formation of smooth curves in the figures corresponds to the gradient convergence process of the network.

3 Application in Combinatorial Optimization Problems


In this paper, the traveling salesman problem (TSP) in combinatorial optimization is used as a simulation example to test the strengths and weaknesses of the neural network algorithm. TSP is a typical NP problem related to many applications, such as Vehicle Routing Problems (VRP) and Very Large Scale Integrated Circuits (VLSIC). Because the traveling salesman problem is simple to describe but difficult to solve, it is suitable for testing whether an algorithm is feasible and efficient.
A brief statement of the TSP is as follows: suppose there are N cities with known distances between any two of them; the task is to find the shortest path that visits each city exactly once and finally returns to the starting point, forming a loop.
The N-city TSP can be solved by a continuous Hopfield neural network with N × N neurons. The N × N neurons correspond to an N × N solution matrix, and each solution matrix corresponds to a feasible path. The element $x_{ij} = 1$ in the matrix indicates that city i is visited at position j of the tour.
Formula (9) is the energy function that both minimizes the path length and satisfies the constraint conditions of the TSP. Different from a static neural network, the chaotic neural network has rich dynamic behavior far from the equilibrium point.

$$E = \frac{A}{2} \sum_{x=1}^{n} \left( \sum_{i=1}^{n} V_{xi} - 1 \right)^{2} + \frac{B}{2} \sum_{i=1}^{n} \left( \sum_{x=1}^{n} V_{xi} - 1 \right)^{2} + \frac{D}{2} \sum_{x=1}^{n} \sum_{y=1}^{n} \sum_{i=1}^{n} d_{xy}\, V_{xi}\, V_{y,i+1} \quad (9)$$
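As a concrete reading of Eq. (9), the energy of a candidate solution matrix V can be evaluated as below. This is a sketch: the cyclic index i + 1, and setting B = A (the experiments in this section list only A = 8.3 and D = 3.7), are assumptions consistent with the usual Hopfield TSP formulation.

```python
import numpy as np

def tsp_energy(V, dist, A=8.3, B=8.3, D=3.7):
    """Hopfield TSP energy, Eq. (9): row/column constraints plus cyclic path length.

    V    : (n, n) matrix, V[x, i] ~ city x visited at position i
    dist : (n, n) symmetric city-distance matrix d_xy
    """
    row_term = ((V.sum(axis=1) - 1.0) ** 2).sum()         # each city visited once
    col_term = ((V.sum(axis=0) - 1.0) ** 2).sum()         # one city per position
    V_next = np.roll(V, -1, axis=1)                        # V[y, i+1] with cyclic wrap
    path_term = np.einsum('xy,xi,yi->', dist, V, V_next)   # sum of d_xy V_xi V_{y,i+1}
    return 0.5 * (A * row_term + B * col_term + D * path_term)
```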

In order to verify the effectiveness of the new network on this problem, the original data are the classical normalized 10-city coordinates: (0.4, 0.4439); (0.2439, 0.1463); (0.1707, 0.2293); (0.2293, 0.716); (0.5171, 0.9414); (0.8732, 0.6536); (0.6878, 0.5219); (0.8488, 0.3609); (0.6683, 0.2536); (0.6195, 0.2634). The shortest path for these 10 cities is 2.6776 (see Fig. 13).

Fig. 13. The optimal distance of 10-city TSP

The Gauss nonlinear self-feedback CNN is applied to solving the 10-city TSP, and the relationship between the network's solution ability and the main parameters is studied. The parameters ε0 = 0.8, z(1) = 0.5, k = 1, I0 = 0.7, β = 0.001, A = 8.3, D = 3.7, γ = 0.9, δ = 1 are held fixed, and the influence of different c values on the solution of the TSP is studied. Table 1 gives the results for 200 random initial values in this case.

Table 1. The results of 200 different internal conditions for each c

c        Legal path   Optimal path   Legal ratio (%)   Optimal ratio (%)


5 200 142 100 71.0
3 200 141 100 70.5
1 200 140 100 70.0
0.8 200 135 100 67.5
0.7 200 131 100 65.5
0.6 200 130 100 65.0
0.4 200 126 100 63.0
0.2 200 125 100 62.5
0.17 200 120 100 60.0
0.15 200 111 100 55.5
0.13 200 95 100 47.5
0.11 200 87 100 43.5

As seen from Table 1, regardless of the c value the proportion of legal paths [9] is 100%, but c greatly influences the proportion of optimal paths. The optimal ratio is above 70% when c ≥ 1; when 0.17 ≤ c < 1, the optimal path proportion gradually decreases but remains above 60%; and when c ≤ 0.15, the network's ability to find the optimum decreases greatly. This explains the role of the parameter c when the Gauss nonlinear self-feedback chaotic neural network solves the TSP: the value of c should be selected properly to ensure the network's ability to obtain the global optimal solution [10].
Next, the parameters z(1) = 0.5, k = 1, I0 = 0.7, β = 0.001, A = 8.3, D = 3.7, γ = 0.9, δ = 1, c = 5 are held fixed, and the influence of different ε0 values on the solution of the TSP is studied. Table 2 gives the simulation results for 200 random initial values in this case.

Table 2. The results of 200 different internal conditions for each ε0

ε0       Legal path   Optimal path   Legal ratio (%)   Optimal ratio (%)


0.1 191 113 95.5 56.2
0.2 200 118 100 59.0
0.3 200 127 100 63.5
0.5 200 140 100 70.0
0.8 200 142 100 71.0
1 200 141 100 70.5
1.2 200 142 100 71.0
1.5 200 126 100 63.0
2 200 117 100 58.5
2.5 200 112 100 56.0
3 200 106 100 53.0

As seen from Table 2, the influence of the steepness parameter ε0 on the network is obvious. The optimal path ratio is below 70% when ε0 < 0.5, and the network's optimization ability decreases gradually [11] as ε0 decreases. When ε0 is too small (e.g., ε0 = 0.1 in Table 2), not only does the number of optimal paths fall, but the number of legal paths also decreases; this shows that the network easily falls into local minimum points and cannot make full use of the chaotic characteristics to find the global optimal solution. The optimal ratio is also lower than 70% when ε0 > 1.2: the larger ε0, the smaller the optimal rate, although the legal path ratio remains at 100%. The optimal ratio stays around 70% when 0.5 ≤ ε0 ≤ 1.2, which shows the network then has a better ability to solve the TSP and find the optimum. Therefore, under the same conditions, it is suggested that the steepness parameter be controlled in the range 0.5–1.2.

4 Conclusion
A new nonlinear self-feedback chaotic neural network model is constructed in this paper. Analysis of the inverted bifurcation and the Lyapunov exponent [12] shows that the new model has chaotic dynamic characteristics. The new network is applied to solve the TSP. The simulation experiments show that the ability to reach the global minimum point is sensitive to the Gauss function parameter c and the steepness parameter ε0. The network has a good ability to find the best solution if the model parameters are selected appropriately.

References
1. Xu, Y.Q., He, S.P.: Self-feedback chaotic neural networks with trigonometric functions and its
applications. In: Proceedings of the 2009 China Conference on Control and Decision Making,
Guilin, Guangxi (2009)
2. Xu, Y.Q., Yang, X.L.: Chaotic neural network with anti-trigonometric function self-feedback
and its application. J. Harbin Univ. Commer. (Nat. Sci. Edn.) 26(3), 324–328 (2010)
3. Tong, X., Li, J.B.: Robust encryption face recognition algorithm based on composite chaos
in cloud environment. Mod. Electron. Technol. 42(06), 166–169 (2019)
4. Chen, L., Aihara, K.: Chaotic simulated annealing by a neural network model with transient
chaos. Neural Netw. 8(6), 915–930 (1995)
5. Xu, Y., Yu, Q.: Optimizing logistic distribution routing problem based on wavelet chaotic neu-
ral network. In: Proceedings of the 2017 Chinese Control and Decision Conference, CCDC,
pp. 3460–3464 (2017)
6. Hopfield, J.: Neural networks and physical systems with emergent collective computational
abilities. Proc. Natl. Acad. Sci. 79, 2554–2558 (1982)
7. Wang, X.-Y., Bao, X.-M.: A novel image block cryptosystem based on a spatiotemporal
chaotic system and a chaotic neural network. Chin. Phys. B 22(5), 050508 (2013)
8. Abdoun, N., El Assad, S., Hoang, T.M., Deforges, O., Assaf, R., Khalil, M.: Designing two
secure keyed hash functions based on sponge construction and the chaotic neural network.
Entropy 22(9), 1012 (2020)
9. Ye, Y.G., Xu, Y.Q.: SLF chaotic neural network and its application. Comput. Eng. Appl.
(2015)
10. Xiu, C.B., Liu, C., Guo, F.H., Cheng, Y., Luo, J.: Research on control strategy and application
of delayed chaotic neuron/network. Acta Phys. Sin. (2015)
11. Hu, Z.Q., Li, W.J., Qiao, J.F.: Frequency conversion sinusoidal chaotic neural network based
on adaptive simulated annealing. J. Electron. 47(3), 613–622 (2019)
12. Hu, N.W., Qi, M.L.: Molecular dynamics simulation of displacement cascades in Ni-Mo alloy.
Nucl. Sci. Tech. 26(6), 060603 (2015)
Guidance Prediction of Coupling Loop Based
on Variable Universe Fuzzy Controller

Ming Zhao1,2 , Yang Liu1 , Hui Li1,2 , Yun Cao1(B) , Yuru Zhang1 , and Hao Jin1
1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. This paper proposes a guidance prediction system for the Landing Signal Officer (LSO) based on variable universe fuzzy logic to ensure the landing safety of carrier-based aircraft. Analysis of safety factors during the landing process shows that the glideslope deviation and sink rate deviation of the longitudinal loop and the centering deviation and drift rate deviation of the lateral loop are the factors influencing carrier-based aircraft landing safety. The structure and operating characteristics of the LSO landing guidance system are discussed. Considering the nonlinearity, complexity, and fuzziness of the decision-making behavior, a variable universe fuzzy system is designed to realize the LSO prediction process. Simulation results show that the improved LSO guidance prediction model presented in this paper can simulate the actual decision-making characteristics of the LSO, and the output of the system conforms to the deviation-correction effect in the real environment. The results have reference value for instruction decision research, LSO training, and especially the adaptation of carrier and aircraft.

Keywords: Variable universe fuzzy controller · Guidance prediction · Coupling loop · Landing signal officer

1 Introduction
Because of the significance of decision-making, researchers have brought the LSO guidance prediction approach into focus [1–4]. A discrete-time series of the LSO is applied to model the pilot's final actions [5]. The "instruction sending – operation responding – deviation correcting" process is realized based on LSO guidance instructions [6]. Ming presented a model of the LSO for the digital simulation of a pilot–carrier system [7]. Four LSO grading technologies have been proposed: effect capability integration over multiple flight-state attributes; effect capability integration at multiple reference positions; effect capability quantification; and evaluation on the final approach [8]. However, current research mainly focuses on the dynamic simulation of LSO guidance decision-making, and there is a lack of analysis of the LSO's characteristics as a human operator.
The rest of this paper is structured as follows: Sect. 2 analyzes the influence factors of the landing process; Sect. 3 studies the landing guidance system characteristics; Sect. 4 introduces the control approach of variable universe fuzzy logic; and Sect. 5 develops the guidance prediction system of the LSO to demonstrate the feasibility of the improved algorithm.


2 Influence Factors Analysis of Landing Process


During the landing process of a carrier-based aircraft, the pilot needs to make constant adjustments according to the current flight status and the LSO's instructions, so that the deviation is corrected completely and a safe landing is finally achieved [3]. The factors affecting the landing safety of carrier-based aircraft are mainly divided into a longitudinal loop and a lateral loop.

2.1 Longitudinal Loop

In the longitudinal loop, two factors impact safety: glideslope deviation and sink rate
deviation.

Glideslope Deviation. This is the longitudinal real-time element with which the LSO is most concerned. By judging the glideslope position of the carrier-based aircraft, the flight state of the aircraft can be monitored at all times, and the ramp clearance can also be calculated.

Sink Rate. The ideal longitudinal sink rate at landing is constant. However, there are pitch, yaw, and roll motions during the landing process, and the pilot sometimes changes the attitude by altering the sink rate or throttle.

2.2 Lateral Loop

In the lateral loop, the centering deviation and drift rate are the two important factors
for landing safety.

Centering Deviation. This is the lateral factor with which the LSO is most concerned. Centering means that the carrier-based aircraft must be correctly aligned with the centerline of the landing operation area on the deck; otherwise, lateral collision accidents may occur during landing. The LSO judges the centering quality of the aircraft based on the "centring scaleplate" on the ramp.

As shown in Fig. 1, when the plane is centered to the left, it may crash into the sea; if the aircraft is on the right side, a crash accident may happen.

Drift Rate. The lateral deviation is compensated by changing the drift rate; however, when a large right drift rate occurs, a large position deviation exists, and the aircraft body is likely to strike the carrier's starboard. Therefore, the pilot should change the drift rate in real time according to the horizontal instructions provided by the LSO to achieve lateral centering.

Fig. 1. Centring position deviations (deck positions from Very Left to Very Right relative to the centerline, with the cables, desired touchdown point, ramp, and carrier marked)

Fig. 2. Landing guidance system of LSO (the LSO control loop feeds prediction information through coupled longitudinal and lateral fuzzy guidance systems; the aircraft control loop comprises trajectory and attitude control loops with outer-/inner-loop control techniques, cockpit commands, airframe and engine state, pilot vision sensitivity, and the APCS, outputting position and attitude)

3 Landing Guidance System Characteristic


The structure of the LSO landing guidance system is shown in Fig. 2.
As an individual, the LSO's instructions have some characteristics [4]:

(1) The behaviors have nonlinear characteristics;
(2) The landing guidance information of the carrier-based aircraft must be derived from multiple flight states, and the process has complex characteristics;
(3) Guidance judgment is a kind of fuzzy description; for example, LSOs often send instructions such as "Position is a little high" and "Centring is very right" [2].

Due to the nonlinearity, complexity, and fuzziness of LSO decision behavior, it is difficult to model with traditional control methods. In this paper, the modeling object is treated as a "black box": the fuzzy logic approach is used to express the LSO's operational experience in language, and "fuzzy rules" are formed to simulate the guidance decision behavior.

4 Control Approach of Variable Universe Fuzzy Logic


4.1 Fuzzy Rule Representation
Let $x_k = (x_1^k, x_2^k, ..., x_n^k)^T$ be the input at step k, and $y_{k+1}$ the output at step k + 1. The input universe is $U = U_1 \times U_2 \times ... \times U_n \subset R^n$, where $U_i = [-\eta C_{Ii}, C_{Ii}]$ $(i = 1, 2, ..., n)$, and the output universe is $V \subset R$, where $V = [-\xi Y_I, Y_I]$. The $n_i$ fuzzy sets $A_i^{r_i}$ $(r_i = 1, 2, ..., n_i)$ are defined on $U_i$, with centers $x_i^{r_i}$; the $n_o$ fuzzy sets $B^{r_o}$ $(r_o = 1, 2, ..., n_o)$ are defined on V, with centers $y^{r_o}$ [9–14]. The fuzzy rule is expressed as:

$$R_j: \text{if } x_1 \text{ is } A_{1j} \text{ and } x_2 \text{ is } A_{2j} \ \ldots \ \text{and } x_n \text{ is } A_{nj}, \text{ then } y \text{ is } B_j, \quad j = 1, 2, ..., m \quad (1)$$

where

$$m = \prod_{i=1}^{n} n_i \quad (2)$$

$$\{A_{ij} \mid j = 1, 2, ..., m\} = \{A_i^{r_i} \mid r_i = 1, 2, ..., n_i\} \quad (3)$$

$$\{B_j \mid j = 1, 2, ..., m\} = \{B^{r_o} \mid r_o = 1, 2, ..., n_o\} \quad (4)$$

For a Multiple Input Single Output (MISO) fuzzy system, the controller $f: x_k \in U \subset R^n \to y_{k+1} \in V \subset R$ based on the fuzzy rules in (1) can be expressed as [15]:

$$y_{k+1} = f(x_k) = \frac{\sum_{j=1}^{m} y_j \prod_{i=1}^{n} \mu_{A_{ij}}(x_i^k)}{\sum_{j=1}^{m} \prod_{i=1}^{n} \mu_{A_{ij}}(x_i^k)} \quad (5)$$

where $\mu_{A_{ij}}(x_i^k)$ is the membership function of the input fuzzy set and $y_j$ is the center of the corresponding output fuzzy set.
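Equation (5) is a centre-average (weighted-mean) defuzzification. A compact single-input sketch is given below, assuming triangular membership functions; the multi-input case of Eq. (5) would multiply the memberships of all inputs within each rule.

```python
import numpy as np

def fuzzy_output(x, centers_in, centers_out, width=1.0):
    """Weighted-average fuzzy controller of Eq. (5), single input for brevity.

    centers_in  : centers of the input fuzzy sets A^{r}
    centers_out : centers of the corresponding output sets B^{r}
    """
    mu = np.maximum(1.0 - np.abs(x - centers_in) / width, 0.0)  # triangular memberships
    return float((centers_out * mu).sum() / (mu.sum() + 1e-12))
```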

4.2 Variable Universe Fuzzy Control Algorithm


Let $x_{ij}^k$ and $y_j^k$ be the centers of the fuzzy sets $A_{ij}^k$ and $B_j^k$. The algorithm of variable universe fuzzy control is [15–17]:
Step 0: At moment k = 0, the input of the controller is $x_0 = (x_1^0, x_2^0, ..., x_n^0)$, and the output of the controller based on the fuzzy rules is:

$$y_1 = f(x_0) = \frac{\sum_{j=1}^{m} y_j^0 \prod_{i=1}^{n} \mu_{A_{ij}^0}(x_i^0)}{\sum_{j=1}^{m} \prod_{i=1}^{n} \mu_{A_{ij}^0}(x_i^0)} \quad (6)$$

Step 1: At moment k = 1, the input of the controller is $x_1 = (x_1^1, x_2^1, ..., x_n^1)$, with $x_{ij}^1 = \alpha(x_i^1)\, x_{ij}^0$ and $y_j^1 = \beta(y_1)\, y_j^0$; the output of the controller is:

$$y_2 = f(x_1) = \frac{\sum_{j=1}^{m} y_j^1 \prod_{i=1}^{n} \mu_{A_{ij}^1}(x_i^1)}{\sum_{j=1}^{m} \prod_{i=1}^{n} \mu_{A_{ij}^1}(x_i^1)} \quad (7)$$

Step k: At moment k, the input of the controller is $x_k = (x_1^k, x_2^k, ..., x_n^k)$, with $x_{ij}^k = \alpha(x_i^k)\, x_{ij}^0$ and $y_j^k = \beta(y_k)\, y_j^0$; the output of the controller is:

$$y_{k+1} = f(x_k) = \frac{\sum_{j=1}^{m} y_j^k \prod_{i=1}^{n} \mu_{A_{ij}^k}(x_i^k)}{\sum_{j=1}^{m} \prod_{i=1}^{n} \mu_{A_{ij}^k}(x_i^k)} \quad (8)$$

According to (5), it can be seen that

$$x_i^k \in U_i^k = [-\eta C_{Ii}\, \alpha_i(x_i^k),\ C_{Ii}\, \alpha_i(x_i^k)] \iff \frac{x_i^k}{\alpha_i(x_i^k)} \in [-\eta C_{Ii}, C_{Ii}] \quad (9)$$

and:

$$x_i^k \in A_{ij}^k \iff \frac{x_i^k}{\alpha_i(x_i^k)} \in A_{ij}^0 \quad (10)$$

$$\mu_{A_{ij}^k}(x_i^k) = \mu_{A_{ij}^0}\!\left(\frac{x_i^k}{\alpha_i(x_i^k)}\right) \quad (11)$$

Meanwhile, $y_j^k = \beta(y_k)\, y_j^0$, so the final output of the controller is:

$$y_{k+1} = f(x_k) = \beta(y_k)\, \frac{\sum_{j=1}^{m} y_j^0 \prod_{i=1}^{n} \mu_{A_{ij}^0}\!\left(\frac{x_i^k}{\alpha_i(x_i^k)}\right)}{\sum_{j=1}^{m} \prod_{i=1}^{n} \mu_{A_{ij}^0}\!\left(\frac{x_i^k}{\alpha_i(x_i^k)}\right)} \quad (12)$$

The universe $U_i^k$ shrinks as $x_i^k$ decreases, and because the division of the input universe $U_i^k$ at moment k is unchanged, the fuzzy sets $A_{ij}^k$ shrink with it. The neighbourhood of the expected control point $x_i^k = 0$ is thus divided more finely, which is equivalent to increasing the number of rules.
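The contraction factors α and β are not given explicit forms here; a common choice in the variable universe literature the paper cites [15–17] is the proportional-exponent form α(x) = (|x|/E)^τ + ε. A sketch under that assumption:

```python
import numpy as np

def alpha(x, E, tau=0.5, eps=1e-3):
    """Contraction factor for universe [-E, E]: shrinks toward 0 as |x| -> 0."""
    return (abs(x) / E) ** tau + eps

def vufc_output(xk, E_in, centers_in, centers_out, beta_k=1.0, width=1.0):
    """Variable universe output of Eq. (12): evaluate the base controller at x / alpha(x)."""
    x_scaled = xk / alpha(xk, E_in)
    mu = np.maximum(1.0 - np.abs(x_scaled - centers_in) / width, 0.0)
    return beta_k * float((centers_out * mu).sum() / (mu.sum() + 1e-12))
```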

5 Guidance Prediction System of LSO


5.1 Fuzzy Rule of Coupling Loop
During the landing process of a carrier-based aircraft, pilots are distracted when receiving LSO instructions. When the instructions span more than one dimension, severe coupling of pilot control occurs, and the risk coefficient of landing increases accordingly. Therefore, an important working principle of the LSO is to send as few instructions to pilots as possible on the premise of ensuring the safety of the pilot's operation [6]. Accordingly, in the actual control process, the LSO command information must be prioritized, and the most critical guidance instructions are sent to pilots first.
The design principles of the LSO coupling fuzzy guidance prediction system are as follows:

(1) The “Climbing” instruction has a higher priority than “Left/Right” instruction of
the same level.
(2) The “Left/Right” instruction has a higher priority than “Falling” instruction of the
same level.
(3) The “Left/Right” instruction has a higher priority than “Climbing” instruction of
the next level.

Table 1 is the fuzzy control rule of the coupling system.

Table 1. Fuzzy control rule of coupling loop system.

Longitudinal \ Lateral  PB1  PM1  PS1  AZ1  NS1  NM1  NB1
PB2 PB2 PB2 PB2 PB2 PB2 PB2 PB2
PM2 PB1 PM2 PM2 PM2 PM2 PM2 NB1
PS2 PB1 PM1 PS2 PS2 PS2 NM1 NB1
AZ2 PB1 PM1 PS1 AZ2 NS1 NM1 NB1
NS2 PB1 PM1 PS1 NS2 NS1 NM1 NB1
NM2 PB1 PM1 NM2 NM2 NM2 NM1 NB1
NB2 PB1 NB2 NB2 NB2 NB2 NB2 NB1
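Read as a lookup table, Table 1 maps each pair of longitudinal and lateral fuzzy levels to the single instruction the LSO actually sends. A minimal Python sketch of that lookup follows; the level labels are taken from Table 1, while the data structure and function name are illustrative.

```python
# Fuzzy levels (PB = positive big, PM = positive medium, PS = positive small,
# AZ = approximately zero, NS/NM/NB = negative counterparts); suffix 1 marks
# the lateral loop and suffix 2 the longitudinal loop, following Table 1.
LAT = ["PB1", "PM1", "PS1", "AZ1", "NS1", "NM1", "NB1"]

RULES = {  # RULES[longitudinal][lateral column] -> instruction actually sent
    "PB2": ["PB2"] * 7,
    "PM2": ["PB1", "PM2", "PM2", "PM2", "PM2", "PM2", "NB1"],
    "PS2": ["PB1", "PM1", "PS2", "PS2", "PS2", "NM1", "NB1"],
    "AZ2": ["PB1", "PM1", "PS1", "AZ2", "NS1", "NM1", "NB1"],
    "NS2": ["PB1", "PM1", "PS1", "NS2", "NS1", "NM1", "NB1"],
    "NM2": ["PB1", "PM1", "NM2", "NM2", "NM2", "NM1", "NB1"],
    "NB2": ["PB1", "NB2", "NB2", "NB2", "NB2", "NB2", "NB1"],
}

def coupled_instruction(lon_level: str, lat_level: str) -> str:
    """Return the single one-dimensional instruction sent to the pilot."""
    return RULES[lon_level][LAT.index(lat_level)]

print(coupled_instruction("PS2", "NM1"))  # -> "NM1": lateral deviation wins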

5.2 Fuzzy Guidance System of Coupling Loop

Since the instructions sent by the LSO to the pilots are discrete signals, it is necessary to first discretize the output curves of the longitudinal and lateral fuzzy controllers of the LSO.
The discrete instruction bar chart is shown in Fig. 3.
As shown in Fig. 3, the instruction priority would be automatically determined by
LSO based on the current flight states. One-dimensional instruction is selected from
the 14 longitudinal and lateral instructions to send to the pilot. The pilot can clarify
the current most major flight deviation and perform maneuver operations to ensure the
safety of the landing process.

(Fig. 3 presents bar charts of the discrete output instructions over 0–10 s: panel (a) shows the longitudinal system instructions, ranging from Very Low to Very High; panel (b) shows the lateral system instructions, ranging from Very Left to Very Right; panel (c) shows the coupling system instructions combining both axes.)

Fig. 3. Output instructions of the system

6 Conclusions
In the landing process of carrier-based aircraft, the characteristics of the LSO deviation instructions, the glideslope deviation and sink rate deviation of the longitudinal loop, and the centering deviation and drift rate deviation of the lateral loop are the factors influencing landing safety. These deviations are input into the landing guidance model of the LSO, and an intelligent guidance prediction system based on variable universe fuzzy logic is established. Simulation results show that the guidance prediction model established with fuzzy logic conforms to the operating characteristics of a real LSO, and the output results accord with the deviation-correction effect observed in the real environment. Meanwhile, the system model also provides an effective solution for objects with uncertainty, nonlinearity, and complex task environments, especially when the object is an individual.

Acknowledgments. This work is supported by the Natural Science Foundation of Heilongjiang Province of China (No. YQ2020G002), University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (No. UNPYSCT-2020212), and Science Foundation of Harbin University of Commerce (No. 18XN064).

References
1. Zhu, Q., Yang, Z.: Dynamic recurrent fuzzy neural network-based adaptive sliding control
for longitudinal automatic carrier landing system. J. Intell. Fuzzy Syst. 37(1), 53–62 (2019)
2. Zuo, Z., Wang, L., Liu, H., Wang, Y.: Similarity for simulating automatic carrier landing process of full-scale aircraft with scaled-model. Acta Aeronautica et Astronautica Sinica 40(12), 123005 (2019)
3. Zhou, J., Jiang, J., Yu, C., Xiao, D.: Carrier aircraft dynamic inversion landing control based
on improved neural network. J. Harbin Eng. Univ. 39(10), 1649–1654 (2018)
4. Hess, R.: Simplified approach for modeling pilot pursuit control behaviour in multi-loop flight
control task. Inst. Mech. Eng. 220(2), 85–102 (2006)
5. Wang, L., Zhu, Q., Zhang, Z., Dong, R.: Modeling pilot behaviors based on discrete-time
series during carrier-based aircraft landing. J. Aircr. 53(6), 1922–1931 (2016)
6. Li, H.: Modeling landing signal officer instruction associated with operation guide system.
Int. J. Control Autom. 8(2), 373–382 (2016)
7. Shi, M., Cui, H., Qu, X.: Modeling landing signal officer for carrier approach. J. Beijing Univ.
Aeronaut. Astronaut. 32(2), 135–138 (2016)
8. Li, H.: Integrated evaluation technology of landing signal officer for carrier-based aircraft.
Int. J. Multimedia Ubiquit. Eng. 11(1), 169–178 (2016)
9. Li, H.: The essence of fuzzy control and a kind of fine fuzzy controller. Control Theory Appl.
14(6), 868–871 (1997)
10. Shi, P., Xu, Z., Wang, S.: Variable universe adaptive fuzzy PID control of active suspension. Mech. Sci. Technol. Aerosp. Eng. 38(5), 713–720 (2019)
11. Du, E., Wang, S., Chang, L.: Variable universe fuzzy controller design of missile mark
trajectory with feed-forward compensation. J. Acad. Armored Force Eng. 31(2), 84–89 (2017)
12. Li, H.: Interpolation mechanism of fuzzy control. Sci. China (Series E) 28(3), 259–267 (1998)
13. Yang, Z., Wang, H.: Maximum power point tracking for photovoltaic power system based on
asymmetric fuzzy control. Mech. Autom. 41(2), 153–156 (2012)

14. Liu, J., Zhang, Y.: Variable universe fuzzy pid control method in piezoelectric ceramic
precision displacement system. Autom. Instrum. 32(2), 45–49 (2017)
15. Li, D., Shi, Z., Li, Y.: Sufficient and necessary conditions for Boolean fuzzy systems as universal approximators. Inf. Sci. 178(2), 14–24 (2008)
16. Li, H.: Adaptive fuzzy control of four-stage inverted pendulum based on variable universe.
Sci. China (Series E) 32(1), 65–75 (2002)
17. Chen, G.: On approaching precisions of standard fuzzy systems with different basic functions.
Acta Autom. Sinica 34(7), 823–827 (2008)
Heart Disease Recognition Algorithm Based
on Improved Random Forest

HaiTao Xin(B) and Hao Yu

Harbin University of Commerce, Harbin 150028, Heilongjiang, China


102714@hrbcu.edu.cn

Abstract. To improve the reliability of heart disease diagnosis, an optimized random forest algorithm is proposed to model and predict the public heart disease data sets of Kaggle and the Cleveland Medical Center. The optimization
heart disease data sets of Kaggle and Cleveland Medical Center. The optimization
process includes data preprocessing such as missing data processing, one-hot
encoding, and normalization, using a learning curve and grid search algorithm
to adjust the parameters. Experimental results showed that the optimized random
forest algorithm has an accuracy of 94.9% for heart disease recognition and an
AUC value of 98.5%. This algorithm has the advantages of higher classification
accuracy, lower overfitting, and better model generalization ability. It is feasible
for assisting doctors in diagnosing heart disease and reducing the misdiagnosis
rate.

Keywords: Random forest · Data preprocessing · Heart disease diagnosis

1 Introduction
Predicting whether heart disease will occur has always been a research hotspot in the modern medical field. According to the "China Cardiovascular Health and Disease Report 2019" issued by the National Medical Center, the number of people suffering from heart disease in China each year is gradually increasing compared with previous years, and cardiovascular disease ranks first among causes of death. The use of machine learning classification
algorithms for heart disease diagnosis is significant for establishing a stable and reliable
predictive model for heart disease. Scholars at home and abroad have conducted a lot of
research. In 2014, Dai et al. used classification models such as SVM, logistic regression,
and naive Bayes to predict heart disease and obtained a prediction accuracy of 82% [1].
Kedar et al. used the K-nearest neighbor algorithm to predict and classify heart disease
in 2016 and obtained an accuracy of 75% [2]. Chen Jianghong and others used SVM to
diagnose heart disease in 2017 with missing data [3]. Luo Benjie designed a heart disease
recognition platform based on random forest in 2017 [4]. Wen Bowen used improved
grid search technology to optimize the parameters of random forest in 2018 [5]. Ding
Weijie compared the recognition accuracy of seven classification algorithms on the heart
disease data set in 2019 [6]. Zhao Jinchao et al. used an optimized random forest for
heart disease prediction in 2021 [7].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


A. E. Hassanien et al. (Eds.): BIIT 2021, LNDECT 107, pp. 491–500, 2022.
https://doi.org/10.1007/978-3-030-92632-8_46

In traditional diagnosis, the doctor judges whether a person has heart disease based on clinical experience, which inevitably carries a risk of misjudgment and may cause medical accidents. Combining medical diagnosis with machine learning, data mining, and
other technologies can achieve the dual protection of man-machine integration, greatly
reducing the risk of medical misdiagnosis and improving medical efficiency. The opti-
mized random forest proposed in this paper has a high accuracy rate for diagnosing heart
disease. It is more suitable for the diagnosis of heart disease than some other machine
learning algorithms.

2 Related Theories
As the name suggests, random forest uses a random way to build a forest composed of
multiple trees. It is an integrated algorithm based on a decision tree as a learner. The
random forest algorithm integrates many independent and identically distributed decision
trees into a forest, and together they are used to predict the final result. Compared with a
single classification algorithm, the integration of decision trees has a better classification
effect, lower overfitting, and stronger generalization ability. It is essentially an integrated
improvement of decision trees.

2.1 Decision Tree


The decision tree is very intuitive and belongs to the white-box model. In the process of
tree building, all features are traversed, and the feature with the highest attribute impor-
tance is selected as the split node until there are no remaining features in the data set for
further division. The impurity from the root node to the leaf node is gradually decreas-
ing. Common feature measurement methods include information gain, information gain
rate, and Gini coefficient, etc. The steps of the decision tree algorithm are as follows:

(1) Create the root node N.


(2) According to a certain measurement method, select the best split feature and
establish branch nodes.
(3) For the remaining features, repeat operation (2) to further divide the subset.
(4) Perform pruning operations.

The two most critical parts in the construction of a decision tree are node splitting and pruning. Choosing the best split node means choosing the feature with the highest purity, and pruning determines how well the decision tree avoids overfitting; together they determine the quality of the classification effect. Random forest introduces randomness to avoid the limitations of decision trees.

2.2 Random Forest


Random forest is a machine learning algorithm proposed by Leo Breiman in 2001 by
combining Bagging ensemble learning theory and random subspace method [8]. Random
forest solves the shortcomings of low classification accuracy and over-fitting of decision

trees. It is an integrated algorithm of decision trees. Several decision trees first determine
each sample of the input, and then the results of each tree are summarized to determine
the sample type. For the judgment of each sample, only when more than half of the
decision trees are judged incorrectly will there be a misjudgment, so the accuracy rate
is greatly improved.
The randomness of random forest is mainly reflected in two aspects. The first is to randomly select data from the data set: the bootstrap method introduces higher diversity into the training subset of each predictor, so that the correlation between the base predictors is lower. The second is to randomly select a feature subset: every time a node is split, a subset of features of a certain size is selected at random, and the best feature in that subset is then chosen for splitting according to feature importance. The algorithm execution steps are as follows:

(1) Use the bootstrap sampling method to draw, from the original data set D, training sample sets of the same size as the original data set, which are used to build the model.
(2) Randomly select some features each time a node is split and select the best split
node from the subset of features.
(3) Each decision tree determines the class of the sample.
(4) Voting to get the final category of the sample, the voting formula can be expressed
by formula (1).


$H(x) = \arg\max_{Y} \sum_{i=1}^{k} I(h_i(x) = Y)$  (1)

Compared with other single classifiers, the random forest has a better classification
effect. The generalization ability of the model is strong. Because of its randomness, it
does not need to perform feature selection when making classification predictions.
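As an illustration only (the paper's own pipeline is described in Sects. 3–5), a random forest with bootstrap sampling, random feature subsets, and majority voting as in formula (1) can be built with scikit-learn; the toy data and parameter values below are placeholders, not the experiment's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy stand-in for the preprocessed heart disease data (illustrative only)
X, y = make_classification(n_samples=1315, n_features=26, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Bagging (bootstrap=True) plus a random feature subset at each split;
# predictions are the majority vote of the trees, as in formula (1)
rf = RandomForestClassifier(n_estimators=100, bootstrap=True,
                            max_features="sqrt", random_state=0)
rf.fit(X_train, y_train)
print("test accuracy:", rf.score(X_test, y_test))
```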

3 Data Preprocessing

This experiment uses the one-hot encoded discrete data as the experimental data. Data preprocessing was completed through missing value processing, one-hot encoding of the discrete data, and normalization.

3.1 Missing Value Processing

There are two commonly used methods for processing missing values in data mining:
filling method and the deletion method. The filling method mainly includes meaning
filling, mode filling, and using machine learning algorithm modeling prediction filling,
etc. It is mainly suitable for scenarios with relatively large feature data missing and
relatively important features. In this experiment, the number of missing data is small, so
the deletion method is adopted to process the missing values. The experimental results
show that the deletion of a few missing values will not reduce the prediction effect of
the model but may increase the accuracy of the prediction.

3.2 Discrete Data One-Hot Encoding

The purpose of one-hot encoding is to convert categorical variables into numerical values for later modeling and prediction. This article maps each categorical variable to an integer value and represents it as a binary vector in which the position of the integer index is 1 and the rest are 0. A category matrix is generated according to the number of categories of each classification feature: the position corresponding to the sample's category is 1, and the rest are 0.
First, convert the values of all classification features into corresponding textual mean-
ings and then perform one-hot encoding. After deletion of missing values and one-hot
encoding, the data set has changed from 1328 samples and 14 feature columns to 1315
samples and 26 feature columns.
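A minimal pandas sketch of this step is given below; the text mappings are illustrative, not the exact ones used in the experiment.

```python
import pandas as pd

# Illustrative frame with one categorical feature ("cp", chest pain type)
df = pd.DataFrame({"cp": [0, 2, 1, 3], "age": [54, 61, 47, 59]})

# Map codes to text first, then one-hot encode: each category becomes a
# binary column where the sample's own category is 1 and the rest are 0
df["cp"] = df["cp"].map({0: "typical", 1: "atypical",
                         2: "non_anginal", 3: "asymptomatic"})
df = pd.get_dummies(df, columns=["cp"])
print(df.columns.tolist())
```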

3.3 Data Normalization


Data normalization is a means of data preprocessing that compresses the data into the range 0–1. This paper compares the influence of the
data after standardization and normalization on the model prediction results, and selects
the normalization operation with higher accuracy and faster convergence speed. The
normalization uses the MinMaxScaler method in the sklearn library, and the formula is
shown in formula (2).
$x' = \dfrac{x - \min(x)}{\max(x) - \min(x)}$  (2)
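A one-line check of formula (2) with sklearn's MinMaxScaler (the values are illustrative):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[120.0], [145.0], [180.0]])  # e.g. resting blood pressure
scaler = MinMaxScaler()                    # implements formula (2)
print(scaler.fit_transform(X).ravel())     # -> [0.         0.41666667 1.]
```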

4 Algorithm Optimization Design

4.1 Feature Correlation Analysis

Before modeling, a heat map is built to analyze the correlation of data features, as shown
in Fig. 1.
The horizontal and vertical coordinates of the heat map respectively represent the
feature columns of the data set, and the right side represents the correlation coefficient
between the features. The darker the color, the stronger the correlation between the two
features. It can be seen from the figure that the diagonal color is the darkest, with a correlation coefficient of 1, while the colors at the other positions are lighter. This shows that the correlation between the features of the data set is low, and there is no redundancy between the features.

4.2 Parameter Optimization

The experiment is based on Python language and performs random forest modeling and
prediction in the Anaconda environment. Use learning curve and grid search technology
to optimize the design of random forest parameters. According to the importance of
the influence of each parameter on the prediction result, five important parameters of

Fig. 1. Feature correlation analysis.

n_estimators, max_depth, min_samples_leaf, min_samples_split, and max_features are


adjusted in turn.
First, the feature and label columns of the data set are extracted, and the train_test_split class is used to divide the heart disease data set of this experiment in a ratio of 70% training set to 30% test set. In theory, the more decision trees in the random forest, i.e., the larger the n_estimators parameter, the better the classification effect. The learning curve is used to adjust n_estimators: the initial range is set to [0, 200, 10], i.e., from 0 to 200 in steps of 10; the score at each value is computed by 10-fold cross-validation; the value with the best score indicates the approximate interval of n_estimators, which is then refined to find the best parameter value. Grid search is used to adjust max_depth, min_samples_leaf, min_samples_split, and max_features in the direction of the lowest model generalization error.

Fig. 2. Experimental model flow chart.



This experiment adopts a parameter-by-parameter adjustment method, which not only reduces the algorithm optimization time but also makes clear the effect of each parameter on the model's predictions. The final optimal parameter combination is n_estimators: 81, max_depth: 12, min_samples_split: 3, min_samples_leaf: 1, and max_features at its default, with which the algorithm achieves its highest accuracy. The flow chart of the experimental model is shown in Fig. 2.
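The following sketch outlines the two-stage tuning procedure just described: a learning curve over n_estimators followed by a grid search over the remaining parameters. The toy data and search ranges are illustrative, not the paper's exact grids.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (GridSearchCV, cross_val_score,
                                     train_test_split)

# Toy stand-in for the preprocessed data (illustrative only)
X, y = make_classification(n_samples=1315, n_features=26, random_state=0)
X_train, _, y_train, _ = train_test_split(X, y, test_size=0.3, random_state=0)

# Stage 1: learning curve over n_estimators, scored by 10-fold cross-validation
grid_n = range(10, 201, 10)
scores = [cross_val_score(RandomForestClassifier(n_estimators=n, random_state=0),
                          X_train, y_train, cv=10).mean() for n in grid_n]
best_n = grid_n[int(np.argmax(scores))]

# Stage 2: grid search over the remaining parameters (ranges are illustrative)
search = GridSearchCV(
    RandomForestClassifier(n_estimators=best_n, random_state=0),
    param_grid={"max_depth": range(8, 16),
                "min_samples_split": range(2, 6),
                "min_samples_leaf": range(1, 4)},
    cv=10)
search.fit(X_train, y_train)
print(best_n, search.best_params_)
```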

5 Experimental Results and Analysis


5.1 Introduction to the Data Set
The experimental data comes from the Kaggle heart disease data set and the heart disease data set provided by the Cleveland Clinic, both of which are widely used by researchers. These two data sets contain 1025 and 303 data samples, respectively. In order to obtain a better generalization effect, the two data sets are merged here. The combined data set has a total of 1328 items and 14 attribute and label columns. The meaning of each attribute name is shown in Table 1.

5.2 Experimental Setup and Environment


The experiment uses the Jupyter Notebook that comes with the Anaconda environment, version 6.1.4. The hardware device is an Intel Core i5, the operating system is Windows 10, and the programming language is Python 3.8, with the third-party libraries NumPy, pandas, matplotlib, and sklearn.

5.3 Evaluation Index


The evaluation indicators of the experimental results select the commonly used model
evaluation indicators of machine learning, including confusion matrix, accuracy rate,
precision rate, recall rate, and AUC value. The confusion matrix can be used to evaluate
the accuracy of the prediction results of the model established in this paper on the test
set, as shown in Table 2.
Accuracy: The percentage of samples with correct predictions to the total samples. The
specific formula is as shown in formula (3).
$\text{Accuracy} = \dfrac{TP + TN}{TP + FN + FP + TN}$  (3)
Precision: The percentage of the positive samples predicted by the model that are truly
positive. The specific formula is as shown in formula (4).
$\text{Precision} = \dfrac{TP}{TP + FP}$  (4)
Recall: the percentage of actually positive samples that are predicted to be positive. The specific formula is as shown in formula (5).

$\text{Recall} = \dfrac{TP}{TP + FN}$  (5)

Table 1. Data set attribute list.

Feature name Feature description Feature type


age Age Numerical
sex Sex (0,1) Two types
cp Types of chest pain (0,1,2,3) Four types
trestbps Resting blood pressure Numerical
chol Serum cholesterol content Numerical
fbs Fasting blood glucose >120 mg/dl (0,1) Two types
restecg Resting electrocardiogram (0,1,2) Three types
thalach Maximum heart rate achieved Numerical
exang Exercise-induced angina pectoris (0,1) Two types
oldpeak ST depression caused by exercise Numerical
slope Slope of ST segment (0,1,2) Three types
ca Number of main blood vessels Numerical
thal Thalassemia (1,2,3) Three types
target Heart disease prediction (0,1) Two types

Table 2. Confusion matrix.

Healthy/illness Predict health Predict illness


Actual health TP FN
Actual illness FP TN

The ROC curve refers to constantly moving the prediction “threshold” of the classifier
to generate a set of key points (FPR, TPR) on the curve. From a qualitative point of view,
the closer the ROC curve is to the upper left corner, the better the classification effect
of the model. From a quantitative point of view, the larger the AUC value, the better the
classification ability of the model.
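All of these indicators can be computed with sklearn.metrics. The sketch below uses toy data and an illustrative model purely to show the calls behind formulas (3)–(5) and the AUC value.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1315, n_features=26, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(n_estimators=81, random_state=0).fit(X_tr, y_tr)

y_pred = rf.predict(X_te)
y_prob = rf.predict_proba(X_te)[:, 1]        # scores behind the ROC curve
print(confusion_matrix(y_te, y_pred))        # Table 2 layout
print("accuracy :", accuracy_score(y_te, y_pred))   # formula (3)
print("precision:", precision_score(y_te, y_pred))  # formula (4)
print("recall   :", recall_score(y_te, y_pred))     # formula (5)
print("AUC      :", roc_auc_score(y_te, y_prob))
```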

5.4 Result Analysis


In order to visually show the prediction results of the model on the test set, this experiment
draws the heat map of the confusion matrix as shown in Fig. 3.
According to the confusion matrix and formulas (3), (4), and (5), the accuracy,
precision, and recall of the optimized random forest on the heart disease data set can be
obtained. In order to further verify the validity of the experiment, this paper compares
the model with a decision tree (DT), unoptimized random forest (RF), and random
forest based on K-nearest neighbor processing (KNN-RF). By comparing the accuracy,
precision, recall, and AUC value of these algorithms, it is found that the optimized

Fig. 3. Confusion matrix heat map.

Table 3. Comparison results of different algorithms.

Classification algorithm Accuracy Precision Recall AUC


DT 0.909 0.904 0.904 0.909
RF 0.937 0.927 0.942 0.985
KNN-RF 0.812 0.812 0.824 0.923
Optimized RF 0.949 0.956 0.936 0.985

random forest algorithm proposed in this paper has a better predictive effect on heart
disease. The experimental results are shown in Table 3.
It can be seen from Table 3 that the heart disease data set after data preprocessing has a
good predictive effect on the random forest model itself, indicating that the random forest
algorithm itself is suitable for the diagnosis of heart disease. The model’s accuracy used
in this experiment is 94.9%, and the AUC value is as high as 98.5%, which is significantly
improved compared to decision trees and KNN-RF.
In order to better display the AUC value, the ROC curve of each algorithm is drawn.
Figure 4 is a schematic diagram of the ROC curve of a decision tree, Fig. 5 is a schematic
diagram of the ROC curve of a random forest without tuning parameters, and Fig. 6 is
a schematic diagram of the ROC curve of KNN-RF. Figure 7 is a schematic diagram of
the ROC curve of the model used in the experiment.

Fig. 4. Decision tree ROC curve. Fig. 5. Random forest ROC curve.

Fig. 6. KNN-RF ROC curve. Fig. 7. Optimized random forest ROC curve.

6 Conclusion

Through preprocessing of the Kaggle heart disease data set and the heart disease data set
provided by the Cleveland Medical Center, an optimized random forest algorithm was
proposed. In this diagnosis and recognition system, feature correlation analysis, adjust-
ment of model hyperparameters, and the comparison of related studies were adopted to
obtain a better accuracy rate and prediction effect. Although the random forest algorithm
has a high degree of fit and good generalization ability, the heart disease data set used here is small, so the predictive effect of the model on a large data set is unknown. To verify the predictive effect on larger data sets, the algorithm needs to be further optimized in conjunction with feature engineering.

References
1. Dai, W.Y., Brisimi, T.S., Adams, W.G.: Prediction of hospitalization due to heart disease by
supervised learning methods. Int. J. Med. Inform. 84(3), 189–197 (2015)
2. Kedar, S., Bormane, D.S., Nair, V.: Computational Intelligence in Data Mining, vol. 1, pp. 69–
56. Springer, New Delhi, India (2016)

3. Chen, J.H., Wan, H.J., Zhang, Q.H.: Diagnosis of heart disease based on support vector machine with missing data. Math. Pract. Knowl. 47(02), 130–135 (2017)
4. Luo, B.J.: Design and Implementation of A Heart Disease Prediction Platform Based on
Random Forest. University of Posts and Telecommunications, Beijing (2018)
5. Wen, B.W., Dong, W.H., Xie, W.J., Ma, J.: Parameter optimization of random forest based on improved grid search algorithm. Comput. Eng. Appl. 54(10), 154–157 (2018)
6. Ding, W.J.: Research on Classification Algorithm in Prediagnosis of Heart Disease. Xidian
University, China (2019)
7. Zhao, J.C., Li, Y., Wang, D., Zhang, J.H.: An optimized random forest heart disease prediction
algorithm. J. Qingdao Univ. Sci. Technol. (Nat. Sci. Edition) 42(02), 112–118 (2021)
8. Wang, Y.S., Xia, S.T.: Overview of random forest algorithms for ensemble learning. Inf.
Commun. Technol. 12(01), 49–55 (2015)
Improved YOLOv3 Road Multi-target Detection
Method

Jingtao Fan1 , Yaoqun Xu1,2(B) , and Meng Tang1


1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, China
xuyq@hrbcu.edu.cn
2 Institute of System Engineering, Harbin University of Commerce, Harbin 150028, China

Abstract. Traditional object detection algorithms suffer from low efficiency and high error rates in traffic object detection. Therefore, this study proposes an improved
algorithm based on YOLOv3. Firstly, the K-means++ algorithm was used to
improve the extraction of the central point of clustering prior box, and a more
appropriate prior box was selected. Meanwhile, the loss function was replaced with CIoU loss for optimization. Exper-
iments were conducted to detect pedestrians and vehicles on a self-made mixed
dataset. The experimental results indicate that the improved YOLOv3 algorithm
can effectively reduce the object’s missed and false detection rate. The improved
algorithm achieved an average accuracy of 92.79% on the mixed data, and its
accuracy and recall rate are 2.6% and 1.79% higher than those of the original
YOLOv3 algorithm.

Keywords: Deep learning · YOLOv3 · Object detection · CIOU · K-means++

1 Introduction
Object detection has always been an important research direction in the field of computer
vision. Traditional methods, such as histogram of oriented gradient (HOG) [1], scale-
invariant feature transform (SIFT) [2], deformable parts model (DPM) [3], mainly rely
on artificial extraction of the object feature. Meanwhile, only describing the appearance
and shape of an object cannot show its deep characteristic information, and the number
of candidate object boxes is more, leading to poor generalization ability of the model.
Besides, the practical application of the model is limited by a large amount of calculation
in the complex background and its weak robustness [4].
With the development of computing technologies, many researchers have focused on
deep learning methods. Currently, the detectors can be divided into two main categories.
The first category is region-based (two-stage framework) that performs object detection
into two steps, such as the R-CNN (R-CNN, Fast-RCNN, Faster-RCNN) [5–7] series
model. The other category is unified pipeline-based (one-stage framework) that uses
direct regression, such as SSD (SSD, DSSD, DSOD) series model and YOLO (YOLOv1,
YOLOv2, YOLOv3) [8–11] series model.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


A. E. Hassanien et al. (Eds.): BIIT 2021, LNDECT 107, pp. 501–510, 2022.
https://doi.org/10.1007/978-3-030-92632-8_47

Due to occlusion or unclear features, detecting small and medium-sized objects on


the road suffers high error rates. This paper adopts the YOLOv3 model and applies
the K-means++ algorithm to dimensional clustering on self-made mixed data sets to
solve this problem. Meanwhile, CIoU loss is used instead of the squared loss of the prediction box. Finally, the idea of transfer learning is exploited to load the weights of the pre-trained network, and the freezing training method is adopted to accelerate the training process and prevent the initially trained weights from being destroyed.

2 Related Work

2.1 YOLOv3 Model

YOLOv3 is mainly composed of two parts: darknet-53 feature extraction network and
prediction network. The whole neural network consists of 252 layers, among which
darknet-53 has 185 layers. It borrows the idea of a residual network [12] and uses a
large number of jump connections to avoid the problem of gradient dispersion. Besides,
YOLOv3 uses the feature pyramid network (FPN) [13] for feature extraction. In the
pyramid, the lower the feature layer, the less semantic information, but the more precise
location information; by contrast, the higher the feature layer, the more the seman-
tic information, and the fuzzier the location information. The feature fusion pyramid
obtains more abundant information through sampling on the fusion of low-level fea-
tures. Darknetconv2d_BN_Leaky (DBL) structure is the basic component of YOLOv3,
and it consists of a convolution layer, a batch normalization layer, and a Leaky-Relu
activation function. The network structure is shown in Fig. 1.

Fig. 1. YOLOv3 network structure

2.2 Data Set Production

KITTI training set contains 7481 pictures with marked information. In road target
detection, the algorithm only needs to detect pedestrians and vehicles. So, this study
merges Van, Truck, and Tram into Car. Also, it merges Pedestrian (sitting), Cyclist, and
Pedestrian into Person and deletes misc.

The number of pedestrian samples in the KITTI data set is less than that of vehicle
samples, which may cause problems such as overfitting. Therefore, this study incor-
porates self-made data set HCU-data into the KITTI data set. HCU-data contains 531
pictures, which are mainly mobile phone camera photos and road traffic surveillance
photos. In most of the pictures, people and cars are overlapped. Therefore, pictures with
non-target obstructions such as leaves and railings are added to improve the detection
ability of the model. Some of the pictures in the dataset are shown in Fig. 2.

Fig. 2. Example of data set

2.3 Improved Clustering Algorithm

The width and height of the anchor boxes in the original YOLOv3 algorithm are obtained by clustering the COCO data set, but they do not necessarily apply to the KITTI data set. Besides, this paper only focuses on the detection of vehicles and pedestrians, so it is necessary to re-cluster the KITTI data set to select more appropriate prior boxes. The original YOLOv3 algorithm adopts the K-means [14] algorithm to select the anchor boxes. However, this algorithm is sensitive to the selection of initial points; as the number of clusters increases, only locally optimal results may be obtained. Therefore, this paper adopts the K-means++ [15] algorithm and defines the distance by GIoU:

d = 1 − GIoU (point, centroid ) (1)

The point represents the coordinates of sample points, and the centroid represents
the center of the cluster. The algorithm steps are as follows:

1. Randomly select $c_i$ from dataset $X$ as the first cluster center.
2. Calculate the distance $D(x_j)$ of each point $x_j$ in the dataset from its nearest cluster center using formula (1):

$D(x_j) = 1 - GIoU(x_j, c_i)$  (2)

3. Use formula (3) to calculate the probability of each sample being selected as the next cluster center:

$P = \dfrac{D(x)^2}{\sum_{x \in X} D(x)^2}$  (3)

4. Select K cluster centers following Steps 2 and 3.


5. Use the K-means algorithm to obtain K cluster centers.
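The steps above can be sketched in a few lines of NumPy. The sketch assumes the common anchor-clustering convention that (width, height) pairs are compared with their corners aligned; the function names and synthetic boxes are illustrative, not the paper's implementation.

```python
import numpy as np

def giou(boxes, c):
    """GIoU between each (w, h) box and a centroid c, with all boxes aligned
    at a common corner -- the usual convention for anchor-box clustering."""
    inter = np.minimum(boxes[:, 0], c[0]) * np.minimum(boxes[:, 1], c[1])
    union = boxes[:, 0] * boxes[:, 1] + c[0] * c[1] - inter
    enclose = np.maximum(boxes[:, 0], c[0]) * np.maximum(boxes[:, 1], c[1])
    return inter / union - (enclose - union) / enclose

def kmeanspp_anchors(boxes, k, n_iter=50, seed=0):
    """K-means++ anchor clustering with distance d = 1 - GIoU (formulas (1)-(3))."""
    rng = np.random.default_rng(seed)
    centers = [boxes[rng.integers(len(boxes))]]            # step 1: random seed
    while len(centers) < k:                                # steps 2-4
        d = np.min([1.0 - giou(boxes, c) for c in centers], axis=0)
        p = d ** 2 / np.sum(d ** 2)                        # formula (3)
        centers.append(boxes[rng.choice(len(boxes), p=p)])
    centers = np.array(centers)
    for _ in range(n_iter):                                # step 5: plain K-means
        assign = np.argmax([giou(boxes, c) for c in centers], axis=0)
        centers = np.array([boxes[assign == j].mean(axis=0)
                            if np.any(assign == j) else centers[j]
                            for j in range(k)])
    return centers

# Synthetic (width, height) pairs standing in for KITTI ground-truth boxes
boxes = np.abs(np.random.default_rng(1).normal(80, 40, size=(500, 2))) + 5
print(np.round(kmeanspp_anchors(boxes, k=9)))
```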

The width and height of the anchor box obtained by the above algorithm on the
mixed dataset are listed in Table 1.

Table 1. The width and height of the anchor box

Feature map size  Object scale  Anchor box width and height
13 × 13 Big (38,260) (92,173) (147,292)
26 × 26 Medium (18,136) (37,76) (60,108)
52 × 52 Small (14,36) (9,76) (23,52)

After the anchor box is obtained by the two algorithms, the detection result on the
mixed data set is listed in Table 2.

Table 2. The detection result of different clustering algorithms

Methods mAP (%) FPS (f/s)


K-means 90.02 42.15
K-means++ 90.83 43.14

As can be seen from Table 2, compared with the original algorithm, the anchor box
obtained by the K-means++ clustering algorithm contributes to 0.81% higher accuracy
and 0.99 f/s faster detection speed on the mixed data sets.

2.4 Improved Bounding Box Loss Function


As shown in Formula 4, IoU can evaluate the distance between the output box and the
real box.
$IoU = \dfrac{|A \cap B|}{|A \cup B|}$  (4)
However, when the prediction box and the real box do not intersect, IoU cannot
reflect the coincidence degree of the two boxes. As shown in Fig. 3, a pair of boxes with
the same IoU can have different regression effects due to different overlap modes.

Fig. 3. Example of predict box

In Fig. 3, the three superposition methods have equal IoU. Meanwhile, the left one
has the best regression effect and the right one has the worst regression effect.
Since IoU is only a ratio and is not sensitive to the size of the object, Rezatofighi et al. [16] put forward GIoU at CVPR 2019 and proposed using it as the regression loss, as shown in formula (5) [16]:

$GIoU = IoU - \dfrac{|A_c - U|}{|A_c|}$  (5)
However, when the two contained boxes are in the horizontal and vertical directions,
the GIoU loss almost degenerates into the IoU loss. Zheng [17] et al. proposed DIoU to
handle this problem. The calculation of DIoU is shown in Formula 6 [17]:
 
$DIoU = IoU - \dfrac{\rho^2(b, b^{gt})}{c^2}$  (6)
Further, considering the aspect ratio among the geometric factors of bounding box regression, Zheng et al. proposed CIoU based on DIoU (see formula (7)).
 
$CIoU = IoU - \dfrac{\rho^2(b, b^{gt})}{c^2} - \alpha\nu$  (7)
Where α represents the weight function and v is used to measure the similarity of
aspect ratio. The definition of v is shown in Formula 8 [17]:
$\nu = \dfrac{4}{\pi^2}\left(\arctan\dfrac{\omega^{gt}}{h^{gt}} - \arctan\dfrac{\omega}{h}\right)^2$  (8)
Where ω and h are the length and the width.
The derivative of ν to ω is shown in Formula 9:
 
$\dfrac{\partial\nu}{\partial\omega} = \dfrac{8}{\pi^2}\left(\arctan\dfrac{\omega^{gt}}{h^{gt}} - \arctan\dfrac{\omega}{h}\right) \times \dfrac{h}{\omega^2 + h^2}$  (9)
The derivative of ν to h is shown in Formula 10:
 
$\dfrac{\partial\nu}{\partial h} = -\dfrac{8}{\pi^2}\left(\arctan\dfrac{\omega^{gt}}{h^{gt}} - \arctan\dfrac{\omega}{h}\right) \times \dfrac{\omega}{\omega^2 + h^2}$  (10)

When both the width and height fall into the range $[0, 1]$, $\omega^2 + h^2$ is usually small, which can cause a gradient explosion. Therefore, $\frac{1}{\omega^2 + h^2}$ is replaced by 1 in the paper.

This paper uses $CIoU_{loss}$ instead of the bounding box loss function in YOLOv3; the definition of $CIoU_{loss}$ is shown in formula (11) [17]:

$CIoU_{loss} = 1 - CIoU$  (11)

After modification, the loss function is shown in Formula 12:


$\begin{aligned} loss ={} & \sum_{i=0}^{s^2} \sum_{j=0}^{B} I_{i,j}^{obj}\,(1 - CIoU)\,\left(2 - \omega_i^j h_i^j\right) \\ & - \sum_{i=0}^{s^2} \sum_{j=0}^{B} I_{i,j}^{obj} \left[ \hat{C}_i^j \log C_i^j + \left(1 - \hat{C}_i^j\right) \log\left(1 - C_i^j\right) \right] \\ & - \lambda_{noobj} \sum_{i=0}^{s^2} \sum_{j=0}^{B} I_{i,j}^{noobj} \left[ \hat{C}_i^j \log C_i^j + \left(1 - \hat{C}_i^j\right) \log\left(1 - C_i^j\right) \right] \\ & - \sum_{i=0}^{s^2} I_{i,j}^{obj} \sum_{c \in classes} \left[ \hat{P}_i^j \log P_i^j + \left(1 - \hat{P}_i^j\right) \log\left(1 - P_i^j\right) \right] \end{aligned}$  (12)

The results of using different loss functions are listed in Table 3.

Table 3. The effect of each method

Loss function mAP (%) FPS (f/s)


IoU 90.02 42.15
GIoU 90.34 42.44
DIoU 92.15 42.91
CIoU 92.18 42.85

Table 3 shows that the average accuracy of IoU and GIoU is not much different, and
that of DIoU and CIoU is similar. CIoU contributes to the highest average accuracy of
92.18%, with a detection speed of 42.85 f/s.
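For reference, a plain-Python computation of CIoU, and hence $CIoU_{loss} = 1 - CIoU$, following formulas (4)–(8) might look as follows. This is a sketch for illustration, not the training code used in the experiments.

```python
import math

def ciou(box1, box2):
    """CIoU between two boxes given as (x1, y1, x2, y2); box1 is the
    prediction and box2 the ground truth, per formulas (4)-(8)."""
    ax1, ay1, ax2, ay2 = box1
    bx1, by1, bx2, by2 = box2
    # IoU (formula (4))
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Squared center distance over squared enclosing-box diagonal (formula (6))
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term (formulas (7)-(8))
    v = (4 / math.pi ** 2) * (math.atan((bx2 - bx1) / (by2 - by1))
                              - math.atan((ax2 - ax1) / (ay2 - ay1))) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return iou - rho2 / c2 - alpha * v

pred, gt = (50, 50, 150, 150), (60, 60, 170, 160)
print("CIoU loss:", 1 - ciou(pred, gt))   # formula (11)
```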

3 Experiment
3.1 Experimental Environment and Parameter Setting
The experiment in this paper is performed by renting an instance on Featurize. The
instance runs Ubuntu 18.04 system, and the GPU configuration is manually set. The
software used in the experiment includes TensorFlow-GPU 1.13.2, kearas 2.1.5, and
opencv_python 4.1.2. The hardware platform is equipped with Intel (R) Core (TM)
I5-9400F CPU @ 2.90 GHz (6 cores), 218 GB disk space, and GeForce RTX 2080

Ti graphics card (11 GB video memory, 31 GB main memory, with 26.90 TFLOPS
single-precision and 13.45 TFLOPS double-precision computing power).
The model uses batch gradient descent, and transfer learning was used to load the pre-training weights. The number of iterations was set to 50,000, the initial learning rate to 0.001, the batch size to 64, the momentum to 0.9, and the weight attenuation coefficient to 0.0005. The network was frozen for the first 10,000 iterations. At 40,000 and 45,000 iterations, the learning rate was decreased to 10% and 1% of its initial value, respectively, to accelerate the convergence of the model.
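A minimal Keras sketch of this freeze-then-finetune schedule is shown below on a toy model; the layer names, data, and epoch thresholds (standing in here for the paper's iteration counts) are illustrative only.

```python
import numpy as np
from tensorflow import keras

# Toy stand-in: the first layer plays the role of the pre-trained backbone
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(8,), name="backbone"),
    keras.layers.Dense(1, activation="sigmoid", name="head"),
])
X = np.random.rand(256, 8).astype("float32")
y = (X.sum(axis=1) > 4).astype("float32")

# Phase 1: freeze the backbone so the loaded weights are not destroyed
model.get_layer("backbone").trainable = False
model.compile(optimizer=keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9),
              loss="binary_crossentropy")
model.fit(X, y, epochs=2, verbose=0)

# Phase 2: unfreeze and continue with a stepwise 100% -> 10% -> 1% schedule
model.get_layer("backbone").trainable = True

def schedule(epoch, lr):
    return 1e-3 if epoch < 4 else (1e-4 if epoch < 5 else 1e-5)

model.compile(optimizer=keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9),
              loss="binary_crossentropy")
model.fit(X, y, epochs=6, verbose=0,
          callbacks=[keras.callbacks.LearningRateScheduler(schedule)])
```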

3.2 Experimental Results


The model trained by the original YOLOv3 algorithm and the model trained by the
improved YOLOv3 algorithm were respectively used to predict and classify the three
groups of road photos. The experimental results are shown in Figs. 4, 5, and 6. From left
to right are the original pictures, the results of the original YOLOv3 algorithm, and the
results of the improved algorithm.

Fig. 4. The first set of pictures

As for the detection of the first group of pictures, the original algorithm failed to
detect the two cars because they are only partially exposed and some of them are blocked
by leaves. In comparison, the improved algorithm successfully detected the two cars.

Fig. 5. The second set of pictures

As for the detection of the second group of pictures, the original algorithm failed
to detect the black car at the school gate because it is far away from the camera and is
blocked by railings. In comparison, the improved algorithm successfully detected the
car.

Fig. 6. The third set of pictures

For the detection of the third group of pictures, the original algorithm detected the
pedestrians but failed to detect the black car behind due to the occlusion and small object.
In comparison, the improved algorithm can detect this object.
The results before and after the algorithm improvement are listed in Table 4.

Table 4. Comparison of the algorithm before and after improvement

Methods Precision (%) Recall (%)


The original algorithm 90.57 88.35
Improved algorithm 93.17 90.14

3.3 Comparison with Other Object Algorithms

Since there are only two categories of car and person in the data set, this paper uses
mAP, FPS, and F1-score to evaluate the performance of object detection models. The
calculation formula of mAP is shown in Formula 13:
$mAP = \dfrac{AP_{car} + AP_{person}}{2}$  (13)
The calculation formula of F1-SCORE is shown in Formula 14:
$F1 = 2 \times \dfrac{pre \times rec}{pre + rec}$  (14)
The proposed model is compared with YOLOv2, original YOLOv3, Faster-RCNN,
SSD, and the algorithms of other papers. The comparison result is shown in Table 5.
Table 5 shows that the average accuracy of the improved algorithm in this paper
reaches 92.79%, and the detection speed reaches 42.88 f/s. Meanwhile, the improved
algorithm achieves the highest f1-score, which is significantly better than that of other
models.

Table 5. Model comparison

Methods mAP (%) FPS (f/s) F1-score (%)


YOLOv2 66.0 90.12 65.38
YOLOv3 90.05 42.15 89.77
Faster-RCNN 74.98 10.18 73.83
SSD 83.94 50.18 82.70
Literature [18] 88.55 35.00 87.38
Improved YOLOv3 92.79 42.88 91.62

4 Conclusion

Aiming at the problem that the YOLOv3 algorithm fails to detect small or blocked objects, this paper uses the K-means++ algorithm to re-cluster on the mixed data set and selects prior boxes more suitable for road objects, improving the average accuracy and speed of detection. Then, the bounding box loss is replaced with CIoU loss to improve the positioning accuracy and average accuracy. The trained model is compared with other models. The results show that the improved YOLOv3 algorithm performs significantly better than the other models, with an average accuracy of 92.79%, a detection speed of 42.88 f/s, and an F1-score of 91.62%.
There are still many deficiencies in the YOLOv3 algorithm. For example, the multi-scale feature fusion method has poor portability, so it is not well suited to transfer learning. Also, detection in different scenes requires selecting an appropriate feature fusion structure for the scene [19]. This paper also tried to add a 104 × 104 scale in addition to the three detection scales, but the experimental results on the mixed data set were not as good as those of the original algorithm. With the development of computer hardware, the YOLOv4 or YOLOv5 model can be further optimized in the future.

Acknowledgment. This work is supported by the Nature Science Foundation of Heilongjiang


Province (LH2021F035).

References
1. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 886–893.
IEEE (2005)
2. Luo, J., Oubong, G.: A comparison of SIFT PCA-SIFT and surf. Int. J. Image Process. 3(4),
143–152 (2009)
3. Felzenszwalb, P.F., Girshick, R.B., Mcallester, D.: Object detection with discriminatively
trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2009)
4. Li, Z., Huang, M.H.: Real-time vehicle detection based on YOLO_v2 Model. China Mech.
Eng. 29(15), 1869–1874 (2018)

5. Girshick, R., Donahue, J., Darrell, T.: Rich feature hierarchies for accurate object detection
and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), pp. 580–587. IEEE (2014)
6. Girshick, R.: Fast R-CNN. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1440–1448. IEEE (2015)
7. Ren, S., He, K., Girshick, R.: Faster R-CNN: towards real-time object detection with region
proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2016)
8. Liu, W.: SSD: single shot multibox detector. In: European Conference on Computer Vision,
pp. 21–37 (2016)
9. Redmon, J.: You only look once: unified, real-time object detection. In: Proceedings of the
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788.
IEEE (2016)
10. Redmon, J.: YOLO9000: better, faster, stronger. In: Proceedings of the 2017 IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), pp. 7263–7271. IEEE (2017)
11. Redmon, J.: Yolov3: an incremental improvement. In: Computer Vision and Pattern
Recognition, arXiv preprint arXiv:1804.02767 (2018)
12. He, K.M.: Deep residual learning for image recognition. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. IEEE (2016)
13. Lin, T.Y.: Feature pyramid networks for object detection. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2117–2125. IEEE (2017)
14. Kong, F.F., Song, B.B.: Improved panoramic traffic monitoring object detection of YOLOv3.
Comput. Eng. Appl. 56(8), 20–25 (2020)
15. Arthur, D.: K-means++: the advantages of careful seeding. In: Eighteenth ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035. Society for Industrial and Applied Mathematics, New Orleans (2007)
16. Rezatofighi, H.: Generalized intersection over union: a metric and a loss for bounding box
regression. In: Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 658–666. IEEE (2019)
17. Zheng, Z H.: Distance-IoU loss: faster and better learning for bounding box regression. In:
Association for the Advance of Artificial Intelligence (AAAI), pp. 12993–13000 (2020)
18. Li, Y.P., Hou, L.Y.: Motion object detection in automatic driving based on YOLOv3 [J].
Comput. Eng. Des. 40(4), 1139–1144 (2019)
19. Liang, H., Wang, Q.W.: Review of small object detection technology. Comput. Eng. Appl.
1(13), 12–20 (2020)
Research on Ctrip Customer Churn Prediction
Model Based on Random Forest

Zhijie Zhao1,2 , Wanting Zhou1,2(B) , Zeguo Qiu1,2 , Ang Li1,2 , and Jiaying Wang1,2
1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, Heilongjiang, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin University of Commerce, Harbin 150028, Heilongjiang, China

Abstract. With the rapid development of the tertiary industry, the online reser-
vation market has great potential. The research on the customer churn factor of
customer-centric hotels is of great significance to the development of Ctrip. In
order to qualitatively analyze the causes of Ctrip’s scheduled customer churn, this
paper uses the current value and potential value in the customer value system to
determine the influencing factors of Ctrip’s customer churn. It introduces a random
forest algorithm to construct the Ctrip customer churn prediction model. Finally,
the confusion matrix and ROC curve are used to evaluate the performance of the
model. The results show that the random forest algorithm can better solve the
two-classification problem of customer churn prediction, and the accuracy of the
prediction model reaches 94%. In the analysis of influencing factors, the random
forest algorithm effectively avoids the collinearity interference between factors in
the traditional research methods and sets the objective weight, The weight of its
influencing factors can provide a scientific basis for Ctrip to formulate targeted
retention strategies.

Keywords: Customer churn prediction · Customer value · Random forest ·


Confusion matrix

1 Introduction

In 2020, China completed the building of a moderately prosperous society in all respects, and national GDP has increased year by year. The people's pursuit of a better life is no longer limited to material satisfaction; rapid economic development leads people to pursue spiritual enjoyment, and China has entered a new era of mass tourism. The widespread construction of network infrastructure in China has had a significant impact on the traditional offline tourism industry, and multi-level, diversified tourism demand and consumer market patterns have formed. Online Travel Agencies (OTA) have developed rapidly, and tourism consumption has become a rigid demand of the public. According to the 47th Statistical Report on China's Internet Development released by the China Internet Network Information Center, as of December 2020, the number of online travel booking users in China had reached 342 million, accounting for 34.6% of the total Internet users [1].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


A. E. Hassanien et al. (Eds.): BIIT 2021, LNDECT 107, pp. 511–523, 2022.
https://doi.org/10.1007/978-3-030-92632-8_48

According to the report on China’s online tourism industry in 2020, the tourism market
presents a pattern of “one super, multiple powers” for the OTA platform. The mainstream
online tourism booking platforms Meituan, Qunar tourism, Ctrip Travel, Tongcheng
Yilong, etc., are highly competitive. Among them, Ctrip Trave occupies 40.7% of the
market share, far ahead of other giants. However, due to the rapid development of the
network, the homogenization of products is serious. The competition among industries
has become increasingly fierce, and the amount of customer churn has increased sharply.
Therefore, to maintain the industry’s leading position, it is important to study customer
churn for Ctrip booking.
According to customer life cycle theory, customer churn is inevitable. Research has found that, on the one hand, a company's customer churn generates opportunity costs due to reduced sales, and on the other hand, it reduces the ability to attract new customers. For enterprises, the cost of developing a new customer is about 300 to 600 dollars, which is 5 to 6 times the cost of retaining an old customer [2]. Still, enterprises can extend the customer life cycle to the greatest extent through reasonable marketing strategies, so effectively retaining original customers has become a primary concern of enterprises.
This paper uses the hotel reservation data released by the OTA giant “Ctrip Travel”
as the basic data set to study the customer churn prediction of Ctrip hotel reservations.
Referring to the current value and potential value of customer lifetime value in the
customer value system to determine various variables to construct the characteristic
dimensions of Ctrip’s scheduled customer churn prediction, determine the prediction
model of customer churn based on random forest algorithm, and finally determine the
influencing factors of customer churn. The model can be used to analyze the factors
of Ctrip’s customer churn. For Ctrip, it is helpful to retain customers, reduce customer
churn and save costs.

2 Related Research

2.1 Research on Customer Value

Customer value has become the center and focus of customer relationship management.
The “Two-Eight Law” indicates that 80% of the profits of enterprises come from 20%
of important customers. Identifying customer value has become the research focus of
enterprises with customer competitiveness at this stage. Verhoef et al. (2001) [3] proposed
for the first time that customer value is composed of two parts: current customer value
and potential customer value. Current value refers to the value created by customers
for the enterprise based on existing cooperation, and potential value refers to the total
profit contributions that customers can bring to the enterprise in the future. Liu Xiao
[4] used K-means and rough domain sets to classify aviation customers’ value and
established aviation customers’ value evaluation index system from the dual perspectives
of customers’ current and potential value. Tian Bo built a prediction model for Ctrip
customer churn based on the XGBoost algorithm and constructed a customer lifetime
evaluation system using current value and potential value [5].

2.2 Research on Customer Churn


Many scholars have been widely concerned about the research on customer churn at home
and abroad since its rise in the 1990s. At present, research on customer churn mainly
focuses on telecommunications, finance, and other industries, and e-commerce, as an
emerging industry, has also been favored by most scholars. Amin A. et al. (2019) [6] pro-
posed a novel CCP method based on distance-factor classifier certainty estimation. Using different up-to-date evaluation methods on publicly available telecommunication industry (TCI) data sets, the data sets are divided into two categories according to the distance factor, data with higher certainty and data with lower certainty, which are used to predict customers who exhibit churn and non-churn behavior. Kwon H.
et al. (2021) [7] regard users who received a refund before payment and users who received a refund within seven days after the trial period as lost users. Topic modeling is applied and the text messages are vectorized to explain the contribution of each variable to the model prediction; analyzing these contributions helps identify the signs of user churn in advance.
Unlike other areas, in the e-commerce market, customers do not sign any contract
with merchants. For customers, they can withdraw from the merchant anytime and
anywhere. Alexandros D. et al. (2020) [8] introduced a prototype algorithm, which
regularly uses past purchase transaction data and subscription-based business logic to
recalculate the churn probability of each customer. The results show that no matter
what kind of group they belong to, the algorithm can significantly capture the purchase
intention of repeated customers. With the vigorous development of online tourism, the
competition in the industry is becoming more and more fierce. The tertiary industry
with customers as the core pays more and more attention to customer churn. Feng Y.
et al. (2008) [9] analyzed the current situation of excessive customer churn and fierce
competition for new customers in the online tourism industry and specific strategies for
customer churn management in the tourism industry. Eunil P. et al. (2020) [10] used big
data to analyze the feedback and comments of returning customers in the online hotel
reservation industry, predict the causes of customer churn, or understand customers’
intention through dynamic analysis.
In summary, there are many research methods for the construction of a customer
churn prediction model. Considering that the random forest algorithm has good accuracy,
generalization, and robustness in model prediction and regression, fast training speed,
simple feature selection and calculation of feature weights, etc., this paper introduces the random forest algorithm to study the Ctrip customer churn prediction model. Firstly, the features of the customer churn prediction model are selected based on the customer value system, and the indicators of influencing factors are determined. The random forest algorithm is then used to construct the prediction model of Ctrip customer churn, and evaluation indexes are used to assess the model. Analyzing the influencing factors of customer churn helps Ctrip retain customers and extend the customer life cycle.

3 Ctrip Customer Churn Prediction Model Construction


This paper mainly uses the customer value system to select the dimensions of customer churn characteristics and determine the influencing-factor index system, uses the random forest algorithm to build a Ctrip customer churn prediction model, uses evaluation indexes to compare and evaluate the model, and finally proposes measures to retain customers. According to the data processing sequence, the work is mainly divided into data preprocessing, model construction and evaluation, churn reason analysis, and retention measures. The research framework of this article is shown in Fig. 1.

(Fig. 1 shows the research framework as a flow chart: data preprocessing (data source; feature selection based on customer value; outlier and missing value processing), followed by model construction and evaluation based on random forest (principles of the random forest algorithm; evaluation index determination; model performance evaluation and result analysis), followed by analysis of Ctrip's customer churn reasons and retention measures.)

Fig. 1. Flow chart of Ctrip customer churn prediction based on random forest.

3.1 Data Source and Preprocessing


Data Source. As a leading comprehensive travel service company in China, “Ctrip Trav-
el” provides comprehensive travel services to more than 250 million members every day.
By analyzing user behavior data, we can explore the potential information resources in this massive volume of website visits. Therefore, the data selected in this paper comes from a "customer churn probability prediction" competition publicly held by Ctrip in 2016. The data has been desensitized, but this does not affect Ctrip's customer churn prediction analysis. The data set covers one week of Ctrip hotel visit data, from May 15, 2016, to May 21, 2016. It contains 689,945 pieces of data with a
total of 51 dimensions. It records the browsing records of each consumer on the Ctrip
website, including hotel-related characteristics, customer behavior characteristics, and
order-related characteristics. Hotel-related characteristics include current hotel char-
acteristics, current hotel historical cancellation rate, visits within 24 h, etc.; customer
behavior characteristics include preference characteristics, value characteristics, city
characteristics, etc.; order-related characteristics include check-in date, average price,
order quantity, etc. These data show to a certain extent the customer’s preference for
booking a room. The label field in the dataset is used as an output variable to indicate
whether the customer is churn. The variable value of the churned customer is 1, and the
variable value of the non-churn customer is 0. Customer churn refers to the fact that the
original customers of an enterprise stop buying enterprise goods or receiving enterprise
services and instead receive competitors’ goods or services. For the Ctrip hotel dataset
selected in this paper, online hotel reservation customer churn is also a type of general
customer churn. Online hotel reservation customer churn is a kind of customer churn
under a non-contractual relationship. Under such a relationship, it is difficult to
precisely define the relationship between enterprises and customers. Therefore, this
data set defines customer churn as interacting with online hotel web pages without
ultimately making a purchase. The data set contains 500,588 churned customers who have
browsing records but never placed an order, and 189,357 customers who have not churned.

Feature Selection Based on Customer Value. Customers with different values make
different contributions to the enterprise. This paper refers to the customer value
evaluation system proposed by Shen Ziyao et al. [11], divides customer value into
current value and potential value within a customer lifetime value framework, and
finally constructs a customer lifetime value evaluation system with 16 characteristic
dimensions, combined with the existing data indicators in the data set, as shown in
Table 1.

Table 1. Customer lifetime value evaluation system.

Primary indicators    Secondary indicators    Data indicators

Current value         Income value            Customer value in the last 1 year
                                              (customer_value_profit); customer value
                                              (ctrip_profits)

Potential value       Growth value            User conversion rate (cr); user price
                                              preference: price of the most viewed hotel
                                              within 24 h (delta_price1); user price
                                              preference: average price of hotels browsed
                                              within 24 h (delta_price2); star preference
                                              (starprefer); consumption capacity index
                                              (consuming_capacity); session ID, where
                                              sid = 1 indicates a new visit (sid)

                      Customer loyalty        Time since the last order within one year
                                              (lasthtlordergap); time since the last visit
                                              within one year (lastpvgap); annual user
                                              orders (ordernum_oneyear); login duration
                                              within 24 h (landhalfhours); order
                                              cancellation rate within one year
                                              (ordercanceledprecent); number of orders
                                              cancelled within a year (ordercanncelednum);
                                              price sensitivity index (price_sensitive);
                                              annual visits (visitnum_oneyear)

Data Outlier and Missing Value Processing. The original data set has missing values
across many dimensions; the missing-value situation of each variable is shown in Fig. 2.
Most of the missing values of the 16 influencing factors based on current value and
potential value are concentrated at about 30%. Attributes with more than 30% of their
values missing are treated as high-missing attributes. In order to reduce the impact of
filled values on the original data, the extreme value −999, which has no practical
meaning, is selected to fill in these missing values.
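A minimal sketch of this filling rule (assuming the data are loaded into a pandas
DataFrame df whose columns are the 16 indicators of Table 1; the threshold handling is
our reading of the text, not the authors' code):

import pandas as pd

def fill_missing(df: pd.DataFrame, high_missing_threshold: float = 0.30) -> pd.DataFrame:
    """Fill missing values: a meaningless sentinel (-999) for high-missing
    columns, the median for the remaining (skewed) columns."""
    df = df.copy()
    missing_ratio = df.isna().mean()
    for col in df.columns:
        if missing_ratio[col] > high_missing_threshold:
            df[col] = df[col].fillna(-999)                # high-missing attribute
        else:
            df[col] = df[col].fillna(df[col].median())    # skewed, low-missing attribute
    return df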

Fig. 2. Data missing value situation.

The three influencing factors with less than 30% missing values, namely login duration
within 24 h, time since the last visit within a year, and annual visit times, are filled
in different ways according to their distribution patterns; the distributions of these
influencing factors are shown in Fig. 3. For features with the same skewed distribution,
the missing values are filled with the median.

Fig. 3. Distribution of influencing factors.



Generally speaking, the four value-related influencing factors, the price of the most
viewed hotel within 24 h (delta_price1), the average price of hotels browsed within 24 h
(delta_price2), customer value (ctrip_profits), and customer value in the last year
(customer_value_profit), should be positive, so negative values in the data are regarded
as outliers. Outliers in the price-preference factors are replaced with the median, and
outliers in the customer-value factors are replaced with the mode.
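A sketch of this outlier treatment under the same assumptions (column names are the
Table 1 field names; restricting the median/mode to non-negative values is implied by
the text rather than stated):

import pandas as pd

def replace_negative_outliers(df: pd.DataFrame) -> pd.DataFrame:
    """Negative values in the four value-related columns are outliers: replace
    with the median for price preferences, the mode for customer value."""
    df = df.copy()
    for col in ["delta_price1", "delta_price2"]:
        median = df.loc[df[col] >= 0, col].median()
        df.loc[df[col] < 0, col] = median
    for col in ["customer_value_profit", "ctrip_profits"]:
        mode = df.loc[df[col] >= 0, col].mode().iloc[0]
        df.loc[df[col] < 0, col] = mode
    return df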

3.2 Construction and Evaluation of Ctrip Customer Churn Prediction Model Based on Random Forest
Principle of Random Forest Algorithm. Random forest algorithm is a machine learn-
ing algorithm based on a decision tree for classification and regression, which was
proposed by Leo Breiman [12] in 2001. Its essence is randomly sampling the rows and
columns in the sample set and using the decision tree to generate many classification
trees. Finally, the prediction results of the classification trees are averaged, or the major-
ity voting principle is used to obtain the decision results, called a random forest. The
classification model formula of random forest is as follows [13]:
$H(X) = \arg\max_{Y} \sum_{i=1}^{n} I(h_i(X) = Y)$ (1)

Among them, $H(X)$ represents the final result of the model, $I(\cdot)$ is the indicator
function, $n$ is the number of decision trees in the forest, $h_i(X)$ is each
classification tree, and $Y$ is the output variable. As a nonlinear modeling tool, the
random forest can avoid overfitting. When processing Ctrip's customer churn data,
establishing the prediction model with the random forest algorithm requires no
preprocessing such as normalization or standardization of the sample data, and the
weights of the characteristic factors can be obtained directly to determine their
importance to customer churn.
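As an illustration, a random forest churn model of this kind can be sketched with
scikit-learn; the synthetic data below merely stand in for the desensitized Ctrip
features, and the hyperparameters are illustrative rather than the authors':

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 16 features (as in Table 1) and a binary churn label.
rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 16))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=10_000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

rfc = RandomForestClassifier(n_estimators=100, random_state=42)
rfc.fit(X_train, y_train)
print("test accuracy:", rfc.score(X_test, y_test))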

Determination of Performance Evaluation Indicators for Ctrip’s Customer Churn Prediction
Model. For binary classification problems, the confusion matrix is usually used to
evaluate the model prediction results. The confusion matrix [14] is shown in Table 2:

Table 2. Confusion matrix.

Real situation    Predicted 1    Predicted 0
1                 TP             FN
0                 FP             TN

Where TP means true positive, that is, the actual value is 1, and the predicted result
is also 1; FP means false positive, that is, the actual value is 0, and the predicted result
is 1; TN is true negative, that is, the actual value is 0, and the predicted result is also 0;
FN means false negative, that is, the actual value is 1, and the prediction result is 0. Five

indicators are used to evaluate the model based on these four values: accuracy, precision,
recall, F1-score, and AUC.
Accuracy is the proportion of correctly predicted samples among all samples, see formula
(2). Precision measures how many of the customers predicted to churn have actually
churned, see formula (3). Recall measures how many of the churned customers are
successfully predicted as churned, see formula (4). The F1-score is a comprehensive
index of precision and recall, see formula (5); the higher the F1-score, the higher
precision and recall are together. AUC measures the quality of a prediction model; it is
determined by the area between the ROC curve and the coordinate axis, lies between 0 and
1, and the larger the value, the better the performance.

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (2)

$\text{Precision} = \frac{TP}{TP + FP}$ (3)

$\text{Recall} = \frac{TP}{TP + FN}$ (4)

$\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$ (5)

The ROC curve and AUC reflect the global performance of the prediction model. The ROC
curve plots the false positive rate (FPR) on the horizontal axis against the true
positive rate (TPR) on the vertical axis. The true positive rate is the percentage of
actually churned customers who are predicted to churn, equivalent to the recall rate, as
shown in formula (6). The false positive rate is the share of customers who have not
actually churned but are misjudged as churned among all actual non-churn customers, see
formula (7).

$TPR = \frac{TP}{TP + FN}$ (6)

$FPR = \frac{FP}{TN + FP}$ (7)

The more the ROC curve deviates to the upper left corner, the better the performance
of the model, the greater the true positive rate and the smaller the false positive rate. The
AUC value is the area below the ROC curve. The larger the AUC area, the better the
model performance of the prediction classifier.
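These indicators follow directly from the confusion-matrix counts; a minimal sketch of
formulas (2)-(7) (the counts below are illustrative, not the paper's results):

def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, recall, F1 and FPR from the confusion matrix."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)            # formula (2)
    precision = tp / (tp + fp)                            # formula (3)
    recall = tp / (tp + fn)                               # formula (4), also TPR (6)
    f1 = 2 * precision * recall / (precision + recall)    # formula (5)
    fpr = fp / (tn + fp)                                  # formula (7)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "fpr": fpr}

print(classification_metrics(tp=80, fp=10, tn=95, fn=15))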

Model Performance Evaluation and Result Analysis. In order to further verify the
accuracy of the random forest prediction model and to eliminate errors caused by feeding
inconsistent original data to the different classifiers, this paper applies four
classification prediction methods, logistic regression, naive Bayes, support vector
machine, and decision tree, to the same preprocessed Ctrip customer churn data and
compares them with the random forest model. The five index results of the model
evaluation are shown in Table 3.

Table 3. Performance indicators of five predictive classification method models.

Method Accuracy Precision Recall F1-score AUC


Logistic regression (lr) 0.73 0.55 0.42 0.32 0.63
Naive Bayes (gnb) 0.52 0.32 0.69 0.44 0.60
Support vector machine (svc) 0.59 0.33 0.48 0.39 0.57
Decision tree (dtc) 0.92 0.88 0.82 0.85 0.93
Random forest (rfc) 0.94 0.96 0.79 0.87 0.97

Fig. 4. ROC curves of five models.

It can be seen from Table 3 that, for all five indicators (accuracy, precision, recall,
F1-score, and AUC), the values of the random forest are higher than those of the other
four prediction models. The accuracy of the random forest is 94%, the precision 96%, and
the recall 79%, with the highest AUC of 0.97. Similarly, the ROC curves of the five
models in Fig. 4 show that the curve of the random forest (rfc) lies uppermost among the
five prediction models; the closer the curve is to the upper left corner, the higher the
true positive rate and the lower the false positive rate. Its model performance is the
best, with the largest area under the curve, 0.97, a high prediction accuracy for
customer churn, and a low misjudgment rate. The AUC areas of logistic regression, naive
Bayes, and support vector machine are close, all around 0.60. It can be seen that the
random forest model outperforms the other models on the binary classification problem of
customer churn and performs well in both generalization and accuracy testing.
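A sketch of how such a five-model comparison could be run with scikit-learn
(abbreviations follow Table 3; X_train, X_test, y_train, y_test are assumed to come from
a split such as the one sketched earlier, and the model settings are illustrative):

from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

models = {
    "lr": LogisticRegression(max_iter=1000),
    "gnb": GaussianNB(),
    "svc": SVC(probability=True),      # probability=True enables AUC scoring
    "dtc": DecisionTreeClassifier(),
    "rfc": RandomForestClassifier(n_estimators=100),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]
    print(name,
          round(accuracy_score(y_test, y_pred), 2),
          round(precision_score(y_test, y_pred), 2),
          round(recall_score(y_test, y_pred), 2),
          round(f1_score(y_test, y_pred), 2),
          round(roc_auc_score(y_test, y_score), 2))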

4 Ctrip Customer Churn Reason Analysis and Retention Countermeasures
It is known from Table 3 that the prediction accuracy of the random forest is the
highest, so the random forest based Ctrip customer churn prediction model performs best.
This paper mainly studies the main reasons for the churn of Ctrip's booking customers,
so the weight of each factor generated by the random forest algorithm is used to analyze
the reasons for churn. The churn factor weights of the optimal random forest prediction
model are shown in Table 4.

Table 4. Loss factor weight values of the optimal random forest prediction model.

Serial number Churn factor Weights


1 customer_value_profit 0.133476
2 ctrip_profits 0.124215
3 visitnum_oneyear 0.083267
4 cr 0.077732
5 lastpvgap 0.070912
6 sid 0.067575
7 lasthtlordergap 0.048166
8 delta_price2 0.044261
9 delta_price1 0.043803
10 consuming_capacity 0.043525
11 starprefer 0.042588
12 ordercanncelednum 0.042262
13 ordercanceledprecent 0.017746
14 landhalfhours 0.016944
15 price_sensitive 0.007949
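Weights of this kind correspond to the feature importances of a fitted random forest; a
sketch continuing the earlier example (rfc is the fitted classifier from that sketch,
and the names are the Table 1 field names):

import pandas as pd

feature_names = [
    "customer_value_profit", "ctrip_profits", "visitnum_oneyear", "cr",
    "lastpvgap", "sid", "lasthtlordergap", "delta_price2", "delta_price1",
    "consuming_capacity", "starprefer", "ordercanncelednum",
    "ordercanceledprecent", "landhalfhours", "price_sensitive",
    "ordernum_oneyear",
]
weights = pd.Series(rfc.feature_importances_, index=feature_names)
print(weights.sort_values(ascending=False))   # churn factors ranked by weight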

According to Table 4, the random forest trained on all factors of the value system
eliminates the annual number of user orders, a customer loyalty indicator within the
potential value, from the churn factors of Ctrip bookings. The online hotel reservation
industry is seriously homogenized. Compared with users' annual visits under the same
customer loyalty index, users' annual orders have little impact on loyalty in the online
hotel reservation industry. For offline physical stores, users' annual orders do reflect
loyalty: only when customers are satisfied with the merchant will the number of annual
orders increase. Online, in an Internet era where traffic is king, satisfaction with a
merchant is reflected in the number of visits. Customers with many visits but few orders
are still high-value customers who maintain stickiness and loyalty to the merchant. On
the contrary, customers with few visits but many orders may only purchase because of the
promotions a merchant is running at a given moment, without making reservations the rest
of the time; such customers are more likely to be lost due to external factors.
Therefore, the annual number of user orders is eliminated from the churn factors of
Ctrip's booking customers. In addition, all remaining influencing factors in current
value and potential value are directly proportional to the churn of Ctrip's booking
customers.
Analyzing the experimental results, the factor with the greatest impact on the churn of
Ctrip's booking customers is the customer's value in the past year, with a weight of
0.133476, followed by customer value with a weight of 0.124215; the third is the number
of annual visits, with a weight of 0.083267. For Ctrip, the loss of a customer is most
related to the customer's value and number of annual visits. For high-value customers,
Ctrip has developed a membership system, which increases the cost of switching and makes
it easier to retain them. Customers with a high number of annual visits are more loyal
to and satisfied with Ctrip than with other companies, and such customers are not easily
lost. Therefore, for Ctrip, the most important factors affecting customer churn are the
customer's value in the past year, customer value, and the number of annual visits. The
weights of the experimental results show that the current value of customers has the
biggest impact on the churn of Ctrip's booking customers, indicating that the most
important lever is to improve customers' current value, keeping customers in a
high-value stage and avoiding churn. On the contrary, the three factors with the
smallest impact on customer churn are the user order cancellation rate within one year,
the login duration within 24 h, and the price sensitivity index, with weights of
0.017746, 0.016944, and 0.007949 respectively. The price sensitivity index has the
smallest weight of all: for the Ctrip reservation system, customers pay more attention
to hotel-related factors such as check-in experience, environment, sanitation, and
transportation, and less attention to price. Moreover, most customers who book rooms are
business or tourism customers and do not pay much attention to price.
According to the customer value system, the factors affecting the churn of Ctrip's
booking customers are the revenue value within the customer's current value, the growth
value within the potential value, and the customer loyalty within the potential value.
Given this analysis of churn factors, to retain Ctrip customers, the enterprise should
pay attention to customers and competitors, understand customers' current value and
competitors' products, understand customer needs, and develop more satisfactory services
and rooms to improve its overall competitiveness. The greater a customer's contribution
to the enterprise, the higher the customer's value. Ctrip should attach more importance
to high-value customers, face up to customer value, provide personalized services to
high-value customers, and increase their orders. The website is the most crucial
storefront of an online enterprise: optimizing its design, increasing Ctrip's own
promotion, and designing different website links and navigation paths for customers of
all ages can increase user traffic and improve customer satisfaction and loyalty.

5 Conclusion
In the online travel market, where competition continues to intensify, competing for
customers is unavoidable. Given the massive cost of attracting a new customer, most
enterprises have begun implementing effective customer management for their existing
customers, and this paper provides a new way of thinking about customer management in
the tourism industry. This paper uses the current value and potential value in the
customer value system to construct an index system of Ctrip's customer churn factors,
uses the random forest algorithm to construct a Ctrip customer churn prediction model,
objectively obtains the weights of the factors that affect Ctrip's bookings, and
provides Ctrip with churn cause analysis and measures to retain lost customers, in the
hope that Ctrip can extend the customer life cycle and obtain greater profits. The
research results show that: (1) Evaluating logistic regression, naive Bayes, support
vector machine, decision tree, and random forest through the confusion matrix and ROC
curve shows that the accuracy and AUC value of the random forest algorithm are the
largest, proving that the random forest model is more usable for classification and
prediction problems. (2) Using the random forest algorithm to analyze the influencing
factors of customer churn effectively overcomes the interference of subjective weight
setting and of complex linear relationships between factors in traditional customer
churn research, giving it advantages over traditional classical methods in the
feasibility and accuracy of prediction. (3) The analysis of the influencing factors
shows that the largest weight belongs to the revenue value within the customer's current
value, indicating that the most significant factor affecting Ctrip's customer loss is
revenue value. Based on these results, effective measures are provided for Ctrip to
retain customers and obtain greater profits.

Acknowledgment. The Heilongjiang Provincial Science Fund Project supported this work (No.
LH2019F044).

References
1. China Internet Network Information Center.: The 47th Statistical Report on China’s Internet
Development, vol. 2 (2020)
2. Xiao, J., He, X., Teng, G., Xie, L.: Research on cost-sensitive semi-supervised integrated
model of customer churn Prediction. Syst. Eng. Theory Pract. 41(01), 188–199 (2021)
3. Donkers, V.B.: Predicting customer potential value: an application in the insurance industry.
In: Decision Support Systems (2001)
4. Liu, X., Wang, X.: Research on aviation customer value classification based on K-means and
neighborhood rough set. Oper. Res. Manage. 30(03), 104–111 (2021)
5. Tian, B., Jiang, S., Gui, B.: Research on customer churn prediction based on customer
segmentation. Commun. World 27(06), 183–184 (2020)
6. Amin, A., et al.: Customer churn prediction in telecommunication industry using data
certainty. J. Bus. Res. 94, 290–301 (2019)
7. Kwon, H., et al.: Lifelog data-based prediction model of digital health care app customer
churn: retrospective observational study. J. Med. Internet Res. 23(1), e22184–e22184 (2021)

8. Alexandros, D., Charalampos, A.: Designing a real-time data-driven customer churn risk
indicator for subscription commerce. Int. J. Inf. Eng. Electron. Bus. 12(4), 1–14 (2021)
9. Feng, Y.: Research on management methods of customer churn in tourism industry. Bus. Res.
08, 138–140 (2008)
10. Park, E., Kang, J., Choi, D., Han, J.: Understanding customers’ hotel revisiting behaviour: a
sentiment analysis of online feedback reviews. Curr. Issues Tourism 23(5), 605–611 (2020)
11. Shen, Z., Yuan, X.: Integrated energy service customer identification based on parallelized
K-means. Power Eng. Technol. 40(02), 107–113 (2021)
12. Breiman, L.: Random forests. Mach. Learn 45(1), 5–32 (2001)
13. Kang, Y., Chen, Y., Gu, S., Yao, X., Huan, Q., Tang, Y.: Evaluation of sustainable utilization
of regional water resources based on random forest. Hydropower Energy Sci. 32(03), 34–38
(2014)
14. Zhang, R., Zhang, Y., Zhao, H., Ding, Z.: Quantitative comparison of the effects of bank
credit evaluation models. Financ. Supervision Res. 01, 66–85 (2021)
Business Intelligence
and Communications
An Evolutionary Game Analysis of Product
Crowdfunding Opportunistic Behavior
Considering Price Acquisition Model

Guang Yang1(B) , Yan Wen2 , and Kaiwen He3


1 Institute of Business Economics, Harbin Business University, Harbin, Heilongjiang, China
2 Faculty of Economics, Harbin Business University, Harbin, Heilongjiang, China
3 Northeast Asia Service Outsourcing Research Center, Faculty of Finance, Harbin Business

University, Harbin, Heilongjiang, China

Abstract. With the development of business intelligence and information tech-


nology, the advantages of crowdfunding mode become more and more distinct
among the financing methods of Internet finance. The opportunistic behavior of
sponsors in the product crowdfunding model is widespread. In order to realize the
benign development of the industry, its regulation and supervision should be paid
attention to. So, considering the value acquisition mode of crowdfunding platform,
the evolutionary game model of sponsor groups and platform groups is established,
and local equilibrium analysis and simulation test are carried out. It is found that
the way to solve the problem is to make the effect of opportunism behavior less
than that of self-discipline behavior, and negative incentive constraints have a weak
impact on the initiators of opportunism behavior. At the same time, the impor-
tant factors influencing the initiators’ decision-making of self-discipline behavior
and the platform’s active supervision decision-making are obtained. Finally, some
feasible suggestions are given for the mechanical design of the platform.

Keywords: Product crowdfunding · Opportunistic behavior · Evolutionary


game · Value acquisition model

1 Introduction
In the context of unstable development and the poor survival rate of small and medium-
sized enterprises, the problem of “difficult and expensive financing” has been widely
concerned. In particular, COVID-19 has once again caused a major impact on them.
At this time, the advantage of crowdfunding model becomes more and more distinct
among the financing methods of Internet finance. Product crowdfunding has a high
participation rate, which is a friendly boost for the development of small and medium-
sized enterprises. The reason is that product crowdfunding adopts the form of “pre-sale
+ group purchase”.
On the one hand, it can obtain start-up funds and promotion, and on the other hand,
it can carry out reasonable planning and coordination for the production of product
quantity. However, there is a serious information asymmetry between the platform,


sponsors, and investors. With the support of third-party agent operating companies,
sponsors exhibit prominent opportunistic behaviors, such as false publicity inconsistent
with the actual product, exaggerated product quality inconsistent with the price, hiding
negative information, and other moral hazard problems. Therefore, the opportunistic
behavior of product crowdfunding sponsors is taken as a key object of academic research.
However, the existing research still has some gaps. First, the practical feasibility of
decision-theoretic analysis from a single perspective remains to be discussed. The
research process should consider the mutual influence of each subject's interests and
decision-making behavior, that is, the value acquisition mode and the dynamic evolution
of behavioral decisions. The value acquisition model is designed to answer three
questions: "whom to charge", "what to charge for", and "how much to charge" [1]. At the
same time, the complete-information assumption of the static game model with complete
information is inconsistent with actual conditions: the decisions of the platform group
and the initiator group form a dynamic evolutionary process, which should be analyzed
with an evolutionary game model. Second, positive incentives are difficult for platforms
to implement in practice: under current rules, resources are allocated through a
horse-race mechanism that has become an established platform practice and can basically
achieve fairness, this cannot be changed in a short period, and evaluation standards for
self-disciplined behavior are hard to determine. Accordingly, most platforms rely on
temporary deposits assessed against later project quality to compensate investors'
complaints, and so on. Therefore, the negative incentive of punishment deserves
discussion and strengthening. Third, most of the relevant literature using evolutionary
game models performs global stability analysis, and the equilibrium points obtained are
no more than (0,0) and (1,1). Considering local equilibria, that is, stability analysis
with added constraints, can yield more effective conclusions.
Therefore, this paper focuses on the dynamic evolution of the initiator's choice between
self-disciplined and opportunistic behavior and the platform's choice between active and
negative supervision in product crowdfunding. Considering the interest relationships
between crowdfunding subjects, namely the platform's value acquisition model, a
boundedly rational evolutionary game model is used for local stability analysis. Stable
equilibrium points are explored and simulated using MATLAB. Through the analysis of the
evolution process, optimization suggestions for the platform's mechanism design are
expected, to maintain the strong development momentum of product crowdfunding.

2 Evolutionary Game Model


2.1 Model Assumptions
Hypothesis 1. The launch of the product crowdfunding project involves two partici-
pating groups: the initiator group and the platform group, which do not have complete
information and are bounded rational. Each time, one unit is randomly selected from the
two groups to play the game, and the two groups can learn and accumulate information
to achieve a comparative advantage.

Hypothesis 2. The set of strategies adopted by the initiator group is {S1 self-discipline
behavior, S2 opportunism behavior}, and the set of strategies adopted by the platform
group is {P1 active regulation, P2 negative regulation}. “Self-disciplined behavior [2]”
means that the sponsors provide products that meet the standards in full accordance
with the platform crowdfunding product rules; “Opportunistic behavior” refers to the
initiator’s use of more information in his/her possession to exaggerate the quality, launch
products that are not newly developed, sell them on other platforms, hide adverse news
and other behaviors. “Active supervision” refers to the strict examination and full man-
agement of enterprise qualification, trademark and patent, product quality and page
design in the early stage. If the examination fails, the project will not be allowed to go
online and the violation will be fined. “Negative supervision” refers to the relaxation or
lowering of audit standards in the early stage, and also includes the basic supervision to
deal with investors’ rights protection in the later stage of the project to ensure the normal
operation of the platform’s reputation.

Hypothesis 3. The probability of the initiator choosing self-disciplined behavior is x,
and the probability of choosing opportunistic behavior is 1 − x; the probability of the
platform's active supervision is y, and the probability of its negative supervision is
1 − y (0 < x < 1, 0 < y < 1).

Hypothesis 4. W1 represents the successful income of the promoter of self-disciplined


behavior through platform crowdfunding, including specific monetary income such as
project funding and perceived value income such as brand promotion and customer
attraction. W2 represents the successful earnings of the sponsors of opportunistic behav-
iors through platform crowdfunding, including more trading volume brought by false
publicity and excessively high price earnings caused by exaggerated quality. C1 repre-
sents the cost of the initiator of self-regulation. C2 represents the cost of the initiator
of an opportunistic act. Because the production quality is lower than standard products
and other reasons to reduce the cost, so C1 > C2 . D1 represents the platform’s active
supervision cost, including the labor costs of management personnel, etc. D2 represents
the cost of negative supervision of the platform. D1 > D2 because it chooses to hire fewer
workers, etc. C3 represents the loss caused by opportunistic behavior to the promoter,
including the decline of the promoter’s product brand image. D3 represents the damage
that opportunistic behavior can cause to the platform, including a decline in credibility, a
decrease in user engagement, and negative perceptions of other crowdfunded products.
m represents the probability of not being approved by the platform. R represents the fees
charged by the platform to the sponsors, including platform service fee, publicity and
promotion fee, and overall operation fee, etc. Since the basic premise of the sponsor’s
participation in the project is to make a profit, R < W1 R < W2 . n represents the prob-
ability of the opportunistic promoter being complained, including all investors’ normal
rights protection actions. C4 is used to pay compensation on behalf of the alleged oppor-
tunist promoter, including cash compensation, loss of reshipment, etc. F is for actively
monitoring penalties for opportunistic behavior, including fines.

2.2 Model Building


The game's payoff matrix is constructed from the total revenues of the initiator group
and the platform group, as shown in Table 1. The value promoter in a service ecosystem
has an important influence on the evolution of value co-creation, and the platform plays
the role of value promoter. When there is no revenue regulator in the system, the
platform will adopt a revenue distribution mode conducive to its own interests [4]. In
product crowdfunding, due to the network externality of the two-sided market, most
platforms set the value acquisition mode as charging the sponsors while remaining free
for investors. So, considering the platform's value acquisition mode, the model links
the platform's benefits to sponsor income: when the initiator chooses opportunistic
behavior and the platform actively supervises, the initiator fails the platform's review
with probability m, in which case the initiator not only loses mW2 but the platform also
loses mR in profit, because the project cannot go online. It should be explained that
"failing the platform's review" and "an opportunistic sponsor being complained about"
are independent of each other; time is divided into early and late stages, with no
crossover between the two probabilities.

Table 1. Revenue matrix for sponsors and platforms.

                                            Platform: positive regulation P1             Platform: negative regulation P2
Originator: self-discipline behavior S1     (W1 − C1 − R, R − D1)                        (W1 − C1 − R, R − D2)
Originator: opportunistic behavior S2       ((1 − m)W2 − C2 − C3 − (1 − m)R − nC4        (W2 − C2 − C3 − R − nC4,
                                            − (m + n)F, (1 − m)R − D1 − D3 + (m + n)F)    R − D2 − D3)

3 Evolutionary Game Analysis


According to the properties of the Malthusian equation, the replicator dynamics of the
evolutionary game can be expressed by the following system of equations. The replicator
dynamic equation has a stability theorem: a point with $F(x) = dx/dt = 0$ and
$F'(x) < 0$ is an evolutionarily stable strategy [3].

$\dot{x} = x(1 - x)\big[(mW_2 - mR + mF + nF)y + W_1 - C_1 - W_2 + C_2 + C_3 + nC_4\big]$

$\dot{y} = y(1 - y)\big[(mR - mF - nF)x - mR - D_1 + D_2 + (m + n)F\big]$

Set

$y_0 = \frac{-W_1 + C_1 + W_2 - C_2 - C_3 - nC_4}{m(W_2 - R) + (m + n)F}$

$x_0 = \frac{mR + D_1 - D_2 - (m + n)F}{mR - (m + n)F} = 1 + \frac{D_1 - D_2}{mR - (m + n)F}$

Table 2. Results of local stability analysis.



According to the above analysis, the five local equilibrium points of the system's
evolution are (0, 0), (0, 1), (1, 0), (1, 1), and (x0, y0), with 0 < x0 < 1 and
0 < y0 < 1. The stability of a local equilibrium point can be judged from the signs of
the determinant and trace of the Jacobi matrix of the dynamic system: when detA > 0 and
trA < 0, the point is stable and yields the evolutionarily stable strategy (ESS). The
Jacobi matrix A of the system is as follows:
 
$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$

A11 = (1 − 2x)[(mW2 − mR + mF + nF)y + W1 − C1 − W2 + C2 + C3 + nC4 ]

A12 = x(1 − x)(mW2 − mR + mF + nF)

A21 = y(1 − y)(mR − mF − nF)

A22 = (1 − 2y)[(mR − mF − nF)x − mR − D1 + D2 + (m + n)F]

DetA = A11 ∗ A22 − A12 ∗ A21

TrA = A11 + A22
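As an illustration of this sign check (a sketch in Python rather than the authors'
derivation; the parameter values are those quoted for case 2 in Sect. 4):

import numpy as np

W1, W2, C1, C2, C3, C4 = 8, 10, 6, 5, 2, 3
D1, D2, m, R, n, F = 6, 5, 0.4, 2, 0.5, 0.5

def jacobian(x, y):
    """Jacobi matrix of the replicator dynamics at (x, y)."""
    a = m*W2 - m*R + m*F + n*F                  # y-coefficient in the x-equation
    b = W1 - C1 - W2 + C2 + C3 + n*C4
    c = m*R - m*F - n*F                         # x-coefficient in the y-equation
    d = -m*R - D1 + D2 + (m + n)*F
    return np.array([[(1 - 2*x) * (a*y + b), x * (1 - x) * a],
                     [y * (1 - y) * c, (1 - 2*y) * (c*x + d)]])

for point in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    J = jacobian(*point)
    det, tr = np.linalg.det(J), np.trace(J)
    print(point, f"det={det:.2f}", f"tr={tr:.2f}",
          "ESS" if det > 0 and tr < 0 else "not ESS")   # (1, 0) is the ESS here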

Based on the replication dynamic analysis of the initiator’s group behavior decision
and the platform group supervision decision above, the local equilibrium analysis is
discussed in the following 9 cases.
It can be clearly seen from Table 2 above that the local equilibrium point is (1, 0) in
case 1, 2 and 3; (0,0) in case 4, 5, 7 and 8; (0,1) in case 6; and there is no local equilibrium
point in case 9. The following conclusions can be drawn:

Conclusion 1. The condition for the initiator group to evolve into self-disciplined behav-
ior is that the utility of opportunistic behavior is less than that of self-disciplined behavior,
that is, W2 − C2 − C3 − nC4 < W1 − C1 . The evolution process can be accelerated by
increasing W1 , decreasing W2 , and increasing n.

Since the local equilibrium point in cases 1, 2 and 3 is (1, 0), the initiator group
stably chooses self-disciplined behavior, and W2 − C2 − C3 − nC4 < W1 − C1. Here, the
cost C1 of self-disciplined initiators, the cost C2 of opportunistic initiators, the
loss C3 caused by opportunistic behavior to the initiators, and the compensation cost C4
of complained opportunistic initiators will not change in a short time. To guide
initiators toward self-disciplined behavior, the mechanism design should therefore
increase the revenue W1 that self-disciplined sponsors obtain through platform
crowdfunding, reduce the revenue W2 that opportunistic sponsors obtain, and raise the
probability n that opportunistic sponsors are complained about.

Conclusion 2. The conditions under which the platform finally evolves to active
regulation as its equilibrium are as follows: the sponsor group stably chooses
opportunistic behavior; the platform's expected service-fee revenue at stake is less
than the expected fine, mR < (m + n)F; and the supervision cost difference is less than
the difference between the expected fine and the expected service fee,
D1 − D2 < (m + n)F − mR. That is, (1 − m)W2 − C2 − C3 + mR − nC4 − (m + n)F > W1 − C1;
mR − (m + n)F < 0; D1 − D2 < (m + n)F − mR. The evolution process can be accelerated by
increasing D2, increasing F, decreasing R, and increasing n.

Of the 9 cases in the local equilibrium analysis, only in case 6 does the platform
evolve active regulation into a stable strategy, which shows how difficult it is for the
platform to adopt a stable active regulation strategy. The supervision cost is one
reason, and the platform's value acquisition mode is another. For the platform to choose
active supervision, D1 − D2 < (m + n)F − mR must hold. Apart from the probability m of
failing the platform's review and the platform's active supervision cost D1, which will
not change in a short time, the mechanism design can increase the platform's negative
supervision cost D2, increase the punishment F that active supervision imposes on
opportunism, reduce the fees R charged by the platform to the sponsors, and raise the
probability n of sponsors being complained about.

Conclusion 3. The measures "increase W1, decrease W2, increase D2, increase F, decrease
R, and increase n" mentioned in Conclusions 1 and 2 can also be adopted to counteract
the sponsors' opportunistic behavior and the platform's negative supervision by slowing
down the evolution process.

Even though in cases 4 to 8 the promoter group takes opportunistic behavior as its
evolutionarily stable strategy, and in cases 1 to 5 and 7 to 8 the platform group takes
negative regulation as its evolutionarily stable strategy, the evolution speed can still
be adjusted so that the drift of sponsors toward opportunistic behavior and of the
platform toward negative regulation slows down and the time to convergence lengthens. To
achieve the long-term sustainable development of the crowdfunding industry, the
mechanism design must be revised within the time thus gained, finally guiding the
sponsors to stably choose self-disciplined behavior. Therefore, based on the conditions
of the local equilibrium analysis, the strategies for slowing the evolution of the
initiator group toward opportunistic behavior and of the platform toward passive
supervision can be drawn comprehensively from Conclusions 1 and 2.

4 Evolutionary Game Simulation Analysis


Evolutionary game theory developed from biological population competition [4], so
population-competition simulation can be borrowed: MATLAB is used for data simulation to
verify the theoretical derivation above. Conclusions are drawn by comparing results as
data values change. The control variable method is adopted, adjusting only the value of
a single variable at a time, which intuitively shows the acceleration of the process
toward the evolutionarily stable strategy.
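The paper's simulations are in MATLAB; an equivalent sketch in Python, integrating the
replicator equations above with the case 2 parameter values quoted below:

import numpy as np
from scipy.integrate import solve_ivp

W1, W2, C1, C2, C3, C4 = 8, 10, 6, 5, 2, 3      # case 2 values
D1, D2, m, R, n, F = 6, 5, 0.4, 2, 0.5, 0.5

def replicator(t, z):
    x, y = z
    dx = x * (1 - x) * ((m*W2 - m*R + m*F + n*F) * y
                        + W1 - C1 - W2 + C2 + C3 + n*C4)
    dy = y * (1 - y) * ((m*R - m*F - n*F) * x
                        - m*R - D1 + D2 + (m + n)*F)
    return [dx, dy]

sol = solve_ivp(replicator, (0, 50), [0.5, 0.5])
print("final (x, y):", sol.y[:, -1])    # approaches (1, 0), the case 2 ESS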
Verification of Conclusion 1: To observe the speed of the sponsor group's evolution
toward self-disciplined behavior, cases 1, 2 and 3 all meet the research conditions.
Here, case 2 is selected for the control experiment, with W1 = 8, W2 = 10, C1 = 6,
C2 = 5, C3 = 2, C4 = 3, D1 = 6, D2 = 5, m = 0.4, R = 2, n = 0.5, F = 0.5. With all other
conditions unchanged, only one value is modified at a time for comparison: increasing W1
to 10 gives Fig. 1(b); decreasing W2 to 9 gives Fig. 1(c); increasing n to 0.8 gives
Fig. 1(d), as shown in Fig. 1 below. Figures 1(b), (c) and (d) all show an accelerated
evolution process; that is, increasing W1, decreasing W2 and increasing n make the
initiator group reach the stable self-disciplined evolution strategy earlier.

Fig. 1. The dynamic evolution process of sponsor group with time in case 2

Verification of Conclusion 2: To observe the speed of the platform group's evolution
toward active supervision, case 6 meets the research conditions, with W1 = 8, W2 = 15,
C1 = 6, C2 = 5, C3 = 2, C4 = 3, D1 = 6, D2 = 5, m = 0.1, R = 2, n = 0.5, F = 3. With all
other conditions unchanged, only one value is modified at a time for comparison:
increasing D2 to 5.5 gives Fig. 2(b); increasing F to 3.5 gives Fig. 2(c); decreasing R
to 1 gives Fig. 2(d); increasing n to 0.8 gives Fig. 2(e), as shown in Fig. 2 below.
Figures 2(b), (c), (d) and (e) all show an accelerated evolution process; that is,
increasing D2, increasing F, decreasing R and increasing n make the platform group reach
the stable active supervision evolution strategy earlier.

Fig. 2. The dynamic evolution process of the platform group with time in case 6

Verification of Conclusion 3: To simplify the proof concerning the sponsors'
opportunistic behavior and the platform's negative regulation, we examine the situation
in which the initiator's locally stable strategy is opportunistic behavior and the
platform adopts negative regulation, so that the dynamic evolution of initiators and
platforms can be observed simultaneously; cases 4, 5, 7 and 8 all meet this research
condition. Here, case 4 is selected for the control test, with W1 = 8, W2 = 15, C1 = 6,
C2 = 5, C3 = 2, C4 = 3, D1 = 6, D2 = 5, m = 0.1, R = 2, n = 0.5, F = 1. With all other
conditions unchanged, only one value is modified at a time for comparison: increasing W1
to 10 gives Fig. 3(b); decreasing W2 to 13 gives Fig. 3(c); increasing D2 to 5.5 gives
Fig. 3(d) (the maximum value of t on the horizontal axis in Fig. 3(d) is 50; in the
other panels it is 20); increasing F to 1.5 gives Fig. 3(e); decreasing R to 1 gives
Fig. 3(f); increasing n to 0.8 gives Fig. 3(g), as shown in Fig. 3 below. Figures
3(b)-(g) show that both the sponsor group and the platform group slow their evolution,
with the slowdown of the platform group relatively obvious and that of the initiator
group small. That is, increasing W1, decreasing W2, increasing D2, increasing F,
decreasing R and raising n delay the initiator group's arrival at the stable
opportunistic behavior strategy and delay the platform group's arrival at the stable
passive supervision strategy, with a larger effect on the platform group and a smaller
effect on the initiator group.

Fig. 3. Dynamic evolution of the two populations with time in case 4

5 Conclusion
First, the only way to solve the initiator’s opportunistic behavior is to make the utility of
opportunistic behavior smaller than that of self-disciplined behavior. Effective informa-
tion integration and high sharing among partners are the key factors for the success of
the alliance [5]. Therefore, the mechanism design of contemporary crowdfunding mode
should focus on how to improve the effectiveness of self-disciplined behavior and how to
reduce the effectiveness of opportunistic behavior, rather than reward and punishment.

The ways to improve the effectiveness of self-disciplined behavior are as follows. On
the one hand, increase the income that self-disciplined initiators obtain from
successful platform crowdfunding, which includes not only cash but also brand promotion,
customer acquisition, and so on. In the context of network externalities, measuring the
value of leading customers is the key to forming a recommendation network [6]; for
example, JD crowdfunding advocates building a closed loop from resources, ecology, and
business services to investment, turning the crowdfunding platform into a real ecosystem
for entrepreneurship incubation [7]. On the other hand, the main ways to reduce the
effectiveness of opportunistic behavior are to improve the moral level of sponsors
through government guidance, to reduce the perceived value of the benefits that
opportunistic behavior brings, or to simplify investors' rights-protection process, with
the platform actively reaching out to investors.
The platform's active supervision plays a key role in the healthy development of the
crowdfunding industry. First, the government could impose fines on, and halt the
operations of, negatively supervising platforms. Second, the platform's penalty revenue
could be expanded. Finally, the fees charged by the platform to the sponsors should be
reduced. Considering the current value acquisition mode of crowdfunding platforms,
platform revenue can be shifted toward helping incubated brands enter online shopping
malls, offline stores, etc., so that benefits are obtained through other channels. The
platform business should adopt an overall perspective to maximize comprehensive revenue.

Acknowledgment. This research was supported by the 2020 Ideological and Political Work
Research Project of Harbin University of Commerce (2020SZY002) and Humanities and Social
Science Research Planning Foundation of Ministry of Education of China (21YJAZH099).

References
1. Chen, Y.L.: Research on Business Model of Platform Enterprises in Two-Sided Market. Wuhan
University, Wuhan (2014)
2. Wang, W.Y., Zhang, N.: An evolutionary game analysis of the opportunistic behavior of P2P
online lending platform’s illegal operation and government supervision. Enterp. Econ. 37(10),
163–172 (2018)
3. Chen, J.H., Wang, H., Zhang, Y.Q.: An evolutionary game analysis of stakeholder value co-
creation in service ecosystem environment. Oper. Res. Manage. Res. 28(11), 44–53 (2019)
4. Smith, J.M.: Evolution and The Theory of Games. Cambridge University Press, Cambridge
(1982)
5. Zhang, S., Hu, X.: Game analysis on logistics cloud service discovery and combination. Int. J.
Serv. Sci. Technol. 8(10), 193–202 (2015)
6. Zhao, J., Li, Y., Ding, Y., Liu, C.: The value of leading customers in a crowdfunding-based
marketing pattern. PLoS ONE 14(4), 1–18 (2019)
7. Yuan, Y., Guo, H.H.: Building innovative platform to incubate trend design – dialogue with
Gao Zheng, head of JD crowdfunding. Decoration 01, 31–36 (2017)
Research on the Model of Word-of-Mouth
Communication in Social Networks Based
on Dynamic Simulation

Zhipeng Fan1,2(B) , Wen Hu1,2 , Wei Liu1 , and Ming Chen1,2


1 Harbin University of Commerce, Harbin 150028, China
fanzhipeng@hrbcu.edu.cn
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. Word-of-mouth communication plays an important role in consumers’


purchasing decisions. With the rapid development of the Internet, social networks
have become an important way for word-of-mouth communication. This study
draws on the dynamic model of the spread of infectious diseases and establishes
a model of word-of-mouth communication in social networks. By analyzing the
characteristics of word-of-mouth communication in social networks, the process
of withdrawing from the communicator group and becoming a communicator
again is introduced. This article uses a dynamic simulation method to simulate
and analyze the process of word-of-mouth communication in social networks. The
experimental results show that the withdrawal of the communicator group into a
communicator again can effectively increase the time of word-of-mouth commu-
nication in social networks and reduce the rate of decrease in the popularity of
word-of-mouth communication. The simulation analysis of word-of-mouth com-
munication in social networks provides theoretical support for enterprises to make
product promotion decisions.

Keywords: Word-of-mouth · Dynamic simulation · SEIR

1 Introduction
In the context of the Internet era, Internet Word of Mouth has profoundly affected con-
sumers’ willingness to purchase products and purchase decisions by virtue of faster
transmission speed and wider coverage. When consumers make purchase decisions for
new and never-experienced products, they tend to actively search for product-related
information through the Internet to obtain product reviews and suggestions from expe-
rienced consumers [1]. Among various Internet communication platforms, social media
has not only become a daily communication tool for more and more consumers, but
word-of-mouth recommendations from the social circle of acquaintances are generally
more likely to be adopted by consumers because of their high credibility.
Word-of-mouth communication is the informal exchange of information about prod-
ucts, services, or brands between non-commercial consumers. Compared with traditional

https://doi.org/10.1007/978-3-030-92632-8_50
538 Z. Fan et al.

mass media communication, word-of-mouth communication among consumers has the


characteristics of strong pertinence, high credibility, and low cost of communication.
For companies, product pricing decisions and final benefits also depend to a large extent
on the effect of word-of-mouth communication [2]. Positive word-of-mouth will posi-
tively impact the company, enhance its brand value, and increase its autonomy in pricing.
Negative Internet word-of-mouth will cause great damage to corporate brands, reduce
consumer trust in corporate products, and hurt corporate sales.

2 Related Work

In the communication process, the spontaneity and uncertainty of word-of-mouth com-


munication make it impossible for companies to control the information dissemination
process, and the effect of word-of-mouth communication is usually difficult to evaluate
and predict. On the other hand, the social network between consumers as a communica-
tion channel of information plays a key role in the effect of word-of-mouth communica-
tion on social media platforms. The changes in the social network between consumers
have an important impact on the effect of word-of-mouth communication [3].
In terms of word-of-mouth communication models, most of the existing studies are
based on information dissemination models of the online word-of-mouth communication
process. Online word-of-mouth is information, and online word-of-mouth
communication on social networks is essentially information dissemination. From the
perspective of information dissemination: The paper [4] is based on the infectious dis-
ease transmission model, combined with multi-agent modeling methods, and explores
the process of negative word-of-mouth information dissemination in online communi-
ties through simulation experiments. On this basis, a negative word-of-mouth threshold
propagation model is constructed [5]. Based on the information cascade theory, the paper
[6] studied the tree-like communication structure of online community word-of-mouth
information. It proposed the main communication structure indicators such as the scale
of communication and the depth of communication. The paper [7] considered both pos-
itive and negative word-of-mouth, and proposed a SIPNS model based on the infectious
disease model, and analyzed the word-of-mouth communication process and the profit
influencing factors of word-of-mouth marketing. It can be seen that in terms of word-of-
mouth communication model research, infectious disease models such as SIR, SIS, and SI
are still the mainstream.
The existing research on online word-of-mouth communication models is mostly
based on the social relationships between users of social networks, such as the follow-be-
followed relationship in the Weibo network [8], and the friend relationship in the WeChat
network. As long as there is a friendship, it is usually assumed that the probability of
word-of-mouth communication is the same [9]. In practice, because the credibility of
word-of-mouth sources differs, the probability of spreading varies from person to
person. Given these deficiencies of existing research, this article builds on the SEIR
infectious disease model, considers the phenomenon of re-spreading in word-of-mouth
communication, and explores the influence of various factors in social networks on
online word-of-mouth communication.

3 SEIRS Propagation Model


3.1 Dynamics Simulation of Infectious Diseases

The principles of network information dissemination and disease transmission are very
similar [10]. First, both mechanisms spread through contact channels: a disease spreads
when healthy people come into contact with patients, and in an online network people who
do not know a topic become "infected" after contact with a disseminator, so that the
information spreads further [11]. Second, the traditional infectious disease model divides
the population into healthy people, infected people, and cured people. Similarly, in
a complete online information dissemination network, users can also be divided into
groups of people who do not know the information, people who know that they have not
yet chosen to disseminate, people who know and choose to disseminate behavior, and
people who are immune to the information [12]. Therefore, this article will study the
information dissemination mechanism in online social networks based on the infectious
disease model.

3.2 Build the Model

Hypothesis: The total number of people is fixed during the process of information
dissemination in the social network, and the numbers of friends and relationships on
Weibo remain unchanged. After disseminating a piece of information, a user will not
engage with the same information again. A user's level of interest in the information
does not remain constant but is affected by surrounding friends, and the trust
relationships between friends also affect how credible the user finds the information.
The main bodies of information dissemination are the disseminators and receivers of
information, each generally regarded as a node. On Weibo, every user has a certain
probability of seeing a given piece of information, and then a certain probability of
forwarding it, commenting on it, or becoming directly immune. Based on this
dissemination mechanism and users' potential behavioral responses, and comprehensively
considering the microblog information dissemination process, the nodes in the microblog
network are divided into the following five forms:

1) S-node: The initial participating group refers to the initial group that has the
ability to obtain information, can see word-of-mouth information, and may conduct
communication behaviors.
2) Node E: A group of lurkers, a group of users who have been exposed to word-of-mouth
information but have not yet made a dissemination decision.
3) Node I: Communicator group refers to the user group that has browsed word-of-mouth
information in social networks and made clear decisions on dissemination.
4) R node: immune group refers to the user group that has completed or refused to spread
after obtaining word-of-mouth information.
5) S' node: the re-participating group. After the immune group exits the communication,
with a certain probability µ its members become the initial participating group again:
after re-acquiring the word-of-mouth information, they either take part in its
dissemination once more or continue to refuse to disseminate.

According to the expressions of the above 5 types of nodes, construct the SEIRS
theoretical model, as shown in Fig. 1:

Fig. 1. SEIRS model

Let N denote the number of all nodes in the social network; the number of nodes is
assumed constant over the period considered, so the total number of users across all
node states remains N. S(t) represents the number of unknown (susceptible) nodes at time
t, E(t) the number of latent nodes, I(t) the number of nodes in the disseminator state,
R(t) the number of immune-state nodes, and O(t) the number of nodes that have exited at
time t. Based on the analysis of the above theoretical model, the dynamic differential
equations are established:

$\frac{dS(t)}{dt} = -\mu_1 S(t)I(t) + \mu R(t)$

$\frac{dE(t)}{dt} = \mu_1 S(t)I(t) - \mu_2 E(t)$

$\frac{dI(t)}{dt} = \mu_2 E(t) - \mu_3 I(t)$

$\frac{dR(t)}{dt} = \mu_3 I(t) - \mu R(t)$

$S(t) + E(t) + I(t) + R(t) + O(t) = N$
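Section 4 runs these equations in MATLAB; an equivalent Python sketch (the rates and the
initial state below are illustrative, not values taken from the paper):

import numpy as np
from scipy.integrate import solve_ivp

N = 10_000
mu1, mu2, mu3, mu = 0.0002, 0.3, 0.2, 0.01      # illustrative rates

def seirs(t, state):
    S, E, I, R = state
    dS = -mu1 * S * I + mu * R
    dE = mu1 * S * I - mu2 * E
    dI = mu2 * E - mu3 * I
    dR = mu3 * I - mu * R
    return [dS, dE, dI, dR]

# Start with a handful of disseminators in an otherwise unaware population.
sol = solve_ivp(seirs, (0, 100), [N - 10, 0, 10, 0], max_step=0.5)
print("peak number of disseminators:", sol.y[2].max())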

3.3 State Transition Rules


1. Initial participation group. Refers to groups in social networks that have the ability
to obtain word-of-mouth information.
2. The rules for the transformation of latent groups. Groups with the ability to obtain
information are interested in word-of-mouth information with a probability of µ1,
generate browsing, like, comment and other behaviors, and become a potential
communication group. They can further conduct the communication behavior of
word-of-mouth information forwarding and become a group of latent people.

3. The rules of communicator group transformation. After obtaining the word-of-mouth


information, the lurking group will generate word-of-mouth information forwarding
and disseminating behavior with a probability of µ2 based on the judgment of the
trust relationship and their own experience, and become word-of-mouth information
disseminators.
4. The rules for the transformation of immune group status. With the passage of time,
according to the theory of information forgetting, the communicator group will
withdraw from the communication behavior with a probability of µ3, transform
into an immune group, and no longer do word-of-mouth information dissemination
behavior.
5. Because of the delay of word-of-mouth communication, the immune group can
obtain word-of-mouth information again through other nodes in the social network.
Because of the enhanced trust in word-of-mouth information, it becomes the initial
participating group again with the probability of µ and continues to participate in
the spread of word-of-mouth information.

4 Experimental Results and Analysis

This article uses MATLAB software to simulate the proposed word-of-mouth commu-
nication model. µ2 is the probability that a latent person turns into a communicator.
Therefore, it has a large impact on the number of word-of-mouth communicators. We first
simulate and analyze changes in µ2: by varying its value, we observe the changes in the
number of latent users and of word-of-mouth communicators. The simulation result is
shown in Fig. 2.

Fig. 2. The impact of µ2 changes on the number of word-of-mouth communicators

From Fig. 2, we can see that as µ2 increases from 0.1 to 0.7, the equilibrium number of latent nodes first rises and then falls. In other words, when the probability of a latent node turning into a communicator node gradually increases, the extreme value of latent nodes first increases and then decreases rather than growing continuously. Therefore, for the promotion and exposure of specific word-of-mouth topics in actual social networks, it is not simply the case that higher exposure is always better; instead, the optimal value should be identified in advance, so that communication efficiency is maximized at an appropriate publicity cost. As µ2 increases, the extreme value of the communicator group also rises, but as time goes by the number of communicators declines rapidly, which is in line with the law of the actual word-of-mouth dissemination process. Many information recipients, influenced by trust relationships in social networks, become disseminators of word-of-mouth information. However, an ordinary member of the communication group usually forwards a given piece of word-of-mouth information only once, and few people repeatedly spread information that has already been forwarded. Therefore, as time goes by, many communicators transform into the immune group and do not spread the word-of-mouth information again in a short period of time. Hence, to make word-of-mouth topics popular quickly in real word-of-mouth communication, greater effort is needed to convert more lurkers into communicators.

Fig. 3. The impact of return probability µ on word-of-mouth communication

In order to verify the impact of the immune group re-transforming into the initial group on word-of-mouth communication, we examined the changes of each group during word-of-mouth communication with return probability µ = 0 and µ = 0.01. The experimental results are shown in Fig. 3. From Fig. 3, we can see that whether or not the return probability is 0, the number of communicators increases rapidly and reaches a peak, and over time the number of communicators then declines rapidly. The difference is that when the return probability is 0, the number of communicators drops rapidly and gradually approaches 0, whereas when the return probability is 0.01, the decline slows and the number of communicators gradually stabilizes at a certain level. In actual word-of-mouth communication, if the communicator group spreads the word-of-mouth information only once, the popularity of the information will drop rapidly after reaching its peak and will be replaced by the heat of other information.
In order to further verify the impact of return probability µ on word-of-mouth com-
munication, we simulated the situation where µ increased from 0.01 to 0.05. As shown
in Fig. 4, Fig. 4a shows the relationship between the initial population number and µ.
Figure 4b shows the relationship between the number of latent groups and µ. Figure 4c
shows the relationship between the number of communicators and µ. Figure 4d shows
the relationship between the number of exiting communicators and µ. From the figure, we can see that the greater the return probability, the more positive its impact on word-of-mouth communication: the number of word-of-mouth communicators stays at a higher level, which is more conducive to the spread of word-of-mouth.

Fig. 4. The influence of the return probability µ on the number of communicators in each group

To verify the influence of each transition probability on the number of communicators, we conduct simulation experiments on the changes in the number of communicators, as shown in Fig. 5. It can be seen from Fig. 5a that as µ1 increases, i.e., as the probability of transforming from the initial group to the latent group grows, the number of communicators also increases, and the communicator group reaches its maximum value faster. By comparing the effects of changes in the various parameters in
Fig. 4, it can be found that the probability value of µ1 has the greatest impact on the

number of final communicators. This shows that the more people become latent, the more are eventually able to become communicators. In the
actual word-of-mouth dissemination process, it is necessary to increase the exposure
of the information as much as possible so that more people can receive word-of-mouth
information, thereby increasing the number of lurkers. µ2 is the probability that the
latent group transforms into the communicator group. It can be seen from Fig. 5b that
the larger µ2 is, the greater the number of communicators will be, which is the same as
the actual situation. Simulation shows, however, that although the number of communicators increases rapidly as µ2 grows, it also decays rapidly over time, and eventually the number stabilizes. When µ2 is small, the number of communicators grows slowly but remains high for longer; after the final number stabilizes, it is not much different from the case of a high µ2. This shows that in actual word-of-mouth communication, while pursuing a higher transfer probability from lurkers to communicators, we should also find ways to prolong the period during which the number of communicators stays high. If the number of communicators can be maintained at a high level for a long time, it brings more word-of-mouth exposure, which is conducive to the spread of word-of-mouth. µ3 is the transformation probability
of the immune population. From Fig. 5c, it can be seen that this probability has a greater
impact on the number of communicators. In actual word-of-mouth communication, if
the communicator quickly loses interest in word-of-mouth information, the number of
communicators will rapidly decay.
For the return probability µ introduced in this article, it can be seen from Fig. 5d that
it has a very large impact on the number of communicators. When µ keeps increasing
from 0.1 to 0.7, it has little effect on the time for the number of communicators to reach
the maximum. The impact on the maximum number of communicators is also small.
But it has a very large impact on the stable number of communicators. This also shows
that in the process of word-of-mouth communication, if part of the immune group obtains the word-of-mouth information again after withdrawing and becomes word-of-mouth communicators once more, the number of communicators will stabilize at a higher level and a better communication effect will be achieved. Therefore, in the actual word-of-mouth communication process, after a period of word-of-mouth communication, certain measures
should be taken to increase the exposure of word-of-mouth information again. Some
of the groups who have withdrawn from word-of-mouth communicators can be trans-
formed into potential communicators again through certain incentive measures. In this
way, word-of-mouth communicators can be stabilized at a higher value, and a better
word-of-mouth communication effect can be achieved.
Through the above simulation analysis, it can be seen that in the process of word-
of-mouth communication in social networks, various elements in users, information
and media will have different effects on the efficiency and speed of word-of-mouth
communication.

Fig. 5. The impact of changes in transition probabilities on word-of-mouth communication

5 Conclusion
Based on an infectious disease dynamics model and combined with the characteristics of actual word-of-mouth communication, this article adds new user state nodes and constructs a new social network information dissemination model. The influence
of the transition probability of each group in word-of-mouth communication on word-
of-mouth communication is analyzed by simulation. The experimental results show
that the model constructed in this paper can truly reflect the law of social network
information dissemination. The communication model constructed in this article can
provide theoretical support for enterprises in formulating word-of-mouth communication
strategies to achieve the best word-of-mouth communication effect.

Acknowledgment. This research was supported by the Heilongjiang philosophy and Social
Science Fund Project (21GLC186).

References
1. Guerreiro, J., Pacheco, M.: How green trust, consumer brand engagement and green word-
of-mouth mediate purchasing intentions. Sustainability. 13, 7877 (2021)

2. Cheng, X.S.: The paradox of word-of-mouth in social commerce: exploring the juxtaposed
impacts of source credibility and information quality on SWOM spreading. Inf. Manage.
58(7), 103505 (2021)
3. Hussain, S., Ahmed, W., Jafar, R.M.S.: eWOM source credibility, perceived risk and food
product customer’s information adoption. Comput. Hum. Behav. 66, 96–102 (2017)
4. Cai, S.Q., Wang, W., Zhou, P.: Research on the dissemination of negative word-of-mouth
information in network communities based on multi-agents. Comput Sci. 43(4), 70–75 (2016)
5. Cai, S.Q., Yuan, Q., Zhou, P.: Research on the linear threshold propagation model of negative
word of mouth under corporate response. J. Syst. Eng. 32(2), 145–155 (2017)
6. Deng, W.H., Yi, M.: Research on the tree communication of online community word-of-
mouth information based on the information diffusion cascade theory. Chinese J. Manage.
14(2), 254–260 (2017)
7. Li, P., Yang, X., Yang, L.X.: The modeling and analysis of the word-of-mouth marketing.
Phys. A Statist. Mech. Appl. 493, 1–16 (2018)
8. Edmond, A., Michael, A., Susan, L.A.: An approach for combining ethical principles with
public opinion to guide public policy. Artif. Intell. 287, 303–349 (2010)
9. Li, Z., Xu, Y., Li, K.: The influence factors of collective intelligence emergence in knowledge
communities based on social network analysis. Int. J. Intell. Syst. 9(1), 23–43 (2019)
10. Yu, W., Shanshan, C., Xinchu, F.U.: Review and prospect of propagation dynamics models.
Commun. Appl. Math. Comput. 32(2), 267–294 (2018)
11. Thomas, W.C.: Equivalent probability density moments determine equivalent epidemics in a
sirs model with temporary immunity. Theor. Popul. Biol. 113, 1–9 (2017)
12. Pu, C., Li, S., Yang, X.X.: Traffic-driven SIR epidemic spreading in networks. Phys. A Statist.
Mech. Appl. 446, 29–137 (2015)
Spatial Correlation Analysis of Green Finance
Development Between Provinces
and Non-provinces Along the Belt and Road
in China

ChenYang Zheng1 and Xiaohong Dong2(B)


1 School of Finance, Harbin University of Commerce, Harbin 150000, China
2 College of Economics and People’s Livelihood and Welling, Zhejiang Shuren University,

Hangzhou 130000, China

Abstract. Based on the panel data of China’s green finance development from
2008 to 2018, this paper constructs the green finance linkage network with the
gravity model. It uses the social network analysis method to analyze the network
structure characteristics and the green finance linkage network differences between
The Belt and Road and non-B&R provinces in China. The conclusions are as
follows: the spatial acceptance effect of the provinces along The Belt and Road is
relatively stronger, and the intermediary role in the green financial development
network is relatively stronger. Among the sections that divide the green finance-
related network, the provinces along The Belt and Road are more in the main
beneficiary section and two-way spillover board, promoting the development and
communication of green finance among the provinces. The paper puts forward
some policy suggestions, such as strengthening the spatial connection of green
finance in different provinces and implementing financial policies with regional
differentiation.

Keywords: Green finance · The Belt and Road · Gravitation model · Social
network analysis

1 Introduction

With the conflict between the traditional energy industry, manufacturing industry, and
environment becoming more intense, green finance has developed rapidly. Boao Forum for Asia 2021 took "The Belt and Road Initiative" as one of its themes. In this context, it is
of great significance to conduct in-depth research on green finance from the perspective
of The Belt and Road. There are more and more studies related to it. In studying the
impact of green finance on the economy and environment, most scholars start from the
micro-level. For example, taking commercial banks as the research object, some scholars
analyze the impact of green credit on the risk-taking and competitiveness of commercial
banks by using the difference-in-differences method and regression analysis (Shao
Chuanlin and Yan Yongsheng 2020; Gao Xiaoyan and Gao Ge 2018). From a macro


perspective, a few scholars have discussed the role of green finance in the construction
of green technology innovation systems and the process of industrial transformation and
efficiency improvement in China (Yan Jinqiang and Yang Xiaoyong 2018; Gu Beibei
et al. 2021). Some scholars used the mediating effect model to discuss the dual impact of
green credit on the economy and environment (Wang Yanli et al. 2021). Some scholars
used the propensity score matching method to evaluate the policy effect of green finance pilot zones (Huang Haifeng and Zhang Jing 2021). In the research on the development status
of green finance, some scholars choose to study it from the perspective of financial
instruments. Some scholars chose the model of maximizing benefits to analyze the
impact of China’s green bond environment and its external characteristics on both sides
of the bond transaction and local government (Ba Shusong et al. 2019). Some studies use
the factor model to verify green incentives in China's stock market, finding that green resources
are mainly allocated to the top green enterprises (Liu Yong and Bai Xiaoying 2020).
Some scholars also focus on the regional analysis. For example, some studies selected
the fixed effect space Dubin model, took the green finance in Guangdong Province as
the research object, and got its development characteristics and influencing factors (Yu
Fengjian and Xu Feng 2019). In the research on opening strategies related to green
finance, Some scholars explored the development status of green finance in The Belt
and Road strategy and provided suggestions for the development of green finance under
the strategy (Cao Mingdi and Dong Ximiao 2019; Yu Hongyuan and Wang Wanfa 2021).
From the above research, previous literature reflects the increasing practice of green
finance in China. However, there are several limitations: First, there are still few regional
analyses and studies, and the spatial characteristics of green finance revealed are only
limited to local areas. Second, when analyzing the regional characteristics of green
finance, it ignores the role of national strategy in developing regional green finance. Given
the above shortcomings, this paper makes the following improvements: First, the social
network analysis method is adopted to comprehensively analyze the changing trend of
the whole network and each node. Second, this paper combines the development strategy
of The Belt and Road when analyzing the correlation network. From the perspective of
The Belt and Road strategy, this paper reveals the spatial correlation difference of green
finance development between provinces along The Belt and Road and provinces outside it.

2 Model Construction
2.1 The Gravity Model
This paper chooses the gravity model to construct a regional social network. In order to
increase the applicability of the gravity model to the social network analysis method,
this paper revises it. The model is as follows:

Rij = Kij · [(Gi Ii Ci)^(1/3) · (Gj Ij Cj)^(1/3)] / Dij²,  Kij = √(Gi Ii) / (√(Gi Ii) + √(Gj Ij))  (1)

In Eq. (1), Rij represents the correlation strength of the green finance development level between provinces i and j. Gi and Ii jointly represent the development level of green finance in province i. Considering the availability and scientific validity of the data, Gi and Ii are expressed by the green credit ratio and the total investment in environmental pollution control of a province (Dong Xiaohong and Fu Yong 2018). Ci is an indicator of the spatial connection of the development level of green finance; since the carbon emissions of a province are closely related to the local economy and environment, this paper uses the carbon emissions of province i to represent Ci. Dij represents the distance between provinces i and j, and Kij is the gravitational parameter.
According to the model in Eq. (1), this paper calculates the gravity matrix of the correlation intensity of inter-provincial green finance development levels and then binarizes the obtained matrix. The threshold selected is the column mean of the matrix: results higher than the threshold are denoted as 1, and results lower than the threshold are denoted as 0.
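For illustration, the following Python sketch computes the gravity matrix of Eq. (1) as reconstructed above and applies the column-mean binarization. It is not the paper's Ucinet workflow: the vectors G, Iv, C and the distance matrix D are random placeholders standing in for the green credit ratio, pollution-control investment, carbon emission and distance data.

# Minimal sketch of Eq. (1) and the column-mean binarization.
# G, Iv, C and D are random placeholders, not the paper's data.
import numpy as np

rng = np.random.default_rng(0)
n = 30
G = rng.random(n) + 0.1            # green credit ratio (placeholder)
Iv = rng.random(n) + 0.1           # pollution-control investment (placeholder)
C = rng.random(n) + 0.1            # carbon emissions (placeholder)
D = rng.random((n, n)) * 1000 + 100  # inter-provincial distances (placeholder)

R = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        k = np.sqrt(G[i] * Iv[i]) / (np.sqrt(G[i] * Iv[i]) + np.sqrt(G[j] * Iv[j]))
        R[i, j] = (k * (G[i] * Iv[i] * C[i]) ** (1 / 3)
                     * (G[j] * Iv[j] * C[j]) ** (1 / 3) / D[i, j] ** 2)

adj = (R > R.mean(axis=0)).astype(int)  # 1 if above the column mean, else 0
np.fill_diagonal(adj, 0)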

2.2 Network Characteristic Index


Overall Network Characteristics. This paper describes the overall network charac-
teristics through network density. The network density reflects the density distribution
of the associations between provinces. The higher the network density, the closer the inter-provincial green finance development links, and vice versa. The network density D is defined as follows:
D = L / [N(N − 1)]  (2)
In Eq. (2), N is the number of provinces studied in the network, and L represents
the total number of correlation relationships actually existing among provinces in the
network.

Individual Network Characteristics. The characteristics of individual networks can be described by centrality indexes, which reflect the position and influence in the network of the provinces along The Belt and Road and the provinces outside it. The centrality index comprises three measures: degree, betweenness, and closeness.

The degree is expressed by De, and its formula is as follows:

De = L / (N − 1)  (3)

The betweenness is expressed by Be, and its formula is as follows:

Be = [Σj<k gjk(i)/gjk] / [(N − 1)(N − 2)/2]  (4)

The closeness is expressed by Cl, and its formula is as follows:

Cl = (N − 1) / Σj Dij  (5)

In Eqs. (3), (4) and (5), the meanings of N and L are consistent with Eq. (2). In Eq. (4), gjk represents the number of shortest paths between provinces j and k, and gjk(i) the number of those paths passing through province i. In Eq. (5), Dij represents the shortest path length between provinces i and j.
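For reference, the indicators in Eqs. (2)-(5) correspond, up to normalization conventions, to standard functions of the networkx library. The sketch below is an assumed illustration on a random placeholder network, whereas the paper itself computes these indicators in Ucinet.

# Minimal sketch: density, degree, betweenness and closeness on a directed
# binary network, mirroring Eqs. (2)-(5). The adjacency matrix is a random
# placeholder, not the paper's data.
import networkx as nx
import numpy as np

rng = np.random.default_rng(1)
adj = (rng.random((30, 30)) < 0.22).astype(int)
np.fill_diagonal(adj, 0)

g = nx.from_numpy_array(adj, create_using=nx.DiGraph)
print("density D:", nx.density(g))                   # Eq. (2): L / (N(N-1))
print("in-degree:", nx.in_degree_centrality(g))      # indegree / (N-1)
print("out-degree:", nx.out_degree_centrality(g))    # outdegree / (N-1)
print("betweenness:", nx.betweenness_centrality(g))  # Eq. (4), directed normalization
print("closeness:", nx.closeness_centrality(g))      # Eq. (5), incoming paths for digraphs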

2.3 Sample Selection and Data Sources


Due to the lack of data, this paper mainly studies 30 regions and provinces in China
except Hong Kong, Macao, Taiwan, and Tibet. The sample period covers 2008 to 2018. The original data are obtained from the China Environmental
Statistics Yearbook and China Carbon Emission Database.

3 Analysis on the Network Characteristics


3.1 Overall Network Characteristics

This paper uses Ucinet software to analyze the overall network characteristics of green
finance development level (as shown in Table 1).

Table 1. Analysis of overall network characteristics

Year R M N Correlation Efficiency


2008 193 870 0.2218 1.0000 0.7586
2009 199 870 0.2287 1.0000 0.7512
2010 200 870 0.2299 1.0000 0.7537
2011 198 870 0.2276 1.0000 0.7586
2012 198 870 0.2276 1.0000 0.7512
2013 193 870 0.2218 1.0000 0.7611
2014 196 870 0.2253 1.0000 0.7586
2015 194 870 0.2230 1.0000 0.7562
2016 190 870 0.2184 1.0000 0.7660
2017 193 870 0.2218 1.0000 0.7562
2018 199 870 0.2287 1.0000 0.7463
Note: R is the abbreviation of Relationship between the number; M is the abbreviation of Maximum
relation number; N is the abbreviation of Network density

As shown in Table 1, the overall density level of China’s green finance development
network has little change. Although the variation trend of network density was not stable
in the sample period, the overall network density remained in the range of 0.21–0.23, with
a small fluctuation range. In the sample period, the maximum observed number of network relationships is 200, far less than the maximum possible number of 870. This shows that there is still great room for progress in promoting inter-provincial
green finance links.
In this paper, NetDraw, a visualization tool of Ucinet, is used to draw the spatial
association network diagram of green finance development level between provinces and
non-provinces along The Belt and Road in China (as shown in Fig. 1).

Fig. 1. Spatial association network diagram (panels for 2008, 2010, 2012, 2014, 2016 and 2018)

Note: ■ means The Belt and Road provinces; ■ means non-B&R provinces

As Fig. 1 shows, there are obvious regional green finance-related small groups in the overall network of the provinces along The Belt and Road, such as the three northeastern provinces, Gansu and Ningxia in the northwest, and southern marginal provinces such as Shanghai and Zhejiang. The Belt and Road strategy enhances the relevance of the green
financial network. It can be seen from Table 1 that the network correlation degree from
2008 to 2018 is 1. It shows that there is a connection between the development of green
finance in all provinces of China.

3.2 Centrality Analysis

This paper mainly shows the influence of each province on the green finance development
network in China through three indicators: degree, closeness and betweenness. Table 2
is obtained using data from 2018.

The Difference in Degree. The degree mainly reflects how close a province is to the core of the national correlation network. According to Table 2, 17 provinces exceed the national average, meaning they are more strongly associated with other provinces in the network. The five provinces with the lowest degree are Jilin, Shanghai,

Table 2. Network centrality analysis

Provinces Degree Betweenness Closeness


O I C S C S C S
Beijing 7 5 24.138 7 10.508 22 65.000 6
Shanghai 2 3 10.345 9 0.000 27 72.000 3
Tianjin 5 7 27.586 6 88.384 4 57.000 11
Chongqing 7 6 31.034 5 32.059 16 57.000 11
Heilongjiang 2 3 10.345 9 0.500 25 82.000 2
Jilin 2 2 6.897 10 0.000 27 83.000 1
Liaoning 3 9 31.034 5 98.953 3 56.000 12
Inner Mongolia 11 7 37.931 3 75.569 6 55.000 13
Hebei 10 6 34.483 4 21.139 20 56.000 12
Shanxi 11 4 37.931 3 4.771 24 55.000 13
Shandong 13 8 44.828 1 100.505 2 48.000 17
Henan 10 8 41.379 2 75.510 7 48.000 17
Shaanxi 8 11 44.828 1 109.554 1 48.000 17
Gansu 5 6 24.138 7 15.173 21 61.000 9
Ningxia 5 9 34.483 4 37.250 15 56.000 12
Qinghai 3 4 17.241 8 0.000 27 68.000 5
Xinjiang 0 11 37.931 3 0.000 27 54.000 14
Anhui 11 6 37.931 3 46.971 14 49.000 16
Jiangsu 11 6 37.931 3 77.501 5 50.000 15
Zhejiang 8 6 27.586 6 22.850 19 61.000 9
Hunan 9 9 44.828 1 57.429 10 50.000 15
Jiangxi 7 7 24.138 7 56.635 11 62.000 8
Hubei 11 7 41.379 2 50.234 12 48.000 17
Sichuan 9 5 31.034 5 59.377 9 60.000 10
Guizhou 7 7 27.586 6 25.308 18 61.000 9
Fujian 4 8 31.034 5 48.760 13 55.000 13
Guangdong 6 11 41.379 2 65.172 8 55.000 13
Hainan 2 5 17.241 8 0.200 26 71.000 4
Guangxi 4 7 24.138 7 7.878 23 64.000 7
Yunnan 6 6 27.586 6 26.808 17 61.000 9
Mean 6.633 6.633 30.345 — 40.500 — 58.933 —
Note: O is the abbreviation of outdegree; I is the abbreviation of indegree; C is the abbreviation
of centrality; S is the abbreviation of sort.

Heilongjiang, Qinghai, and Hainan. This fully shows that these provinces are less corre-
lated with other key provinces in the degree network. The reason may be that these places
are remote or their carbon emissions are too high, which affects their development of
green finance. According to Table 2, Shaanxi, Xinjiang, Guangdong, Liaoning, Ningxia
have a higher indegree. They all belong to the provinces along The Belt and Road route.
Driven by The Belt and Road development strategy, their connections in green finance development have also been enhanced.

The Difference in Betweenness. Betweenness indicates a province's control over other areas in the association network. According to Table 2, 14 provinces exceed the national average, indicating that these provinces have a stronger ability to control the communication of green finance among the other provinces in the network. Of the 17 provinces along The Belt and Road, only five have betweenness above the national average. This may be because many Belt and Road provinces lie on the periphery of the country, giving them a lower degree of control over other provinces in the network.

The Difference in Closeness. Closeness represents the extent to which a province is not controlled by other regions within the same network. Twelve provinces exceed the national average, showing that they play the role of central actors in the network. The bottom five provinces are Shandong, Henan, Shaanxi, Hubei, and Anhui; due to their economic and locational limitations, these provinces play the role of marginal actors in the network. The top five provinces are all along The Belt and Road: under the influence of The Belt and Road strategy, the efficiency of economic connectivity between these provinces and other provinces has been enhanced.

3.3 Block Model Analysis


This paper uses the block model to reveal the characteristics of the spatial spillover effect. The CONCOR procedure of Ucinet was applied to the 2018 data. The results are shown in Table 3 and Table 4.

Table 3. Green finance development in four major sectors

Plate Members of the plate N


1 Beijing, Hebei, Tianjin, Shandong, Heilongjiang, Jilin, Liaoning, Inner Mongolia, 10
Henan, Shanxi
2 Gansu, Xinjiang, Shaanxi, Sichuan, Ningxia, Qinghai 6
3 Jiangsu, Anhui, Fujian, Zhejiang, Hunan, Jiangxi, Hubei, Shanghai 8
4 Guizhou, Chongqing, Guangdong, Hainan, Guangxi, Yunnan 6
Note: 1 stands for first plate; 2 stands for second plate; 3 stands for third plate; 4 stands for fourth
plate; N is the abbreviation of Number of members

Table 4. Spillover effects of green finance spatial related sectors

Plate 1 2 3 4 N P A C
1 51.000 17.000 7.000 0.000 10.000 31.034 68.000 Br
2 3.000 22.000 0.000 5.000 6.000 17.241 73.333 M
3 6.000 2.000 41.000 14.000 8.000 24.138 65.079 Br
4 0.000 5.000 4.000 23.000 6.000 17.241 71.875 Bi
Note: 1 stands for first plate; 2 stands for second plate; 3 stands for third plate; 4 stands for
fourth plate; N is the abbreviation of Number of members; P is the abbreviation of Percentage of
expected internal relationships (%); A is the abbreviation of Actual Internal Relationship Ratio
(%); C is the abbreviation of Characteristics of plate; Br is the abbreviation of Broker board; M is
the abbreviation of Main beneficiary block; Bi is the abbreviation of Bidirectional overflow plate

According to the results in Table 4, the first plate is a broker plate, which both receives external contacts and sends contacts to other plates; it plays an important “intermediary” and “bridge” role in China's green finance development network. The second plate is the main beneficiary plate. The major provinces of this plate are all in the western region, where the economy is underdeveloped and the environment is under great pressure, so its green finance development is greatly affected by the other plates. The third plate is also a broker plate, likewise acting as an “intermediary” in the green finance development network, but its spillover relationships are fewer than those of the first plate. The fourth plate is a two-way spillover plate, which both sends out contacts and receives the contacts of other plates.
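The expected internal-relationship percentage in Table 4 depends only on plate size: a plate with nk members in an N-node network expects (nk − 1)/(N − 1) of its ties to stay inside it, e.g. (10 − 1)/(30 − 1) ≈ 31.03% for the ten-member first plate, matching the table. The Python sketch below (an added illustration; the adjacency matrix is a random placeholder and only the plate sizes mirror Table 3) recomputes both the expected and the actual ratios.

# Minimal sketch of the two block-model ratios reported in Table 4.
import numpy as np

rng = np.random.default_rng(2)
N = 30
adj = (rng.random((N, N)) < 0.22).astype(int)
np.fill_diagonal(adj, 0)
plates = [range(0, 10), range(10, 16), range(16, 24), range(24, 30)]

for k, members in enumerate(plates, start=1):
    members = list(members)
    inside = adj[np.ix_(members, members)].sum()  # ties staying within the plate
    sent = adj[members, :].sum()                  # all ties sent by plate members
    expected = 100 * (len(members) - 1) / (N - 1)
    actual = 100 * inside / sent if sent else 0.0
    print(f"plate {k}: expected {expected:.3f}%, actual {actual:.3f}%")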
According to Table 3, the provinces along The Belt and Road are mainly distributed in the main beneficiary plate and the two-way spillover plate. Most of the
provinces along The Belt and Road are located in marginal areas. They are remote, and
the conflict between the natural environment and economic development is serious. The
implementation of The Belt and Road strategy has enhanced economic exchanges and
cooperation between provinces along The Belt and Road and other Chinese provinces
or countries along The Belt and Road.

4 Conclusions and Policy Recommendations


4.1 Conclusions

Based on 30 provinces’ data from 2008 to 2018, this paper constructs the spatial cor-
relation network of green finance development in China and investigates its network
structure characteristics. The results are as follows:

(1) The development of green finance in China constitutes an overall network. But
overall network density was relatively low and generally remained stable during the
sample period. It shows that there is still much room for progress in the connection
of green finance development in China’s provinces.

(2) Provinces along The Belt and Road have a higher degree and closeness. This means that their spatial acceptance effect is relatively stronger and that they act as relatively central actors in the network. Their betweenness, however, is relatively low, indicating that their degree of control over other provinces in the green finance network is relatively low.
(3) The Belt and Road provinces are mainly distributed in the main beneficiary plate
and the two-way overflow plate. The main members of these two plates are the
western region and the southern marginal provinces. Their economic exchanges
and cooperation with other regions have been strengthened due to The Belt and
Road.

4.2 Policy Suggestions


(1) The spatial spillover effect of green finance among provinces should be strengthened. The types and number of green financial products should be increased to ensure a sound market system of green financial products, and full play should be given to the promoting effect of The Belt and Road strategy on green financial linkage among different regions. Blockchain and other network technologies can be used to reduce information asymmetry and adverse selection and to promote the collaborative development of green finance.
(2) The government should adapt green finance development policies to local conditions. Most of the provinces along The Belt and Road are located on the country's periphery, with a relatively low level of economic development and great environmental pressure. The government can provide preferential policies such as low-interest loans and tax breaks for green finance enterprises, and different provinces should give full play to their advantages, for example developing wind and solar energy in the northwest and hydropower and other clean energy in the southwest.
(3) Differentiated measures should be applied to the different network plates. For the provinces along The Belt and Road in the main beneficiary plate, the government should focus on their green finance development environment and promote the construction of the legal system and the perfection of market mechanisms. For the provinces along The Belt and Road in the two-way spillover plate, the government should promote innovation in traditional energy industries and the development of new green industries to enhance the spatial spillover effect of these “power sources”. For the broker provinces not along The Belt and Road, it is necessary to strengthen the adjustment of the industrial structure and the construction of local green financial service institutions to better play their “intermediary” role in the network.

Acknowledgment. Fund Project: China Postdoctoral Science Foundation Project “Research on


measuring the development level of green finance” (Project No.: 2018M631940); Key research
project of economic and social development in Heilongjiang Province “Research on green financial
derivatives supporting the development of green agriculture in Heilongjiang Province” (Project
No.: 18217).

References
Chuanlin, S., Yongsheng, Y.: Is green finance a “double-edged sword” for commercial banks to
take risks? J. Guizhou Univ. Finance Econ. 1, 68–77 (2020). (in Chinese)

Xiaoyan, G., Ge, G.: Research on the relationship between green credit scale and commercial
banks’ competitiveness. Econ. Issues 7, 15–21 (2018)
Jinqiang, Y., Xiaoyong, Y.: Promoting the construction of green technology innovation system
with green finance. Fujian Forum (Humanit. Soc. Sci. Edition) 3, 41–47 (2018)
Gu, B., Chen, F., Zhang, K.: The policy effect of green finance in promoting industrial transforma-
tion and upgrading efficiency in China: analysis from the perspective of government regulation
and public environmental demands. Environ. Sci. Pollut. Res. 28(34), 1–18 (2021)
Wang, Y., et al.: The dual impacts of green credit on economy and environment: evidence from
China. Sustainability 13(8), 4574 (2021)
Huang, H., Zhang, J.: Research on the environmental effect of green finance policy based on the
analysis of pilot zones for green finance reform and innovations. Sustainability 13(7), 3754
(2021)
Shusong, B., Yujia, C., Weihao, Z.: Green bond theory and China's market development analysis. J. Hangzhou Normal Univ. (Soc. Sci. Edition) 1, 91–106 (2019)
Yong, L., Xiaoying, B.: Green incentives in China’s stock market: the perspective of sustainable
development. Econ. Manage. 12(1), 155–173 (2020)
Fengjian, Y., Feng, X.: The development of green finance and its influencing factors in Guangdong
Province from the spatial perspective: an empirical study based on the fixed-effect spatial Durbin
model. Sci. Technol. Manage. Res. 39(15), 63–70 (2019)
Zhang, D., Mohsin, M., Rasheed, A.K., Chang, Y., Taghizadeh-Hesary, F.: Public spending and
green economic growth in BRI region: mediating role of green finance. Energy Policy 153
(2021)
Mingdi, C., Ximiao, D.: Green finance and The Belt and Road initiative: evaluation and prospect.
J. Renmin Univ. China 33(04), 2–9 (2019)
Hongyuan, Y., Wanfa, W.: Green The Belt and Road construction: progress, challenges and
deepening paths. Stud. Int. Stud. 02, 114–129 (2021)
Xiaohong, D., Yong, F.: Analysis on the spatial dynamics of green finance and green economy
coupling development. Ind. Technol. Econ. 37(12), 94–101 (2018)
Study on Decision-Making Behavior of Effective
Distribution of Fresh Agricultural Products
After COVID-19 Epidemic

Yu Yang1,2,3 , Xin Rong1,2,3(B) , Jianjun Li1,2,3 , and Fang Yang4


1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Cultural Big Data Theory Application Research Center, Harbin 150028, China
3 Heilongjiang Key Laboratory of E-Commerce and Information Processing,
Harbin 150028, China
4 East University of Heilongjiang, Harbin 150066, China

Abstract. Under the background of COVID-19, this paper constructs an evolu-


tionary game model of strategic management of logistics enterprises and local
governments from the perspective of cost and revenue, to ensure the effective distribution of fresh agricultural products. It analyzes the evolution path of decision-making behavior and the stability conditions under which the model achieves equilibrium. The results show that the decisions of the game players affect each other, and the final strategy choice is directly related to the input cost, the value-added amplification factor of income, and the reward and punishment. Since the model eventually has two
evolutionary stable states, strengthening the cooperation between logistics enter-
prises and local governments will help to improve the efficiency of the effective
distribution of fresh agricultural products and provide a theoretical reference for
the effective distribution of fresh agricultural products under other public health
events. At the same time, it is also of certain significance to further solve the
problem of farmers’ income in the three rural issues.

Keywords: COVID-19 epidemic · Fresh agricultural products · Effective distribution · Evolutionary game

1 Introduction
As fresh and green agricultural and sideline products, fresh agricultural products are
related to farmers’ income and to residents’ three meals a day, which is closely related to
people’s quality of life and modern agricultural development [1]. Although the work of
agriculture, rural areas, and farmers has entered a new development stage, ensuring stable
agricultural production and supply and increasing farmers’ income is still the focus, and
the effective circulation of fresh agricultural products has undoubtedly become the focus
of extensive attention [2].
Under the background of the COVID-19 epidemic, as the virus spreads quickly and over a wide area, it is necessary to strictly control the movement of


personnel and vehicles. For fresh agricultural products, which demand freshness and timeliness, the longer the transportation time, the greater the loss [3]. Secondly, continuous innovation in production technology has pushed the output of agricultural products to new highs, resulting in large quantities of unsalable fresh agricultural products; and owing to the suspension of work in some areas, the distribution efficiency of fresh agricultural products has been greatly reduced [4].
To sum up, to ensure the effective distribution of fresh agricultural products during
the epidemic, the decision-making behavior of both parties is studied with the leading
logistics enterprises and local governments as the game players. In addition, the existing
research mainly discusses the impact of the epidemic on agricultural production and the
countermeasures. It does not consider the main body’s strategic choice in the effective
distribution process of fresh agricultural products under the background of COVID-19.
On this basis, this paper proposes an evolutionary game model for the effective distri-
bution of fresh agricultural products after COVID-19 and studies the strategic choice of
participants and the conditions for achieving stable conditions. Then, the measures to
improve the distribution efficiency of fresh agricultural products under COVID-19 are
discussed. Finally, the corresponding suggestions and solutions are put forward.

2 Construction of Evolutionary Game Model

2.1 System Background

In evolutionary game theory, the two groups of logistics companies and local governments each take certain actions in order to survive better. When all individuals in a group choose the same behavior, mutants are either eliminated by the system or change their strategy to adapt to the system environment [5]. This requires participants
to constantly adjust their strategies to adapt to the system in the dynamic changes of
the system, which is exactly in line with the research on the decision-making behavior
of logistics enterprises and local governments. When a logistics company chooses to
“actively invest”, it will have a strong social awareness and be able to actively connect
with areas where agricultural products are unsellable, continue to invest in the logistics
resources needed, and fight the epidemic together with the people of the whole coun-
try; if it chooses “passive investment”, the logistics company for their benefit, they will
not be willing to participate in it, or perfunctory. When the local government chooses
“coordination”, it will adopt a certain way to encourage logistics enterprises to invest in
logistics resources in slow-sale areas during a special period; if it chooses “uncoordi-
nated”, the government will adopt other methods to solve the transportation problem of
agricultural products.

2.2 Basic Assumptions

From the perspective of input costs and benefits of both sides of the game, in order
to better calculate the parameters and rationalize the evolutionary game model, the
following assumptions are put forward:

– H1 : Logistics enterprises and local governments are boundedly rational and make decisions according to the actual situation.
– H2 : Logistics enterprises know their own investment level but do not know whether, or to what extent, the local government will provide incentives; likewise, local governments do not know whether logistics enterprises will actively invest in logistics resources. The information held by the two parties is therefore asymmetric.
– H3 : Based on the dynamic nature that both sides of the game achieve equilibrium
through continuous trial and error, it is considered that the decision-making behavior
of logistics enterprises and local governments is a dynamic process.
– H4 : The strategies adopted by logistics enterprises are “positive investment” and “negative investment”, while the strategies adopted by local governments are “coordination” and “non-coordination”.
– H5 : In the game process, the proportion of individuals that logistics enterprises choose
to actively invest in the population at time t is x, while the proportion of individuals
that local governments choose coordination strategies at time t is y.

2.3 Model Establishment

In order to facilitate the analysis of different strategies and the evolution paths of both
sides of the game under different circumstances, the relevant parameters of the model
are set as shown in Table 1.

Table 1. Related parameter setting

Definition Symbol
Basic income of logistics enterprises without active investment A > 0 A
The basic income when local government is uncoordinated B > 0 B
The cost of logistics enterprises’ active investment C1 > 0 C1
The cost of local government’s coordination strategy C2 > 0 C2
Total revenue of local government coordination D > B > 0 D
The local government that adopts the coordination strategy will give some rewards E
to the logistics enterprises that actively invest E > 0
When the local government adopts the coordination strategy, it will punish the F
logistics enterprises with negative investment F > 0
The cost-benefit conversion coefficient of local government facing the problem of α
agricultural products distribution when enterprises put in negative investment
0<α<1
Amplification factor of value added when logistics enterprises actively invest β > 1 β

Based on the above analysis and assumptions, the income matrix of the game between
logistics enterprises and local governments is analyzed and calculated [6], as shown in
Table 2.

Table 2. Evolutionary game income matrix

Local government
Coordinate Uncoordinated
Logistics enterprises Active engagement βA + E − C1 , D − C2 βA − C1 , B
Negative input A − F, αD − C2 +F A, αB

According to the Malthusian equation [7], for logistics enterprises the payoff of choosing active investment at time t is as follows:
U11 = y[βA + E − C1 ] + (1 − y)(βA − C1 )
The income of choosing negative investment at time t is as follows:
U12 = y(A − F) + (1 − y)A
At time t, the total income of logistics enterprise group is as follows:
U1 = xU11 + (1 − x)U12
Therefore, the replication dynamic equation of logistics enterprises is as follows:

ẋ = x(U11 − U1 ) = x(1 − x)[y(E + F) + (β − 1)A − C1 ]

The local government chooses the coordination strategy at time t to obtain the
following benefits:
U21 = x(D − C2 ) + (1 − x)(αD + F − C2 )
At t time, the benefits of choosing the uncoordinated strategy are as follows:
U22 = xB + (1 − x)αB
At time t, the total income of the local government population is as follows:
U2 = yU21 + (1 − y)U22
Similarly, the dynamic equation of local government replication is as follows:
ẏ = y(U21 − U2 ) = y(1 − y){x[(1 − α)(D − B) − F] + α(D − B) + F − C2 }
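As a cross-check on the algebra, the two replication dynamic equations can be derived symbolically. The sympy sketch below is an added illustration rather than part of the paper's workflow: it rebuilds the expected payoffs from the payoff matrix in Table 2 and factors x(U11 − U1) and y(U21 − U2) into the replicator forms above.

# Minimal sketch: derive the replication dynamic equations symbolically.
import sympy as sp

x, y, A, B, C1, C2, D, E, F, alpha, beta = sp.symbols('x y A B C1 C2 D E F alpha beta')

U11 = y * (beta * A + E - C1) + (1 - y) * (beta * A - C1)  # active investment
U12 = y * (A - F) + (1 - y) * A                            # negative investment
U1 = x * U11 + (1 - x) * U12
xdot = sp.expand(x * (U11 - U1))

U21 = x * (D - C2) + (1 - x) * (alpha * D + F - C2)        # coordination
U22 = x * B + (1 - x) * alpha * B                          # non-coordination
U2 = y * U21 + (1 - y) * U22
ydot = sp.expand(y * (U21 - U2))

print(sp.factor(xdot))  # equals x*(1-x)*(y*(E+F) + (beta-1)*A - C1)
print(sp.factor(ydot))  # equals y*(1-y)*(x*((1-alpha)*(D-B) - F) + alpha*(D-B) + F - C2)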

3 Evolutionary Game Analysis and Suggestions for Effective


Distribution
3.1 Stability Analysis
Through the analysis of the actual situation of the effective distribution of fresh agricul-
tural products during the period of COVID-19, if the local government adopts the coordi-
nation strategy to the logistics enterprises, the income of the active input of the logistics

enterprises will generally be higher than the passive input of the enterprises, otherwise,
the coordination strategy of the local governments will not have any significance; If the
local government adopts an uncoordinated strategy to the logistics enterprise, the income
from the positive investment of the logistics enterprise will be less than the income from
the negative investment of the enterprise. This highlights the impact of local govern-
ment coordination on the behavior choice of logistics enterprises to a certain extent.
Similarly, suppose the logistics enterprises adopt the strategy of active investment. In
that case, the local government’s revenue from adopting the coordination strategy will
be greater than that from adopting the uncoordinated strategy, which improves the local
government’s enthusiasm to coordinate the logistics enterprises to a certain extent. To sum up, the constraints on the equilibrium solution of
logistics enterprises and local governments in the game system are as follows:
βA + E − C1 > A − F, βA − C1 < A, D − C2 > B, αD − C2 + F < αB
Because the simultaneous replication dynamic equations reflect the speed and direction of the evolution [8], solving them yields five local equilibrium points of the system:
E1 (0, 0), E2 (0, 1), E3 (1, 0), E4 (1, 1), E5 (x∗ , y∗ )
Since the equilibrium point obtained from the replicated dynamic equation is only
the local asymptotically stable point in the evolution process, the Jacobian matrix in the
system is analyzed by using the method proposed by Friedman [9], and the stability in
the evolutionary equilibrium is obtained. If the trace of the Jacobian matrix is less than
zero and the determinant is greater than zero, it is an evolutionary stability strategy [10].
After calculation, the symbols of determinant and trace are shown in Table 3.

Table 3. Analysis results of equilibrium point stability

Equilibrium point Determinant Trace Result


E1 (0, 0) + − ESS
E2 (0, 1) + + Unstable point
E3 (1, 0) + + Unstable point
E4 (1, 1) + − ESS
E5 (x∗ , y∗ ) − 0 Saddle point

Because the broken lines of stable, unstable, and saddle points are the boundary lines
of game evolution under different strategies, the trend chart of interaction between local
government and logistics enterprises can be obtained.
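To illustrate the Friedman-style check behind Table 3, the sketch below evaluates the Jacobian of the replicator system at the four corner equilibria, using the parameter values given later in Sect. 3.3. It is an added illustration; for those values, the sign pattern it prints reproduces the ESS/unstable classification of Table 3 (the interior point E5 is the saddle).

# Minimal sketch: Jacobian-based stability check of the corner equilibria,
# with the illustrative parameter values from Sect. 3.3.
import sympy as sp

x, y = sp.symbols('x y')
A, B, C1, C2, D, E, F, alpha, beta = 2.5, 2, 0.7, 0.9, 3, 3, 0.2, 0.6, 1.2

xdot = x * (1 - x) * (y * (E + F) + (beta - 1) * A - C1)
ydot = y * (1 - y) * (x * ((1 - alpha) * (D - B) - F) + alpha * (D - B) + F - C2)
J = sp.Matrix([xdot, ydot]).jacobian(sp.Matrix([x, y]))

for pt in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    Jp = J.subs({x: pt[0], y: pt[1]})
    det, tr = float(Jp.det()), float(Jp.trace())
    verdict = "ESS" if det > 0 and tr < 0 else "unstable point"
    print(pt, "det =", det, "trace =", tr, "->", verdict)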

3.2 Parameter Change Trend Analysis

We analyze the parameters that affect the evolution of the system one by one and draw the corresponding trend diagrams. Interactions among the parameters are not considered here; instead, the influence of each main parameter on the evolution result is observed while the model conditions remain satisfied.

Parameter A, B, D. When A increases, E5 moves vertically downward (see Fig. 1(a)),


the area of region E5 E2 E1 E3 decreases and the system gradually converges to the ideal
mode. When B increases, the E5 level moves to the right (see Fig. 1(d)), the area of
region E5 E2 E4 E3 decreases, and the probability of the system converging to the ideal
mode decreases. When D increases, E5 level moves to the left (see Fig. 1(c)), area of
region E5 E2 E1 E3 decreases and area of region E5 E2 E4 E3 increases, which is conducive
to the benign evolution of the system. The above analysis shows that as the basic income of logistics enterprises increases, active investment yields more income and enterprises become more willing to choose it. When the
revenue of local government adopting the uncoordinated strategy increases gradually,
the local government will be more inclined to adopt the uncoordinated strategy to obtain
more revenue; If the total revenue of the local government is increasing, based on the
principle of maximizing the interests, the local government will eventually choose to
adopt the coordination strategy.

Fig. 1. Variation trend of parameters (panels (a)–(d): movement of the saddle point E5 as individual parameters increase or decrease)

Parameter E, F. When E increases, E5 moves vertically downward (see Fig. 1(a)), the
area of region E5 E2 E1 E3 decreases and the area of region E5 E2 E4 E3 increases, and the
probability of the system converging to the ideal mode increases, which is conducive to
the benign evolution of the system. When F increases, E5 moves to the left and down(see
Fig. 1(c)), the area of region E5 E2 E1 E3 decreases, while the area of region E5 E2 E4 E3
increases, and the system converges to the ideal mode faster.To sum up, the promotion
of the reward value of the enterprise’s positive investment and the increase of the penalty
for the enterprise’s negative investment make the logistics enterprises more and more
tend to the positive investment strategy.

Parameter C1 , C2 . C1 refers to the cost that logistics enterprises pay when they actively
invest. When C1 increases, E5 moves vertically upward (see Fig. 1(b)), the area of region
E5 E2 E1 E3 increases. Both sides of the game gradually tend to the most unsatisfactory
state, which is not conducive to the benign evolution of the system. C2 refers to the
cost of local government coordination. When C2 increases, E5 level moves to the right
(see Fig. 1(d)), the area of region E5 E2 E1 E3 increases and the area of region E5 E2 E4 E3
decreases, and the probability of the system converging to the ideal mode decreases.
From the above analysis, it can be seen that if the cost is increasing, the strategy choice
of both sides of the game is not conducive to the benign evolution of the system.

Parameter α, β. α refers to the cost-benefit conversion coefficient when the government


faces the problem of agricultural products distribution under the influence of the major
epidemic. When α increases, the E5 level moves to the left (see Fig. 1(c)), the area
of region E5 E2 E1 E3 decreases while the area of region E5 E2 E4 E3 increases, and the
probability of the system converging to the ideal mode increases, which is conducive to
the benign evolution of the system. When β increases, E5 moves vertically downward
(see Fig. 1(a)), the area of region E5 E2 E1 E3 decreases and the area of region E5 E2 E4 E3
increases, and the probability of the system converging to the ideal mode increases,
which is conducive to the benign evolution of the system. The analysis shows that the
greater the cost-benefit conversion coefficient of the government, the greater the benefits
of local government coordination and the stronger its willingness to coordinate, which is conducive to solving the problem of effective distribution of fresh agricultural products during the epidemic period. Similarly, when the value-added amplification coefficient of
logistics enterprises becomes larger, the willingness to make active investment becomes
stronger with the increase of revenue.
To sum up, when the parameters B, C1 and C2 increase, E5 moves up or right. Under
this condition, the system gradually evolves to (0,0); When the parameters A, E, β, D,
F and α increase, E5 moves down or left. Under this condition, the system gradually
evolves to (1, 1).

3.3 Simulation Verification

In order to verify the stability of the equilibrium point, the simulation value is set
according to the relevant constraints, and then the evolution path of logistics enter-
prises and local governments under the influence of the major epidemic is simulated.

Let A = 2.5, B = 2, C1 = 0.7, C2 = 0.9, D = 3, E = 3, F = 0.2, α = 0.6, β = 1.2. According to the simulation results, there are two evolutionary equilibrium points.
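As an assumed stand-in for the paper's MATLAB runs, the following sketch integrates the replicator system under the parameter values above from a few arbitrary initial proportions and reports the state each trajectory approaches, which will be one of the two stable points (0, 0) or (1, 1).

# Minimal sketch: integrate the replicator system and report end states.
import numpy as np
from scipy.integrate import odeint

A, B, C1, C2, D, E, F, alpha, beta = 2.5, 2, 0.7, 0.9, 3, 3, 0.2, 0.6, 1.2

def replicator(s, t):
    x, y = s
    dx = x * (1 - x) * (y * (E + F) + (beta - 1) * A - C1)
    dy = y * (1 - y) * (x * ((1 - alpha) * (D - B) - F) + alpha * (D - B) + F - C2)
    return [dx, dy]

t = np.linspace(0.0, 300.0, 3000)
for x0, y0 in [(0.05, 0.05), (0.3, 0.6), (0.9, 0.9)]:  # arbitrary initial proportions
    final = odeint(replicator, [x0, y0], t)[-1]
    print(f"start ({x0}, {y0}) -> end ({final[0]:.3f}, {final[1]:.3f})")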

Parameter B, C1 , C2 Increases. At this time, the proportion of active investment in


the evolution process of logistics enterprises gradually decreases, as shown in Fig. 2(a);
The proportion of local governments choosing to coordinate is gradually approaching
0, as shown in Fig. 2(b); Finally, the stable point of the two evolutions is (0, 0), that
is, the logistics enterprises choose negative investment, and the government chooses
uncoordinated strategy. At this time, both the government and the logistics enterprises
will passively deal with the effective distribution of fresh agricultural products during
the epidemic period, and the situation is not very ideal, as shown in Fig. 2(c).

Fig. 2. The evolutionary stability results of stable point (0, 0)

Fig. 3. The evolutionary stability results of stable point (1, 1)

Parameter A, E, β, D, F, α Increases. At this time, the proportion of active investment


in the evolution process of logistics enterprises gradually increases, as shown in Fig. 3(a);
The proportion of local governments choosing to coordinate is gradually approaching 1,
as shown in Fig. 3(b); Finally, the stable point of the two evolutions is (1, 1), that is, the

logistics enterprises choose to actively invest, and the government chooses to coordinate.
The government and logistics enterprises will jointly solve the effective distribution of
fresh agricultural products during the epidemic period, which is the ideal situation, as
shown in Fig. 3(c).
The simulation results show that the model is feasible. Accordingly, the develop-
ment suggestions of the participants are given respectively. For logistics enterprises,
the unnecessary waste of some resources can be reduced by increasing the batch of
single transportation, reducing the transportation cost borne by enterprises, or by cutting unnecessary links in transportation and selecting direct transportation where possible, to reduce the cost of loading, unloading and handling of agricultural products.
In addition, by publicizing the humanistic spirit of fighting the epidemic, enterprises can be brought to realize that investing logistics resources in areas with unsalable agricultural products produces social benefits far exceeding those created by the enterprises themselves, stimulating their initiative to invest actively while deepening their corporate culture [11]. For local governments,
we can build an information service platform for the circulation of agricultural prod-
ucts. Relevant departments can publish the current situation of unsalable agricultural
products, circulation progress, resource investment of various logistics enterprises and
rewards obtained after investment here, and ensure the openness, transparency and shar-
ing of information. The government can improve the enthusiasm of resource investment
of logistics enterprises by comparing the resource investment of various enterprises dur-
ing the epidemic period and using the power of public opinion and the psychology of
comparison among enterprises.

4 Conclusion and Suggestion


Against the background of COVID-19, we analyze the game between logistics enterprises and local governments for effectively solving the transportation problems of fresh agricultural products, using an evolutionary game model, and verify the analytical results with MATLAB. Through parameter-configuration simulation, the evolution results obtained during model verification are consistent with the results of the model analysis, which demonstrates the feasibility and accuracy of the model. This study draws the following conclusions:

– The game players must actively participate in the effective transportation of fresh
agricultural products, jointly seek development, and actively adjust their
development strategies.
– From the perspective of the whole system, a third party should be introduced to
supervise the process, so as to promote cooperation between the two sides of the
game and ensure the efficient distribution of fresh agricultural products.

This study aims to effectively solve the distribution problem of fresh agricultural
products during the COVID-19 period, which can provide a theoretical basis for the
prevention and control of other public health events in the future.

Acknowledgments. This work is partly supported by the project supported by the National Social
Science Foundation (16BJY125), Heilongjiang philosophy and social sciences research planning
project (19JYB026), Key topics in 2020 of the 13th five year plan of Educational Science in
Heilongjiang Province (GJB1320276), Project supported by undergraduate teaching leading talent
training program of Harbin University of Commerce (201907), Key project of teaching reform and
teaching research of Harbin University of Commerce in 2020 (HSDJY202005(Z)), Innovation and
entrepreneurship project for college students of Harbin University of Commerce (202010240059),
School level scientific research project of Heilongjiang Oriental University (HDFKY-200202), Key
entrusted projects of higher education teaching reform in 2020 (SJGZ20200138).

References
1. Run, H., Ye, Q.: The enlightenment of epidemic situation on the circulation of fresh
agricultural products. Macroecon. Manage. 09, 34–35 (2020)
2. Sun, W., Xu, M.: Mechanism and empirical study on the impact of agricultural product
circulation system on farmers’ income. Commercial Econ. Res. 11, 126–129 (2021)
3. Pan, J., Guo, M.: COVID-19’s influence on the circulation of agricultural products. Bus. Econ.
Res. 11, 155–157 (2020)
4. Zhang, X.: Novel coronavirus pneumonia impact on agricultural product supply chain and
coping mechanism. Agric. Econ. Manage. 04, 45–51 (2020)
5. Smith, J.M., Price, G.R.: The logic of animal conflict. Nature 246(5427), 15–18
(1973)
6. Liu, Z.: The “double subjects” in the regional innovation system in the transition period:
based on the evolutionary game between the government and entrepreneurs. Econ. Issues 05,
113–122 (2020)
7. Huang, M.: Research on cooperative product development and cooperation mechanism of
supply chain based on evolutionary game. Chin. Manage. Sci. 6(18), 155–162 (2010)
8. Liu, X.: Research on the evolutionary game between government and enterprise under the
fiscal and tax incentive policies – taking small and medium-sized science and technology
enterprises as an example. Tech. Econ. Manage. Res. 10, 16–21 (2019)
9. Friedman, D.: Evolutionary games in economics. Econometrica 59(3), 637–666 (1991)
10. Zhang, J., Xi, X.: Evolutionary game analysis on nitrogen emission reduction of local govern-
ments and enterprises under the trading of river basin emission rights. Chin. Manage., 1–12
(2020)
11. Nie, L., Zhang, L.: Evolutionary game analysis and simulation of government and sewage
enterprises in green technology innovation. Econ. Issues 10, 79–86 (2020)
Study on the 2-Mode Network Characteristics
of the Types and Issuing Places of Chinese
Provincial Green Bonds

Yuanhao Xiao1 and Xiaohong Dong2(B)


1 School of Finance, Harbin University of Commerce, Harbin 150000, China
2 College of Economics and People's Livelihood and Wellbeing, Zhejiang Shuren University,
Hangzhou 130000, China

Abstract. Although China’s green bonds started considerably late, they have
become an important part of the international green bond market. In view of the
development status of green bonds in China, this article uses 2-mode network analysis
to construct the network of the types and issuing places of China's provincial
green bonds. The UCINET software is used to analyze and study the network
centrality and singular value decomposition. This paper studies the functions and
roles of different issuing places in China’s inter-provincial green bond market from
the relationship between the green bond types and issuing places. It is concluded
that all issuing places have generally established connections with the green bond
market, and the eastern region has obvious advantages in the types and quantity
of green bond issuance. There are two specific market clusters in the green bond
market: multi-instrument + central and eastern cluster and green corporate bonds
+ central and western cluster. Some measures are put forward, such as strength-
ening the regional cooperation capacity of green bonds among regions, delegating
power to the lower levels, and stimulating the vitality of green bond issuers.

Keywords: Green bond type · 2-Mode network · Singular value decomposition · Market cluster

1 Introduction
In April 2020, the International Conference on Green Finance Leadership webinar was
successfully held. The concept of green development has gained increasing recognition
in the international community, and against this background green finance has emerged.
In order to promote the rapid development of green bonds, the government has
introduced many relevant policies. In 2017, for example, the “Guidelines on the Assess-
ment and Certification of Green Bonds (Interim)” was issued, which plays an important
role in effectively reducing the assessment risks of green bonds. The development status
and characteristics of the green bond market are analyzed, and its development trend
is forecast (Zhang J.W. 2019). By studying whether domestic green bonds are labeled,
another study analyzes the level of "real green" bonds and puts forward suggestions and
measures such as improving the enthusiasm of “real green” bond issuers, building a


good investment environment for “real green” bonds, and increasing the number of “real
green” bond investors (Liao Y. et al. 2021). Reboredo J.C. (2018) studied co-movement
between the green bond and financial markets, finding that the green bond market cou-
pled with corporate and treasury bond markets weakly co-moves with stock and energy
commodity markets. GARCH model is used to analyze the yield rate of the green bond
market and compare the difference with the yield rate of the traditional bond market, and
the conclusion is drawn that they are negatively correlated (Xu X. and Li Y. 2018). Based
on the vector autoregression model’s forecast error variance decomposition method, Gao
Y. and Li C.Y. (2021) studied the risk spillover effect among the Chinese green bond
market and the traditional fixed income markets, stock markets, and foreign exchange
markets. They found that the risk spillover effect from the green bond market to the
various types of financial markets is stronger than that from other markets. In addition, the risk spillover between the
green bond market and the traditional fixed income market is characterized by great
uncertainty. In the comparative analysis of Chinese and foreign green bond markets, the
literature mostly studies the advantages of foreign green bond markets and the disad-
vantages of domestic green bond markets. Theoretical analysis was used to study the
differences between domestic and foreign green bond markets in bond varieties and
issuing subjects (Gao Q. et al. 2020; Shang W.X. et al. 2017; Shi R. and Chisi S. 2020).
Wang J.Z. et al. (2020) examined the market reaction to China’s green bond issuance.
They found that enterprise participation in sustainable financing practice increases firm
value in the long run and thus is favored by shareholders. Pham L. and Huynh T.L.D.
(2020) analyzed the link between investor attention and the green bond market perfor-
mance. Their analysis revealed that appropriate information and attention for directing
financial flows towards sustainable investment is important.
The relationship between the types of green bonds and issuing places can be regarded
as a 2-mode network. This paper constructs the green bond types and issuing places
affiliation network, analyzes the characteristics of the 2-mode network between Chinese
provincial green bond types and issuing places, and reveals the relationship between
green bond types and issuing places to provide the basis for the cluster of the green
bond market. Issuing places and green bond types have the characteristics of “duality”,
and the relationship between green bond types and issuing places is regarded as a 2-
mode network. The 2-mode network can integrate two sets of issuing places and green
bond types into the same network structure and analyze the green bond market from the
relationship between the two sets and within the sets.

2 Construction of 2-Mode Network of Green Bond Types and Issuing Places

2.1 Data Source and Processing

The relevant data in this paper are mainly from the China Financial Information Network
Green Bond Database. To eliminate interference with the effective analysis of the data,
the 31 provinces (excluding Hong Kong, Macao, and Taiwan) in 2019 were considered,
and the 6 provinces (including Inner Mongolia, Liaoning, Jilin, Heilongjiang, Hainan,
and Ningxia) whose green bond issuance amount was 0 were excluded. Five types of
green bonds and 25 issuing places are taken as the research objects. Therefore, the two
sets of the 2-mode network of green bond types and issuing places established in this
paper contain 5 green bond types and 25 issuing places, respectively, and the network
relationship is represented by the number of green bonds issued in each place. Thus,
the 25 × 5 2-mode matrix X_ij of issuing places and green bond types is obtained. The
affiliation relationship is represented by 0 and 1, and binarization was used to process
the relational data. First, a 2-mode matrix X_ij^1 is established, each entry of which is
the proportion of a given type's issuance amount in the total green bond issuance amount
of that issuing place. Then the average of all values of X_ij^1 is calculated: entries
greater than the average are set to 1, and entries less than the average are set to 0. This
yields the 2-mode matrix X_ij^2, which represents the affiliation network between each
issuing place and the green bond types. The 2-mode network X_ij^2 is shown in Table 1.

Table 1. Green bond type affiliation matrix of some provinces in China

Province F C E D A Province F C E D A
Beijing 0 1 0 0 0 Hunan 0 1 0 0 0
Tianjin 0 1 0 1 1 Guangdong 0 0 1 0 1
Hebei 0 0 1 0 0 Guangxi 1 0 0 0 0
Shanxi 0 0 1 0 0 Chongqing 0 1 1 0 0
Shanghai 1 0 0 0 1 Sichuan 0 1 0 0 0
Jiangsu 0 1 1 0 1 Guizhou 1 1 0 0 0
Zhejiang 1 1 0 0 0 Yunnan 0 0 1 0 0
Anhui 0 1 0 1 0 Xizang 0 0 1 0 0
Fujian 1 0 0 0 0 Gansu 1 0 0 1 0
Jiangxi 1 0 0 0 0 Qinghai 1 0 0 0 0
Shandong 1 1 1 0 0 Ningxia 0 0 0 0 1
Henan 1 0 0 0 0 Xinjiang 1 0 0 1 0
Hubei 0 0 1 0 0
Note: F, C, E, D, and A are abbreviations for finance bonds, corporate bonds, enterprise
bonds, debt financing instruments, and asset-backed securities.
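The proportion-and-binarization step can be reproduced in a few lines; in the sketch below the issuance amounts are random placeholders, not the database figures.

# Building X^1 (within-place proportions) and X^2 (mean-thresholded 0/1 matrix);
# `amounts` is a placeholder for the 25 x 5 issuance amounts (place x bond type).
import numpy as np

rng = np.random.default_rng(0)
amounts = rng.random((25, 5))
x1 = amounts / amounts.sum(axis=1, keepdims=True)  # X^1_ij
x2 = (x1 > x1.mean()).astype(int)                  # X^2_ij: 1 above the overall mean, else 0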

2.2 Network Centrality Measure


Centrality is one of the key concepts of social network analysis and is used to measure
the power of actors in the network. It mainly includes degree centrality, closeness
centrality, and betweenness centrality.
Degree centrality. For an issuing place, degree centrality is the number of green bond
types issued in that place. For a green bond type, it is the number of issuing places to
which that type radiates.

Closeness centrality. For an issuing place, closeness centrality is a function of the
shortest network distances between the green bond types it issues and the other issuing
places and green bond types. Its closeness centrality C_c(n_i) is expressed as follows:

$$C_c(n_i) = \left[1 + \frac{\sum_{j=1}^{g+h} \min_k d(k,j)}{g+h-1}\right]^{-1} \tag{1}$$

In the expression, C_c(n_i) denotes the closeness centrality of issuing place n_i; g
denotes the number of issuing places; h denotes the number of green bond types; k
denotes a green bond type issued by issuing place n_i; j denotes another issuing place
or green bond type; min_k d(k, j) denotes the shortest network distance between k and j.
For a green bond type, closeness centrality is a function of the shortest network
distances from the issuing places it radiates to the other issuing places and green bond
types. The closeness centrality C_c(m_k) of a green bond type is expressed as follows:

$$C_c(m_k) = \left[1 + \frac{\sum_{j=1}^{g+h} \min_i d(i,j)}{g+h-1}\right]^{-1} \tag{2}$$

In the expression, C_c(m_k) denotes the closeness centrality of green bond type m_k;
g is the number of issuing places; h denotes the number of green bond types; i denotes
an issuing place adjacent to green bond type m_k; j denotes another issuing place or
green bond type; min_i d(i, j) denotes the shortest network distance between i and j.
Betweenness centrality. It reflects the extent to which actors in the network control
other actors. An issuing place obtains betweenness centrality only if a pair of green
bond types simultaneously radiate to it. If a bond type radiates to only one issuing
place, that green bond type gets (g + h + 2) "points" of betweenness. The betweenness
centrality C_B(n_i) is expressed as follows:

$$C_B(n_i) = \frac{1}{2}\sum_{n_i, n_j \in m_k} \frac{1}{X_{ij}^{N}} \tag{3}$$

In the expression, C_B(n_i) denotes the betweenness centrality of the issuing place;
n_j denotes another issuing place that shares green bond type m_k with issuing place
n_i; X_ij^N is the number of green bond types shared by n_i and n_j. Similarly, if a
green bond type has the market of only one issuing place, that green bond type gets
(g + h + 2) "points" of betweenness; for each pair of green bond types radiating to the
same issuing place, a green bond type obtains 1/X_kl^M betweenness, where m_l is
another green bond type that shares an issuing place with m_k, and X_kl^M is the
number of issuing places that m_k shares with m_l.
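For readers without UCINET, the three 2-mode centrality indices can be approximated with networkx's bipartite module; the sketch below uses a random placeholder affiliation matrix and hypothetical node labels.

# 2-mode (bipartite) degree, closeness and betweenness centrality with networkx.
import numpy as np
import networkx as nx
from networkx.algorithms import bipartite

rng = np.random.default_rng(0)
x2 = rng.integers(0, 2, size=(25, 5))
x2[x2.sum(axis=1) == 0, 0] = 1                 # ensure every place issues at least one type

places = [f"P{i}" for i in range(25)]          # placeholder labels for issuing places
types_ = ["F", "C", "E", "D", "A"]
G = nx.Graph()
G.add_nodes_from(places, bipartite=0)
G.add_nodes_from(types_, bipartite=1)
G.add_edges_from((places[i], types_[j]) for i in range(25) for j in range(5) if x2[i, j])

deg = bipartite.degree_centrality(G, places)
clo = bipartite.closeness_centrality(G, places)
btw = bipartite.betweenness_centrality(G, places)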

2.3 Singular Value Decomposition

Singular value decomposition (SVD) can reduce the dimensionality of relational data
to uncover the common factors behind it and to partition the structure at a given level.

For the 2-mode network of green bond types and issuing places, this technology can
be used to find the specific market cluster of green bond types and their corresponding
issuing places, helping them form a cooperative market community and accelerate
the development of the green economy. The calculation results for the matrix X_ij^1 using
social network analysis software are shown in Table 2. The contribution rates of the
two singular values in Table 2 are all over 20%, and the cumulative contribution rate is
53.1%, indicating that the index has strong explanatory power.

Table 2. The types of green bonds are affiliation network SVD analysis results

Dimension Singular value Contribution rate/% Cumulative contribution rate/% Ratio


1 4.026 30.2 30.2 1.313
2 3.068 23.0 53.1 1.225
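The decomposition itself is a one-liner; the sketch below assumes, for illustration, that a dimension's contribution rate is its singular value's share of the total (which matches the ratios reported in Table 2), and uses a random placeholder matrix.

# Singular values and contribution rates of the proportion matrix X^1.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.random((25, 5))
x1 /= x1.sum(axis=1, keepdims=True)

s = np.linalg.svd(x1, compute_uv=False)   # singular values, largest first
contrib = s / s.sum()                     # assumed contribution rate of each dimension
print(np.round(s[:2], 3), np.round(np.cumsum(contrib)[:2], 3))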

3 2-Mode Network Analysis of the Types of Green Bonds and Issuing Places

3.1 Analysis of Network Characteristics of the Types of Green Bonds and Issuing
Places

The centrality analysis of matrix X_ij^2 yields the results in Table 3.
According to Table 3, in terms of network centrality, Jiangsu, Shandong, and other
eastern regions have a high network centrality. They are at the core of the green bond
type affiliation network structure, while Qinghai and other central and western provinces

Table 3. Measurement results of green bond issuance network centrality of some provinces

Province Degree centrality Closeness centrality Betweenness centrality


Beijing 3.448 31.868 0.000
Tianjin 10.345 37.662 6.283
Hebei 3.448 30.526 0.000
Shanxi 3.448 30.526 0.000
Shanghai 6.879 36.709 5.452
Jiangsu 10.345 40.845 9.034
Zhejiang 6.879 39.726 3.974
Anhui 6.879 31.448 1.864
Fujian 3.448 32.548 0.000
Jiangxi 3.448 32.548 0.000
Shandong 10.345 49.153 23.238
Henan 3.448 32.548 0.000
Hubei 3.448 30.526 0.000
Hunan 3.448 31.868 0.000
Guangdong 6.879 31.448 2.666
Guangxi 3.448 32.548 0.000
Chongqing 6.879 37.662 4.102
Sichuan 3.448 31.868 0.000
Guizhou 6.879 39.726 3.974
Yunnan 3.448 30.526 0.000
Xizang 3.448 30.526 0.000
Gansu 6.879 34.904 1.726
Qinghai 3.448 32.548 0.000
Ningxia 3.448 28.713 0.000
Xinjiang 6.879 34.904 1.726
Finance bonds 37.931 47.541 44.314
Corporate bonds 34.483 46.032 37.019
Enterprise bond 31.034 43.284 36.248
Debt financing instrument 13.793 33.333 4.407
Asset-backed security 17.241 39.726 13.481

rank at the bottom in network centrality. This shows that green bonds in Jiangsu
and other provinces have developed rapidly, which has a spillover effect on the green
bond market of other provinces. The eastern region used to adopt an extensive mode of
economic development, causing serious ecological damage. Moreover, due to the high
level of financial development, it is easier to promote the issuance of green bonds.
On the contrary, the economic development in central and western regions is rela-
tively backward, and the demand for green financing is small. In addition, the level of
financial development is lower than that in the eastern region, and the basic financial
facilities are backward, and the promotion of green bonds is slower. In terms of the
types of green bonds issued, green financial bonds obtained the largest centrality index.
Green financial bond is the main body of green bond issuance. The results of closeness
centrality and betweenness centrality are also consistent.
In general, green financial bonds, green corporate bonds, green enterprise bonds and
other types of green bonds have obtained large centrality indexes. The issuance of
green bonds has established extensive affiliation relationships with the issuing places.
As Fig. 1 shows, the number of correlations of green financial bonds in the affiliation

Fig. 1. Green bond market structure correlation chart of China

network of green bond types and issuing places is the largest, and the indexes such as
centrality are the largest. Therefore, the structural advantage of green financial bonds is
the most obvious, occupying the core and absolute advantage in the affiliation network
of green bond types.

3.2 Analysis of Green Bond Market Clusters


Multi-instrument + Central and Eastern Cluster. According to Table 4 and Table 5, the
multi-instrument + central and eastern cluster includes 17 issuing places such as Beijing,
Shanghai, Shandong, and Henan, and four types of green bonds: green financial bonds,
green corporate bonds, green debt financing instruments, and green asset-backed
securities. The cluster has many members, mainly the economically better-developed
central and eastern provinces, among which market links have been established.

On the one hand, these bond types place fewer restrictions on issuers, and national
policies support green bonds; for example, on March 6, 2019, the National Development
and Reform Commission and other relevant departments jointly issued a notice on the
"Green Industry Guidance Catalogue (2019 Edition)". On the other hand, the central
and eastern regions have developed economies and more green financial institutions;
in particular, there are many green bond issuers in the eastern region, which makes
these bond types more concentrated in the central and eastern provinces.

Green Enterprise Bond + Central and Western Cluster. According to Table 4 and
Table 5, the green bond type of this cluster is mainly green enterprise bonds, and its
provinces mainly include Hebei, Shanxi, Chongqing, Xizang, and other central and
western provinces. Green enterprise bonds were the fourth-largest type of green bonds
in China in 2019, with great growth potential. Green enterprise bonds are mainly issued by

Table 4. The load value of each distribution on each dimension

D Province Score (dim 1) Score (dim 2) D Province Score (dim 1) Score (dim 2)
1 Beijing 0.156 0.064 Gansu 0.196 –0.278
Tianjin 0.273 0.076 Qinghai 0.142 –0.229
Shanghai 0.205 –0.168 Ningxia 0.063 0.061
Zhejiang 0.298 –0.164 Xinjiang 0.196 –0.278
Anhui 0.209 0.015 2 Hebei 0.102 0.209
Fujian 0.142 –0.229 Shanxi 0.102 0.209
Jiangxi 0.142 –0.229 Jiangsu 0.321 0.334
Shandong 0.400 0.044 Hubei 0.102 0.209
Henan 0.142 –0.229 Guangdong 0.165 0.270
Hunan 0.156 0.064 Chongqing 0.258 0.273
Guangxi 0.142 –0.229 Yunnan 0.102 0.209
Sichuan 0.156 0.064 Xizang 0.102 0.209
Guizhou 0.298 –0.164
Note: Bold indicates the highest score in the dimension. D is the abbreviation of dimension.

Table 5. The load value of the types of green bonds on each dimension

Dimension The types of green bonds Score (dim 1) Score (dim 2)
1 Finance bonds 0.571 –0.702
Corporate bonds 0.626 0.198
Debt financing instrument 0.217 –0.152
Asset-backed security 0.255 0.186
2 Enterprise bond 0.411 0.640

central government agencies, wholly state-owned enterprises, and state-holding
enterprises, and are subject to strict issuance conditions. The number of non-state-owned
green bond issuers in the central and western regions is small, which indirectly raises
the ratio of green enterprise bond issuance to total green bond issuance there.

4 Conclusions and Measures


4.1 Conclusions

Based on a 2-mode network analysis, this paper constructs the affiliation network of
green bond types and issuing places and analyzes its network centrality and singular
value decomposition. The main conclusions are as follows:
On the whole, green bonds are characterized by a market with the central and east-
ern regions as the core and the western regions as the edge. Jiangsu and Shandong
provinces, economically developed and rich in financial resources, show strong advan-
tages in spillover effect. The green bond cooperation among the issuing places needs to
be conducted around specific provinces.
Green financial bonds and green corporate bonds are the types of green bonds with
large issuance. Their centrality is high, indicating that they play a key role in the affiliation
network of green bond types and issue places. They become factors to be considered
when establishing effective cooperative relations between different issuing places in the
green bond market.
According to the results of singular value decomposition, there are two clusters of
issuing places and green bonds with different characteristics: the multi-instrument +
central and eastern cluster and the green enterprise bond + central and western cluster.

4.2 Measures

Increase opportunities for collaborative green bond issuance between issuing places.
Enhancing inter-provincial cooperation in green bond issuance and related activities
will be conducive to increasing the types of green bonds and promoting the develop-
ment of green finance in China. The government should build a unified green bond
platform, increase the number of intermediary services for green bonds, help backward
provinces develop green bonds, and improve the innovation capacity of green bond
financial instruments.
Focus on developing green bond types that are dominant in each issuing place.
According to the centrality and singular value decomposition results, the position of each
issuing place in the green bond market is different, so the positioning of each issuing place
should be clear. According to their respective economic development level, financial
resources, and location advantages, the green bond types with outstanding advantages
should be developed.
The government should streamline administration and delegate powers, stimulate
green bond issuers’ vitality, and promote the development of green bonds in western
provinces. Owing to factors such as location and the economy, the number and vitality
of green bond issuers differ across regions. In the central and eastern regions, green
financial bonds and green corporate bonds are dominant. The government should give
more autonomy to the issuers of green enterprise bonds and simplify their issuance
procedures to advance the development of green bonds in the western region and
promote the balanced development of China's green bond market.

Acknowledgment. Fund Project: China Postdoctoral Science Foundation Project "Research on
measuring the development level of green finance" (Project No.: 2018M631940); Key research
project of economic and social development in Heilongjiang Province “Research on green financial
derivatives supporting the development of green agriculture in Heilongjiang Province” (Project
No.: 18217).

References
Zhang, J.W.: The current situation, characteristics and trend of China’s green bond market. Banker
212(06), 109–111 (2019). (in Chinese)
Liao, Y., An, G.J., Shang, J., Ma, Y.: A report on the operation of China’s real green bond market
in 2020. Bonds 02, 49–55 (2021). (in Chinese)
Reboredo, J.C.: Green bond and financial markets: co-movement, diversification and price
spillover effects. Energy Econ. 74, 38–50 (2018)
Xu, X., Li, Y.: Study on yield spillover effect of green bond market. Time Finance 717(35),
170–171 (2018). (in Chinese)
Gao, Y., Li, C.Y.: The risk spillover effect between China’s green bond market and financial
market. Financ. Forum 01, 59–69 (2021). (in Chinese)
Gao, Q.X., Jiang, L.Y.: Development status of green bonds in foreign countries and its implications
for China. Environ. Sustain. Dev. 044(006), 114–119 (2019). (in Chinese)
Shan, X.W., Li, Y.H., Wang, Y.X.: Analysis on the development status of green bond market at
home and abroad. Mod. Econ. Inf. 000(013), 288–289, 292 (2017). (in Chinese)
Shi, R., Chisi, S.: International development experience of green bonds and its implications for
China. Rural Econ. Sci. Technol. 31(498), 113–114 (2020). (in Chinese)
Liu, F.J., Chen, D.D., Zhu, J.H., Qian, Z., Li, B.B.: The structure and interaction pattern of
China’s inter-provincial inbound tourism source market: based on 2-mode network analysis.
Prog. Geogr. 8(8), 932–940 (2016). (in Chinese)
Wang, J., et al.: The market reaction to green bond issuance: evidence from China. Pacific-Basin
Finance J. 60, 101294 (2020)
Pham, L., Huynh, T.L.D.: How does investor attention influence the green bond market? Finance Res.
Lett. 35, 101533 (2020)
Study on the Influence of Internet Payment
on the Velocity of Money Circulation in China

Xiangbin Liu(B) and Qiuming Liu

School of Finance, Harbin University of Commerce, Harbin 150028, China

Abstract. Internet payment has significantly facilitated people’s consumption.


This paper analyzed the long-term equilibrium relationship between the velocity
of money circulation at various levels in China and the Internet payment fre-
quency, financial electronization, and the intention for saving. An error correction
model was constructed to explore the different short-term influences of these three
factors on the velocity of money circulation at all levels in China. Internet
payment has indeed accelerated money circulation in China, and its accelerating
effect on V 0 is greater than that on V 1 and V 2 . We should further optimize the
division of money at various levels, intensify the control on the velocity of money
circulation, and improve relevant laws and regulations on Internet payment. This
has important practical significance for the development of Internet payment and
the intervention of monetary policy in China.

Keywords: Internet payment · Velocity of money circulation · Impulse response function · ECM model

1 Introduction

Internet payment is a new payment method that relies on network terminals for pay-
ment and settlement, that is, the currency payment or fund flow on the Internet. With
the emergence of Internet payment, new third-party payment products such as Alipay
and Tenpay have been developed, which bring great convenience to people's daily lives
while disrupting traditional payment methods to a certain extent. According to the
statistics from iResearch, the Internet payment transactions in China totaled only 365
billion yuan in the first quarter of 2011, soaring to about 6.3 trillion yuan in the second
quarter of 2019. Fluctuations in the velocity of money circulation will affect China’s
macroeconomic policies, thus reflecting the problems in the country’s overall economic
operation. Electronic money is a form of Internet payment, and its development will
affect the supply and demand of money, thereby weakening the effectiveness of macro
policies [1]. Therefore, the stability of the velocity of money circulation has become a
focus in academic research.


Regarding China's financial conundrum proposed by McKinnon a few years ago,
domestic scholars attributed this problem to the decline in the velocity of money
circulation [2]. In recent years, scholars at home and abroad have paid attention to the impact
lation [2]. In recent years, scholars at home and abroad have paid attention to the impact
of newly emerged Internet payment methods on the velocity of money circulation. The
continuous increase in the scale and utilization of Internet payment transactions will
inevitably affect the velocity of money circulation in China. Therefore, in this paper, the
factors affecting the velocity of money circulation were analyzed, and the instability of
the money circulation velocity was explained; on the other hand, how Internet payment
affects the velocity of money circulation in China and the extent of the influence were
explored, which is of great significance for the supervision on the Internet payment
industry.

2 Literature Review

Foreign scholars have conducted researches on Internet payment early, but most of them
focus on the influence of the development of electronic money on national monetary
policy. Freedman (2000) [3] pointed out that electronic money will substitute for the money
issued by the central bank, thereby affecting the monetary policy formulated by the
central bank. Sullivan (2002) [4] holds the view that both the money multiplier and
monetary base will be affected by electronic money. On the one hand, the ability of the
central bank to supply money will be hit. On the other hand, the central bank’s monetary
policy’s effectiveness will be weakened. Williamson (2003) [5] believes that electronic
money transactions will reduce the utilization rate of cash, and the central bank will
also impact the settlement of electronic money. Al-Laham et al. (2009) [6] reported that
electronic money increases the velocity of money circulation and changes the monetary
multiplier. Hiroshi et al. (2014) [7] stated that electronic money has the advantage of
lower transaction cost compared with paper money. From a fintech perspective, Lee et al.
(2018) [8] argued that mobile payments could narrow the income gap.
Most domestic scholars have explored the influence of Internet payment on the
velocity of money circulation from two perspectives: first, from the perspective of elec-
tronic money, network money, or digital cash, Dong Xin (2001) [9] found that network
money can be a substitute of the money in circulation, hence accelerating the circulation
of money. However, Zhou Guangyou (2006) [10] believes that despite the substitution
effect of electronic money on traditional money, the velocity of money circulation has
not been inhibited from increasing by electronic money. Some scholars also pointed out
the complexity of the influence of Internet payment on the velocity of money circulation.
Yin Long (2000) [11] and Chen Yulu (2002) [12] believe that it is difficult to identify the
impact of electronic money on the velocity of money circulation in the short term, which
increases the difficulty in forecasting the velocity of money circulation. Pu Chengyi
(2002) [13] found that digital cash affects the supply of money since it can be regarded
as a substitute for the cash in circulation, thereby slowing down the velocity of money
circulation in the early stage but accelerating it in the later stage, and the velocity of
money circulation changes in a V-shaped pattern. Second, from the perspective of third-
party payment, Fang Yiqiang (2009) [14] believes that the development of third-party
payment will positively affect the velocity of money circulation, and the continuous

improvement of payment systems will accelerate money circulation. Li Nan et al.
(2014) [15] found that third-party Internet payment affects the capital market and the
fund circulation between it and the real economy and will increase the velocity of fund circulation.
Li Shujin et al. (2015) [16] reported that third-party payment positively affects the veloc-
ity of money circulation, and its accelerating effect on V0 and V1 is greater than that
on V2. Fang Xing et al. (2017) [17] discovered that the use of electronic money and the
size of third-party payments would positively affect the velocity of money circulation.
The literature review found that the velocity of money circulation has been exten-
sively studied at home and abroad, but there are still controversies. Although the research
on the velocity of money circulation from third-party payments has attracted much atten-
tion, some econometric models, especially the different adjustments in the error correc-
tion models, are insufficiently reasonable, and the short-term and long-term effects on
the velocity of money circulation have not been made clear. In this paper, based on the
previous studies, the long-and short-term influence of Internet payment on the velocity
of money circulation was analyzed.

3 Theoretical Analysis
3.1 Analysis of Factors Affecting the Velocity of Money Circulation
Most foreign research on the factors influencing the velocity of money circulation is
based on Fisher’s equation of money quantity, MV = PY. Fisher pointed out that, among
the influencing factors of the velocity of money circulation, financial development and
personal psychology have a prominent impact. Baumol and Tobin believe that financial
innovation and actual income are the primary factors affecting the velocity of money
circulation. Mitchell believes that the level of economic monetization is the main factor
influencing the velocity of money circulation, which coincides with the view of Gold-
smith, who also believes that the level of financial development is the factor affecting
the velocity of money circulation. Among the domestic scholars, Hou Ying et al. (2012)
[18] used variables such as GDP, monetization degree, and inflation rate to investigate the
influencing factors of the velocity of money circulation and found that these influencing
factors maintain a certain equilibrium and stable relationship with the velocity of money
circulation. In Xia Deren’s viewpoint, the degree of economic monetization is the reason
for the decline in the velocity of money circulation [19]. Ai Hongde et al. (2002) [2]
used variables such as economic monetization, the level of financial development, and
savings rate to study the velocity of money circulation and obtained the conclusion that
both the economic monetization and savings rate will accelerate money circulation, but
the level of financial development plays an inhibitory role. Zhou Guangyou (2006) [10]
concluded that among various factors affecting the velocity of money circulation, the
degree of financial digitization, monetary digitization, and cash ratio positively affect the
velocity of money circulation. Zhang Shuming et al. (2007) [20] stated that the degree
of economic monetization and the level of interest rates negatively correlate with the
velocity of money circulation. In contrast, the level of financial modernization and the
savings rate is significantly positively correlated with the velocity of money circulation.
Li Shujin et al. (2015) [16] studied variables such as the level of economic monetiza-
tion and the degree of financial modernization, discovering that third-party payments

produce different effects on the acceleration of money circulation at various levels. Liu
Da (2017) [21] included financial electronization and savings rate as control variables
affecting the velocity of money circulation. The study showed that third-party payments
have a long-term co-integration relationship with the degree of financial electronization
and savings rate. They play a joint role in the velocity of money circulation.
As can easily be seen from the domestic and foreign studies on the factors influencing
the velocity of money circulation, these studies mostly used variables such
as the degree of financial electronization, economic monetization, and the savings rate as
the main factors affecting the velocity of money circulation. At the same time, Internet
payment, as a new payment method, can be considered a financial innovation to a certain
extent. Financial innovation will inevitably affect the velocity of money circulation in
China [22]. Therefore, Internet payment will impact the degree of financial electroniza-
tion, the degree of economic monetization, and the savings rate. The level of economic
monetization is generally measured by the ratio of the broad money quantity to the gross
national product, which has a reciprocal relationship with the dependent variable V 2
selected below [23]. Therefore, the degree of financial electronization and intention to
save were selected as the control variables in this paper.

3.2 The Mechanism of Internet Payment Influencing the Velocity of Money Circulation

Internet payment is essentially the transaction of electronic money between buyers and
sellers. Its low conversion cost can reduce the liquidity between the quantity of money
at various levels, thus blurring the boundaries among M 0 , M 1 , and M 2 . On the one hand,
Internet payment has accelerated the quantity of money at all levels, and on the other
hand, it functions as a substitute for traditional money. Under the combined effects of
these two aspects, Internet payment will inevitably have varying degrees of impact on
the velocity of money circulation at all levels. Particularly, the quantity of money at low-
level M 0 will be replaced by electronic money, and M 0 will be partially converted to the
quantity of money at a higher level. From this perspective, the demand for a low-level
quantity of money M 0 is reduced. According to Fisher’s equation, the velocity of money
circulation V 0 will rise. The changes in the second-level quantity of money M 1 depend
on the quantity of money converted from the lower-level to M 1 on the one hand, and also
hinge on the quantity of money partially converted to the higher-level quantity of money
M 2 . At the same time, part of M 1 is replaced by electronic money. Thus these three parts
will result in the change in the velocity of money circulation V 1 . The changes in the high-
level quantity of money M 2 depend on the part of the quantity of money converted from
M 0 and M 1 on the one hand, and on the other hand, it is partially replaced by electronic
money. Therefore, the changes in the high-level quantity of money M 2 depend on the
effect of these two aspects, and then affect the velocity of money circulation V 2 . The
mechanism of the influence on the velocity of money circulation is shown in Fig. 1.

Fig. 1. Diagram of the mechanism of influence on the velocity of money circulation

4 Model Construction
4.1 Selection of Variables
In this paper, the velocity of money circulation at three different levels V i (i = 0,1,2) was
selected as the explained variable, the Internet payment frequency as the explanatory
variable, and the degree of financial electronization and the intention for saving as the
control variables to establish an ECM model to study the influence of Internet payment
on the velocity of money circulation.

4.2 Variable Description

Velocity of Money Circulation Vi (i = 0, 1, 2). According to Fisher's equation MV =
PY, where M represents the money supply, V is the velocity of money circulation; P
is the price level, and Y is the actual output, V = PY/M can be obtained by the proper
transform of Fisher’s equation. To facilitate processing, the gross domestic product GDP
was adopted to measure PY, and the equation of the velocity of money circulation derived
from Fisher’s equation can be expressed as:
$$V_i = \frac{GDP}{M_i}, \quad i = 0, 1, 2 \tag{1}$$

Internet Payment Frequency (Hereinafter Referred to as IPF). The Internet payment
frequency is defined as the ratio of the Internet payment transaction scale (referred to
as IPTS) to the quantity of money M2; the equation of IPF can be expressed as:

$$IPF = \frac{IPTS}{M_2} \tag{2}$$
Here IPTS includes the Internet payment transaction scale of third parties and domestic
online banks. The ratio indicates the frequency of Internet payment among the various
payment methods: the more people prefer Internet payment, the higher the ratio and
hence the higher the IPF; conversely, the smaller the value, the lower the IPF. The ratio
also shows that the greater the scale of Internet payment transactions in China, the larger
the proportion of Internet payment transactions in the broad money supply. Therefore, the proportion
of Internet payment transactions can reflect the development level of online finance in
China. Internet payment has gradually extended from online payment to offline stores
and other business premises. According to Keynes’s three major motives of monetary
demand, consumers can meet their transaction needs with a small amount of cash, and use
more electronic money for daily transactions. Since 2011, the People’s Bank of China
has adopted a model of issuing Internet payment licenses to standardize the system
of the Internet payment industry. Since then, the scale of Internet payment transactions in
China has increased year by year. Based on the idea of Fisher’s equation of the quantity
of money, the following equation was established:

Mt Vt + Me Ve = PY (3)

Here the subscript t represents traditional money and e electronic money. An increase
in IPF produces two kinds of effects, acceleration and substitution, and from the above
analysis both will raise the velocity of money circulation. The iResearch report also
indicates that third-party Internet payment institutions enable high-speed money
circulation across regions. Based on this, hypothesis 1 was proposed: there is a
positive relationship between IPF and the velocity of money circulation in China.

Finance Electronization (Hereinafter Referred to as FE). As one of the changes brought
by financial innovation, finance electronization has undoubtedly affected the form of
money to a certain extent. The following equation was used to measure FE:

$$FE = \frac{M_2 - M_0}{M_2} \tag{4}$$
Where M 2 represents the quantity of broad money, and M 0 represents the cash in
circulation. This ratio reflects the proportion of non-cash currency in the country. The
larger the increase in this proportion, the higher the level of FE. The transition from
the low-level quantity of money M 0 to the higher-level quantity of money M 1 and
M 2 can be achieved through the development and innovation of financial information
technology. Online transactions conducted by various commercial banks in China are
one of the indicators for measuring FE. According to the data from China Electronic
Banking Network, as of the end of 2019, a total of 21.6 billion electronic payment

transactions had been conducted through online banking, a year-on-year increase of
33.9%, and self-service in commercial banks has become popular in recent years. At
the same time, 95.858 billion non-cash transactions have been completed in China, a
year-on-year increase of 51.23%. FE in China has been increasingly intensified, and
the improved transaction models of commercial banks allow monetary transactions to
be conducted at any time and place, which raises the cost of holding cash and
accelerates the velocity of money circulation. Therefore,
hypothesis 2 was made: FE positively correlates with the velocity of money circulation
in China.
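The three constructed series follow directly from Eqs. (1), (2), and (4); in the sketch below the quarterly inputs are constant placeholders rather than the People's Bank of China, National Bureau of Statistics, or iResearch data.

# Constructing V0, IPF and FE from placeholder quarterly series.
import pandas as pd

idx = pd.period_range("2007Q1", "2019Q3", freq="Q")
gdp = pd.Series(1.0, index=idx)
m0 = pd.Series(0.5, index=idx)
m2 = pd.Series(10.0, index=idx)
ipts = pd.Series(0.05, index=idx)

v0 = gdp / m0          # Eq. (1): V0 = GDP / M0 (V1 and V2 are analogous)
ipf = ipts / m2        # Eq. (2): Internet payment frequency
fe = (m2 - m0) / m2    # Eq. (4): degree of finance electronization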

Saving. The intention for saving reflects people's comprehensive consideration of the
liquidity and the risk of the money they hold. When the intention for saving increases,
people's willingness to hold highly liquid cash or current deposits decreases, and the
velocity of money circulation drops accordingly. Since the intention for saving is a
subjective factor, it is difficult to calculate and measure with specific data, so an
appropriate observable proxy is needed. Accordingly, "the proportion of residents who
prefer saving" was adopted in this paper as a proxy for the intention for saving. The
intention for saving is an individual psychological factor: when people face uncertainty
about their future income, they are "forced to save" and reduce their consumption,
which lowers the velocity of money circulation. Thus, hypothesis 3 was proposed:
SAVING negatively correlates with the velocity of money circulation in China.

The descriptive statistics of each variable are shown in Table 1.

Table 1. Descriptive statistics of variables

Variable (abbr.)                     Samples  Mean    Std. dev.  Minimum  Maximum  Data source                                            Expected sign
Velocity of money circulation (V0)   51       0.8993  0.1254     0.6649   1.1857   People's Bank of China; National Bureau of Statistics  +
Velocity of money circulation (V1)   51       0.1516  0.0144     0.1248   0.1843   People's Bank of China; National Bureau of Statistics  +
Velocity of money circulation (V2)   51       0.0486  0.0086     0.0375   0.0870   People's Bank of China; National Bureau of Statistics  +
Internet payment frequency (IPF)     51       0.0056  0.0046     0.0001   0.0134   iResearch                                              +
Finance electronization (FE)         51       0.9448  0.0129     0.9016   0.9621   People's Bank of China                                 +
Intention for saving (SAVING)        51       0.4248  0.0521     0.2540   0.5400   People's Bank of China                                 −

5 Empirical Analysis
5.1 Stationarity Test

The data from the first quarter of 2007 to the third quarter of 2019 were selected. To
eliminate the influence of seasonal factors, the X-12 seasonal adjustment method was
adopted, with the suffix SA denoting the adjusted variables. Meanwhile, the logarithmic
form of each variable was used to enhance the stationarity of the time series. Since
non-stationary time series may produce spurious regression, the ADF unit root
stationarity test was performed on each variable.
Before the ADF unit root test, it is necessary to check whether each variable has a drift
term or a time trend term. It can be found from Fig. 2 that drift and trend terms should
be added for LNV0SA, LNV2SA, LNIPFSA, LNFESA, and LNSAVINGSA, and only a
drift term for LNV1SA.

Fig. 2. Trends of various variables

The AIC information criteria were employed to determine the lag order of each
variable. The results are provided in Table 2.

Table 2. Test of lag order of each variable

Lag LL LR FPE AIC HQIC SBIC


0 440.723 NA 3.7E−16 −18.4989 −18.410 −18.2627
1 717.910 554.370 1.3E−20* −28.7621* −28.140* −27.1088*
2 746.199 56.579 2E−20 −28.4340 −27.2786 −25.3636
3 771.368 50.336 3.8E−20 −27.9731 −26.2844 −23.4855
4 821.619 100.500* 3.2E−20 −28.5795 −26.3576 −22.6748
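The lag-order comparison in Table 2 can be reproduced with statsmodels; the series below are random placeholders for the four logged, seasonally adjusted variables.

# AIC/BIC/FPE/HQIC lag-order selection for a small VAR system.
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
data = np.cumsum(rng.standard_normal((51, 4)), axis=0)  # placeholder I(1) series
sel = VAR(data).select_order(maxlags=4)
print(sel.summary())  # the information criteria's minima mark the chosen lag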

It can be seen from Table 2 that the minimum values of AIC and SBIC are on the
same lag order. Thus the optimal lag order can be determined as 1. The results of ADF
unit root test are displayed in Table 3. After the first-order difference, all variables at the
1% significance level show no unit root, and all variables are stationary time series.

Table 3. Unit root test on each variable

Variable 1% 5% 10% t statistic p-value


LNV0 SA −4.156734 −3.504330 −3.181826 −7.950918 0.0000
LNV1 SA −3.571310 −2.922449 −2.599224 −2.997490 0.0421
LNV2 SA −4.156734 −3.504330 −3.181826 −5.615375 0.0001
LNIPFSA −4.156734 −3.504330 −3.181826 −0.730784 0.9649
LNFESA −4.156734 −3.504330 −3.181826 −3.686087 0.0328
LNSAVINGSA −4.156734 −3.504330 −3.181826 −3.052128 0.1292
D(LNV0 SA) −4.161144 −3.506374 −3.183002 −7.945147 0.0000
D(LNV1 SA) −3.574446 −2.923780 −2.599925 −7.298664 0.0000
D(LNV2 SA) −4.161144 −3.506374 −3.183002 −8.314906 0.0000
D(LNIPFSA) −4.161144 −3.506374 −3.183002 −5.540109 0.0002
D(LNFESA) −4.161144 −3.506374 −3.183002 −7.212654 0.0000
D(LNSAVINGSA) −4.161144 −3.506374 −3.183002 −7.584809 0.0000
Note: 1%, 5%, and 10% respectively represent the significance level of each variable.
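The test itself can be sketched as follows, with regression="ct" supplying the drift and trend terms described above; the input is a placeholder random walk.

# ADF unit root test with automatic AIC lag selection.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
series = np.cumsum(rng.standard_normal(51))  # placeholder for, e.g., LNV0SA
stat, pval, usedlag, nobs, crit, icbest = adfuller(series, regression="ct", autolag="AIC")
print(stat, pval, crit)  # compare the statistic with the 1%/5%/10% critical values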

5.2 Co-integration Test

All variables are stationary after the first-order difference, i.e., they are integrated of
order one, so a long-term stable relationship, that is, a co-integration relationship, may
exist among them. A test was performed to check this possibility. The main co-integration
test methods are the EG-ADF test and the Johansen test. The EG-ADF test cannot handle
the case of multiple co-integration relationships; besides, it consists of two steps and
can produce large errors. Therefore, the Johansen test was selected in this paper. The
test results are shown in Table 4. When the critical level of all variables is 5%, the trace
statistic for the null hypothesis "there is no more than one co-integration relationship"
exceeds the 5% critical value, so this hypothesis is rejected, while the null hypothesis
that "there are no more than two co-integration relationships" is accepted, indicating
that the velocity of money circulation at each level has a co-integration relationship
with IPF, FE, and SAVING.

Table 4. Test results of co-integration relationship

Variable Null hypothesis Eigenvalue Trace statistic 5% critical level P value**


LNV0 SA There is no 0.4936 69.8045 47.8561 0.0001
co-integration
relationship*
There is no more 0.4050 36.4659 29.7971 0.0074
than one
co-integration
relationship*
There are no more 0.1862 11.0221 15.4947 0.2101
than two
co-integration
relationships
There are no more 0.0187 0.9273 3.8415 0.3356
than three
co-integration
relationships
LNV1 SA There is no 0.4585 62.8368 47.8561 0.0001
co-integration
relationship*
There is no more 0.3298 32.7724 29.7971 0.0221
than one
co-integration
relationship*
There are no more 0.1645 13.1657 15.4947 0.1089
than two
co-integration
relationships
There are no more 0.0851 4.3566 3.8415 0.0369
than three
co-integration
relationships
LNV2 SA There is no 0.4688 66.4498 47.8561 0.0004
co-integration
relationship*
There is no more 0.3731 35.4562 29.7971 0.010
than one
co-integration
relationship*
There are no more 0.1722 12.5750 15.4947 0.1314
than two
co-integration
relationships
There are no more 0.0654 3.3152 3.8415 0.0686
than three
co-integration
relationships
Note: * means the null hypothesis is rejected at the 5% significance level; ** means the
determination of p value by Mackinnon-Haug-Michelis (1999) is used for reference.
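The Johansen trace test can be sketched with statsmodels' coint_johansen; here `data` is a placeholder for the 51 × 4 matrix [LNV0SA, LNIPFSA, LNFESA, LNSAVINGSA].

# Johansen trace statistics and 5% critical values.
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(0)
data = np.cumsum(rng.standard_normal((51, 4)), axis=0)  # placeholder series
res = coint_johansen(data, det_order=0, k_ar_diff=1)
print(res.lr1)        # trace statistics for r <= 0, 1, 2, 3
print(res.cvt[:, 1])  # corresponding 5% critical values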

5.3 Error Correction Model

If there is a long-term equilibrium relationship among the related variables, short-term
changes in the variables are partially adjusted back towards that long-term equilibrium;
this idea is embodied in the error correction model. Since the variables are co-integrated,
a long-term equilibrium relationship exists, but it is maintained through continuous
correction of short-term fluctuations, so an error correction model is needed for further
analysis. Before constructing the model, it is necessary to determine the co-integration
rank of the system, that is, the number of linearly independent co-integration vectors.
The above analysis shows two co-integration relationships for each system, so the rank
is set to 2. The results of the error correction model were obtained with EVIEWS 10.0,
as follows:

D(LNV0SA) = −0.019907 + 0.267716 D[LNIPFSA(−1)] + 7.540770 D[LNFESA(−1)] − 0.283936 D[LNSAVINGSA(−1)] − 1.084777 ECM(−1)    (5)
t statistics: (−1.040461), (2.104173*), (3.031038***), (−1.600566**), (−7.044151***)

D(LNV1SA) = −0.024428 + 0.245697 D[LNIPFSA(−1)] + 4.189033 D[LNFESA(−1)] − 0.133990 D[LNSAVINGSA(−1)] − 0.566431 ECM(−1)    (6)
t statistics: (−1.800968*), (2.758398***), (2.378673**), (−1.093107**), (4.323493***)

D(LNV2SA) = −0.020174 + 0.247099 D[LNIPFSA(−1)] − 5.798657 D[LNFESA(−1)] − 0.209610 D[LNSAVINGSA(−1)] − 1.107563 ECM(−1)    (7)
t statistics: (−1.247258), (2.301570**), (−2.706620***), (−1.424920**), (−7.567724***)

Note: the numbers in the brackets represent t statistics, and *, **, *** represent 10%,
5%, and 1% significance levels, respectively.
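As a rough cross-check of the EViews estimates, the model can also be fitted with statsmodels' VECM at co-integration rank 2; the data are placeholders as in the Johansen sketch.

# Vector error correction model with co-integration rank 2.
import numpy as np
from statsmodels.tsa.vector_ar.vecm import VECM

rng = np.random.default_rng(0)
data = np.cumsum(rng.standard_normal((51, 4)), axis=0)  # placeholder series
fit = VECM(data, k_ar_diff=1, coint_rank=2, deterministic="ci").fit()
print(fit.alpha)  # error-correction (adjustment) coefficients, cf. the ECM(-1) terms
print(fit.gamma)  # short-run coefficients on the lagged differences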
The elasticity coefficient of the first-order lag of error correction term ECM is sig-
nificant at the 1% level, indicating that IPF, FE, and SAVING all significantly affect the
velocity of money circulation at all levels. When the velocity of money circulation at
each level deviates from the long-term equilibrium in the short term, adjustment
coefficients of −1.0848, −0.5664, and −1.1076 restore this long-term equilibrium
state. From the error correction model, it can be seen that IPF is significantly positively
correlated with the velocity of money circulation at all levels at the 10% significance
level, which is consistent with Hypothesis 1. The elasticity coefficient of IPF shows that
its accelerating effect on V 0 is stronger than that on V 1 and V 2 . This is mainly because,
in the short-term, Internet payment reduces the demand for cash held by people so that
it has a greater positive, stimulating effect on V 0 than it has on V 1 and V 2 .
FE shows a positive promotion effect on V0 and V1 but a negative inhibitory
effect on V2, which is not consistent with Hypothesis 2. There may be the following rea-
sons. With the increasingly strengthened financial electronization in China, residents
or companies have a positive expectation on the financial products with preventive and
speculative motives covered by money supply M 2 , such as financial bonds, commercial
paper, and transferable certificates of deposit, so they tend to accept these financial prod-
ucts. From this perspective, the holding of these financial products is relatively stable,
and it may lead to a decrease in the velocity of money circulation V 2 .
The estimation results of the model reveal that the residents’ intention for saving has
a significant negative correlation with the velocity of money circulation at all levels at
the 5% level, which is consistent with Hypothesis 3. Since the error correction model
is not applicable to the analysis of the changes in each variable, the impulse response
function was used to analyze the dynamic fluctuations in each variable.

5.4 Impulse Response Function Analysis

It can be seen from Fig. 3 that in the impulse analysis with lag periods of 20, the impact of
V 0 on itself in the current period is 11.2%, and then drops rapidly to −1.8% in the second
period, after which the value fluctuates and remains at 1%. Regarding other variables

Fig. 3. Impulse response of LNV0SA

Regarding the other variables influencing V0, the impact of IPF, FE, and SAVING on V0 in the current period is zero. The impact of IPF on V0 is −3.7% in the second period, reaches a peak of 1.9% in the fourth period, and then fluctuates until the 18th period, stabilizing at the 0.6% level; the impact of FE reaches a bottom of −2.5% in the second period, fluctuates up to a peak of 1.2% in the third period, and finally remains at −0.4% from the 17th period; the impact of SAVING reaches a bottom of −4.1% in the fourth period and fluctuates until the 16th period, when it finally stabilizes at −0.18%. Specifically, the impact of IPF on V0 increased from −3.7% in the first quarter of 2008 to 1.9% in the fourth quarter of 2009. The impact of FE reached its minimum of −2.5% in the first quarter of 2008 and its maximum of 1.2% in the first quarter of 2009. The minimum of SAVING, whose impact gradually showed after the second quarter of 2009, was −4.1%.
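As a sketch of how such impulse responses can be generated, the following Python fragment (reusing the hypothetical data frame df from the ECM sketch above) fits a VAR to the four log series and computes orthogonalized responses over 20 periods. The VAR form is an assumption for illustration; a VECM would be the closer counterpart given the cointegration found above.

from statsmodels.tsa.api import VAR

var_res = VAR(df[["lnV0", "lnIPF", "lnFE", "lnSAVING"]].dropna()).fit(maxlags=4, ic="aic")
irf = var_res.irf(20)                       # 20-period horizon, as in Figs. 3 and 4
irf.plot(impulse="lnIPF", response="lnV0")  # e.g. the response of V0 to an IPF shock
print(irf.irfs[2, 0, 1])                    # response of lnV0 to an lnIPF shock after two periods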

[Figure: impulse responses of LNV1SA, LNIPFSA, LNFESA, and LNSAVINGSA plotted over periods 2-20]

Fig. 4. Impulse response of LNV1SA

In the impulse analysis of V1, a horizon of 20 periods was likewise used. It can be seen from Fig. 4 that the impact of V1 on itself in the current period is 9.1%; it then drops to a bottom of 3.6% in the second period and keeps fluctuating until the 12th period, when it finally stabilizes at 4.2%. The influence of the other variables on V1 was analyzed similarly, and their impact on V1 in the current period is zero. The impact of IPF on V1 reaches a maximum of 0.7% in the second period, after which it fluctuates until the 11th period, maintaining a 0.1% level; FE has a 2.2% impact on V1 in the third period, which drops to a minimum of 0.2% in the fourth period, finally stabilizing at 0.8% from the 13th period; the impact of SAVING reaches a maximum of 2.3% in the third period, after which the value declines until the 15th period, when it finally remains at 1%. Specifically, the impact of IPF on V1 reached a peak of 0.7% in the first quarter of 2008. From the fourth quarter of 2008 to the third quarter of 2009, the impact of FE decreased from 2.2% to 0.2%. SAVING declined from a peak of 2.3% in the fourth quarter of 2008 until stabilizing at 1% around the fourth quarter of 2017.

[Figure: impulse responses of LNV2SA, LNIPFSA, LNFESA, and LNSAVINGSA plotted over periods 5-50]

Fig. 5. Impulse response of LNV2SA

In order to better show the dynamic changes in the impulse responses of each variable, a horizon of 50 periods was selected for the impulse response analysis of V2. From Fig. 5, it can be found that the impact of V2 on itself in the current period is 0.9%, after which it keeps fluctuating until it gradually stabilizes at 3.3% in the 46th period. Regarding the other variables affecting V2, the impact of IPF, FE, and SAVING in the current period is zero. The impact of IPF reaches a peak of 1.9% in the fourth period, then keeps fluctuating and gradually decreases until it finally stabilizes at 1.3% in the 38th period; the impact of FE drops to a bottom of −3.2% in the second period and then keeps fluctuating until the 44th period, when it finally maintains a level of −1.7%; the impact of SAVING reaches a maximum of 1.7% in the third period, falls to −0.8% in the fourth period, and reaches a steady state in the 40th period, remaining at the 0.4% level. Specifically, the impact of IPF on V2 peaked at 1.9% in the fourth quarter of 2007; the impact of FE reached its lowest level of −1.7% in the second quarter of 2007; and the impact of SAVING peaked at 1.7% in the third quarter of 2007.

6 Conclusions and Policy Suggestions


In this paper, the factors affecting the velocity of money circulation in China, namely the Internet payment frequency (IPF), financial electronization (FE), and the intention for saving (SAVING), were thoroughly studied and analyzed through the cointegration test, the ECM model, and impulse response analysis. On the whole, the Internet payment frequency and financial electronization are positively correlated with the velocity of money circulation in China, while the intention for saving shows a negative, inhibitory effect on it. In the long term, the velocity of money circulation at all levels in China maintains a stable relationship with the Internet payment frequency, financial electronization, and the intention for saving. From the ECM model and the impulse response analysis, it can be concluded that these three factors influence the velocity of money circulation in China to varying degrees in the short term. With the expansion of the scale of Internet payment, the early simple attribution of McKinnon's Chinese financial conundrum to the decline in the velocity of money circulation is no longer applicable; complex factors lie behind the velocity of money circulation.
In the long run, there is a stable relationship between the activity of Internet payment and the velocity of money circulation at all levels in China. In the short run, Internet payment indeed promotes V0, V1, and V2; it can therefore be said that Internet payment has accelerated money circulation in China. However, the accelerating effect of IPF on V0 is stronger than that on V1 and V2, which can be ascribed to two aspects: first, the scale of Internet payment transactions is constantly expanding; second, Internet payment leads to the conversion of low-level money into high-level money, which gradually blurs the boundaries between the quantities of money at the various levels. When the velocity of money circulation at any level deviates in the short term from its equilibrium and stable relations with these factors, greater efforts are required to bring it back to the equilibrium state, which increases the instability of the velocity of money circulation and thereby affects the formulation of monetary policies. The following suggestions are proposed:
First, the monetary aggregates at all levels can be subdivided. The rapid development of Internet payment and the increasing scale of transactions will affect the division of the monetary aggregates at the various levels, thereby influencing the effectiveness of monetary policies, and will pose certain challenges to the supervision of Internet payment. With the expansion of the scale of Internet payment and the strengthening of financial electronization, the monetary authority should further optimize the division of the monetary aggregates at all levels, take Internet payment into account, and use it as a reference for formulating policies.
Second, it is necessary to understand the impact of the velocity of money circulation on monetary policies. Since the velocity of money circulation has an important influence on the effectiveness of monetary policies, the money supply should be considered together with the various factors affecting the velocity of money circulation when monetary policies are formulated. In addition, attention should be paid to the impact of changes in the velocity of money circulation on the economy. When economic operation is under pressure such as inflation, the central bank needs to respond to the velocity of money circulation and form corresponding expectations about its changes in order to accurately regulate the money supply and mitigate such economic pressure.

Third, laws and regulations on Internet payment should be improved. China's central bank needs to fulfill its supervisory duties to ensure the proper development of Internet payment; it can neither excessively suppress the development of Internet payment nor allow it to develop uncontrollably. Characterized by convenience of use and low transaction costs, Internet payment itself plays a certain role in promoting the sound development of China's economy. Therefore, China's central bank should encourage, support, and correctly guide the development of Internet payment in China, and take preventive measures and develop relevant policies to solve the problems arising in its development.

System Ordering Process Based on Uni-, Bi-
and Multidirectionality – Theory and First
Examples

Bernhard Heiden1,2(B) , Bianca Tonino-Heiden2 , and Volodymyr Alieksieiev3


1 Carinthia University of Applied Sciences, 9524 Villach, Austria
b.heiden@cuas.at
2 University of Graz, Heinrichstraße 25/VI, 8010 Graz, Austria
3 Leibniz University Hannover, An der Universität 1, 30823 Garbsen, Germany

Abstract. Decentral self-organization versus central organization is an ancient dichotomy, one that also underlies the emerging osmotic paradigm. The motivation of this paper is to gain insight into this dichotomy in the knowledge areas of production logistics and epistemology. We investigate how the thought concept of uni-, bi- and multidirectionality can be used as a general explanatory tool for order and for this dichotomy. Exemplary applications of these theses are then given in production logistics and, as a generalization, in epistemology.

Keywords: Osmotic paradigm · Multidirectionality · Epistemology

1 Introduction
A theory related to whole-part relations is the holon theory of Arthur Koestler [1], who first coined the word holon, which stands for the whole and the part in one word. In later developments, especially in production and computational research, the holon was also related to the then-emerging decentral production. Some approaches like holon theory and similar ones like bionic and fractal manufacturing are related primarily to the decentralization problem of flexible production [2]. The theory of self-organization, on the other side, is related to nonlinear phenomena, especially the chaos-theoretic approaches, e.g., by Prigogine [3], Lorenz, van der Pol, laser theory, and the synergetics of Hermann Haken [4, 5], as well as interdisciplinary approaches [6]. This theory can combine the natural sciences and the social sciences through increasingly nested relations. Chaos theory, on the other hand, deals with deterministic approaches in a rather mathematical form, and this may be a first step towards understanding nature more deeply. The interdisciplinary methods inherited from chaos theory, self-organization theory, and systems theory also use Einstein's idea [7] of relating different applications or disciplines with the same mathematical expressions.
An example is the logic calculus, beginning, e.g., with first-order predicate logic [8], which allows us to combine very different areas of knowledge in language. The origin or orgiton theory framework is in the development process [14, 15] and consists of origins, orgitons or cybernetical units of mass, energy, and information. The hierarchic ordering there is quite similar to holon theory [1]. However, the focus is more on lively functional elements, although in its generalization it is universally applicable and can best be understood by a dynamic graph theory approach, in which nodes and edges are interchangeable and can thereby be regarded as phase changes.
This paper aims to argue for the plausibility of uni-, bi- and multidirectional order increase, a proposition of orgiton theory, as a systemic property that can be motivated by the no-intersection theorem of chaos theory. We then further want to answer how to use this in the epistemology of science as a sustainability framework, as well as in logistics and production modeling. The method used is a system-theoretic argumentation introducing and using basic axioms and principles.
The paper begins in Sect. 2 with the definition of the higher-order gaining principles. In Sect. 3, we then give an application concerning an epistemology of science. Next, in Sect. 4, we provide the application in production and logistics, with an orgiton model in principle, Witness modeling, and some first considerations towards a further generalization. Finally, in Sect. 5, we conclude and give a short outlook on the extension of this approach towards more connected graphs.

2 Higher-Order Gaining Principle

The no-intersection theorem in chaos theory states, according to [5; p. 76]: “Two distinct
state-space trajectories cannot intersect (in a finite period of time). Nor can a single
trajectory cross itself at a later time.”
This theorem is closely connected to differential equation systems of a certain order and to the Lipschitz condition of being continuous and differentiable to a certain degree. This restricts the application to differentiable functions and the assumption of continuity, which means being related or having a "connection".
Relating this to the state space of the Lorenz equations in Fig. 1, we can draw in the lower two-dimensional diagrams a two-directional mapping, which indicates a snapshot of bidirectionality; in this case a bi-bi directionality, as the Lorenz attractor has two fixed points or nodes. The seeming intersection in the lower dimensions expresses the coupling, which is invariant in all the views but changes with perspective, as the information is incomplete in the lower dimensions. Due to the no-intersection theorem, we gain the complete and integrated information only by integration, hence the unidirectional trajectory in the state space.
Still, in the meta-view, there occurs a unifying process towards what we understand as one system. The important fact is that we see by this mechanism that a system can be coupled along more than one dimension, meaning interdependence, bidirectionally. Altogether, this connectivity can be observed in lower dimensions from different perspectives as the synchronicity of a specific order. The views may then change the relative directionality but not the orgitonal order of unidirectionality (⇀), bidirectionality (⇌), or multidirectionality, the latter meaning a possible multiplicity with at least two elements; more precisely, in addition to the change of direction, also at least one repetition in the same direction. The further increase in production efficiency can then be understood in general as the osmotic principle [9–11], which denotes a multidirectional path in, e.g., logistics and production. In biology and biotechnology, it leads in the long term to a maximisation of production in diverse, or as we would say multidirectional, ecological niches [12]. Here, the complementary principle of nature also applies, which has recently been rediscovered by Grossberg [13] in the neuro-computational modeling of our brain.

Fig. 1. Lorenz attractor: solutions for r = 25, p = 10, b = 8/3, with initial values x2(0) = 0, y2(0) = 1, z2(0) = 20; x2, y2, z2 are the state-space coordinates and t1 the time variable (0..15). The Lorenz equations according to [5] are ẋ2 = p·(y2 − x2); ẏ2 = r·x2 − y2 − z2·x2; ż2 = x2·y2 − b·z2. The arrows depicted here denote the directional state of a state trajectory, i.e., its momentary dynamic movement. The simulation shown above is done with Mathcad 15, analogously to [5]. To demonstrate the meaning of directionality in state-space trajectories, arrows are depicted that indicate the directionality with respect to time: each point pair connected by one arrow corresponds to a time sequence of points. Hence each arrow represents unidirectionality (⇀) in time as well as in the state-space room. We can also see that a higher-dimensional direction may appear different in the lower dimensions while simultaneously conserving the bidirectional coupling or bidirectionality (⇌). The Mathcad 15 file is provided in [26].
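The Fig. 1 trajectory can also be reproduced outside Mathcad; the following minimal Python sketch, a hypothetical re-implementation using the parameters from the caption (the original Mathcad 15 file is in [26]), integrates the Lorenz system with scipy.

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

p, r, b = 10.0, 25.0, 8.0 / 3.0  # parameters from the Fig. 1 caption

def lorenz(t, s):
    x, y, z = s  # the Lorenz equations as given in the caption
    return [p * (y - x), r * x - y - z * x, x * y - b * z]

sol = solve_ivp(lorenz, (0.0, 15.0), [0.0, 1.0, 20.0], dense_output=True)
x, y, z = sol.sol(np.linspace(0.0, 15.0, 5000))

ax = plt.figure().add_subplot(projection="3d")
ax.plot(x, y, z, lw=0.5)  # one single, non-self-intersecting state-space trajectory
plt.show()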

In orgiton theory [14, 15], a self-organization theory based on cybernetic units that can be understood as a cyclic interconnection between mass, energy, and information as potential variables, there is the notion of uni- and bidirectionality, which indicates the potential increase in order with increasing orgitonal evolution, or indices. Relating this to the production and logistics area, the following can be stated:

Axiom 1: Order is potentially increased by diversifying the modi operandi (options or degrees of freedom of action or state).

With relation to the Lorenz equations, which may be valid more widely in nonlinear applications, Axiom 2 can be formulated:

Axiom 2: Its coupling (of the Lorenz equation) shows in the "synchronicity" or "state-space parallelity" of the autocorrelation function.

Figure 1 shows that the two points depicted there correlate with each other after specific time intervals, as they approximately repeat their trajectory. The same holds analogously for the parallelity in state space. When looking at the same points from different perspectives, as done in Fig. 1, we see that those points' properties may change their movement with respect to other observation parameters. However, they stay coupled to one "single line", which in effect is the meaning of the no-intersection theorem. In the higher dimension, in this case three-dimensional, the order becomes "unidirectional", although in two dimensions it yields different shapes of directions. The coupling is then indicated in the higher-order unification.
This means that the bifurcation of directions, and their repetitive approximative behavior in lower dimensions, indicates order, self-order, or autocorrelation in time and over the state space.
An interesting note may be made on the autocorrelation function, which can be regarded as bidirectionality related to a self-cycle, meaning that the trajectory returns approximately to itself (cf. [27, 28], where this is related to order increase). In cybernetics, this is also called a feedback cycle. Hence we see that a feedback cycle is a bidirectional entity and can also be regarded as a holon or orgiton. Rudolf E. Kálmán, in his famous paper [16], related this to his important terms of observability (unidirectional) and controllability (bidirectional). A similar notion is given in management theory by Otto Scharmer in Theory U, where he denotes a cycle of a creative act using first observing and then prototyping, which leads to effective, innovative products [17, 18]. These products then mean a higher-order step.
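Continuing the Lorenz sketch above, the autocorrelation reading of bidirectionality can be made concrete as a computation (an illustrative assumption of ours, not a computation from the cited works): the normalized autocorrelation of one state coordinate exhibits secondary peaks where the trajectory approximately returns to itself, i.e., the self-cycle.

x_c = x - x.mean()
acf = np.correlate(x_c, x_c, mode="full")[x_c.size - 1:]
acf /= acf[0]  # lag-0 normalization; secondary peaks indicate approximate self-returns
plt.plot(acf)
plt.show()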
So, simultaneously, two variables are coupled in the same or in the opposite direction when a cycle occurs. The overall cyclic behavior then leads to the abstraction of the two-directional (⇌).
As seen in the Lorenz attractor, different states can simultaneously indicate even higher order, each achieved by bifurcation. Thus, there are two node attractors in this case, leading to two coupled cyclic behaviors in one meta-coupled trajectory. This can be understood as a bifurcated bidirectional trajectory, or a bi-bi directional one.

Axiom 3: A bidirectional trajectory (⇌) can be interpreted as a one-step higher dimensional coupled unidirectional (⇀) one.

Due to Axiom 3:

Principle 1: (T) The bidirectional is a result of (a) increasing complexity in lower dimensions, or more specificity of particular elements, and (b) coupling through the higher dimension.

In general, hence:

Axiom 4: (T) The higher bidirectional or multidirectional is both coupling and reduction (⇋⇀).

The coupling links the "two directions" by means of the higher order: the sum function, integral, etc. The higher order gives the system more momentum (see also Sect. 3) and hence increases its Echtzeit (real-time), or range of Raumzeit (room-time) validity. This growth of the system in the room-time range is also evidence of the solidification-fluidization theorem [19], which links the growth of an open system with a larger scaling range of controllability.

Principle 2: Principle 1 can be extended, now understood as an algorithm, systematically towards higher dimensions, thereby yielding a higher original or orgitonal state, or higher-order holons or orgitons.

So Principle 2 is in itself a bidirectional self-extension or meta-view.

3 Bidirectionality in Science

According to Fig. 2, the knowledge-gaining process is divided by the time arrow. We can imagine dual parameters A and A' in the imaginary dimension, like ideas, innovations, applications, and natural science or basic research. A and A' are complementary, analogously to the imaginary axes in the complex Gaussian number plane. The parameters and cones are connected in the present, or presencing, point of the continuous time arrow. So this duality of antipodes is a matter of coupled bidirectionality. Figure 2 can also be denoted as a Sanduhr (sandglass), indicating real-time processes with different cone angles for many influencing parameters.

Fig. 2. Epistemological interpretation of the intersection and no-intersection ambivalence of real processes, especially for emergent knowledge generation.

To motivate a development that emergently creates higher knowledge, the differences between parameters A and A' are a driving force. This especially pictures the necessity of coupled bidirectionality as a driving force: bidirectionality of a parameter of the same size but with opposite direction.
So the living process comprises pragmatic and theoretical needs, which are mutually active. When asked which parameter is more necessary, the answer is: both [20]. The same applies to forms of truth. Pragmatic truth, semantic truth, and syntactic truth are three forms of truth. Syntactic truth is related to structure, especially in disciplines like mathematics, law, and logic. Semantic truth is associated with Tarski and Quine (e.g. [21, 22]). Finally, pragmatic truth is related to immediate action if no theory is available or the available theory is insufficient.
As a consequence of Fig. 2, these are drivers for epistemological elements, forming, depending on the configuration, knowledge solidification. Moreover, the term "truth", or practical relevance, refers to the momentarily valid attractors, indicated by the snapshot arrows in Figs. 1 and 2. The attractors, vice versa, are the relative invariants of the dynamic system. If there is a decoupling, the process depicted does not proceed, or loses momentum, as indicated by the curve-shaped momentum arrows in Fig. 2.
The size of the system, finally, is also part of the system's development properties. As denoted in [19], and in Axiom 4, the system continues and increases its real-time realization process by gaining inertia. This can be social mass as well as physical mass. The total system becomes increasingly coupled and higher dimensional, meaning that it has an overall potential or "meta"-order.

With regard to sociology, Luhmann comes to an interesting result [23; p. 484 f. and FN 205]: there are two process types in social systems for structural change, the teleological and the morphogenetic. The first is bidirectional and the second unidirectional with respect to selectivity increase, and he supposes them to describe all possible structural changes in evolution. According to our theory, the system order can then progress further towards the multidirectional.

4 Sustainable Energy Conficiency Model


In this section, we first motivate a most simple orgiton model, which shall later be extended in scale in order to discriminate more deeply between central and decentral organization, corresponding to the previously given discussion of uni-, bi- and multidirectionality and their order relations. The topic in question, which we understand as order here, is how energy efficient such systems are and what this means regarding the no-intersection theorem and, respectively, the uni-, bi- or potentially multidirectional order-increasing proposition of orgiton theory.

4.1 Orgiton Modelling

This part describes the elementary, scalable production or logistics orgiton model used for studying energy usage, simulated in the simulation software Witness. The basic model consists of three nodes (Fig. 3a). The advanced models consist of four, virtually three, nodes (Fig. 3b) and of four nodes (Fig. 3c). The idea is to provide a model that describes the central-decentral dichotomy or problem as a holon or an orgiton. For this, it has the properties of uni- (⇀), bi- (⇌) and multidirectionality, especially when nested from lower-order forms (cf. also Fig. 3a-c).

4.2 Witness Simulation Example

The orgiton model, modelled in the production and simulation software Witness, is based on the basic facility location problem (cf. e.g. [24]) and consists of (1) a producer, (2) a supplier, and (3) a customer in three possible arrangement combinations (see Fig. 3).
In Fig. 3a, centralised production facilities ("Producer") are shown. The raw materials are delivered to the production from the supplier ("Supplier"). The production process runs in one single location, and the finished products are afterwards delivered to the customer ("Customer"). The above-mentioned transportation processes are performed using trucks ("Truck1", "Truck2") on the routes ("R1", "R2"). Since return shipments are not taken into consideration for this special case, this leads us to consider the relations "Supplier"-"Producer" and "Producer"-"Customer" as unidirectional. However, the return travel is included through the factor 2 in the "PC" equation.

Fig. 3. Elementary scalable production or logistics orgiton model in Witness: a) with centralised production facilities; b) with decentralised coupled production facilities; c) with decentralised coupled and "meta"-uncoupled production facilities. The Witness 14 files are provided in [26].

In Fig. 3b, decentralised coupled production facilities are shown. In this example, the raw materials are delivered from the supplier ("Supplier") to the first production facility ("Producer1"), where the first production phase is performed (e.g. pre-assembly). Afterwards, the semi-finished products are transported to the second production facility ("Producer2"), where the final product is completed. They are then transported to the customer ("Customer"), similar to the example shown in Fig. 3a. The transport processes are performed with three trucks ("Truck1", "Truck2", "Truck3"). In this example, routes "R1" and "R3" can, similar to the example shown in Fig. 3a, be considered unidirectional. However, we consider route "R2" as bidirectional, since both facilities are considered as one company and hence are coupled. This refers to Axiom 4, described above. According to it, bidirectionality is both coupling and reduction, since in coupled facilities the interaction level grows (cf. Principle 1), leading to more cooperation. Practically this means, e.g., the dynamic allocation of production volumes to reduce peak utilisation.
In Fig. 3c, decentralised coupled and "meta"-uncoupled production facilities are shown. The two production facilities are decentral and cooperative, which we also call conficient. The meta-relation from customer to producer is meta-uncoupled once triggered, and there is a meta-uncoupling inside the company as central control. Here, the raw materials are delivered with a truck ("Truck1") from the supplier ("Supplier") to the two production facilities ("Producer1", "Producer2"), which are independent of each other and where the whole production cycle is performed. Afterwards, the final products are dispatched to the customer ("Customer") using trucks "Truck2" and "Truck3". This example can be considered a "scaled model" of the example shown in Fig. 3a, where all routes are unidirectional. In general, in the case of an uncoupled and united model, the order of the whole system increases, according to Axiom 1, with the increasing number of routes (edges), which leads to increasing complexity and hence a potentially higher-order orgiton or holon, as it is more diverse.

4.3 Results

The first result of these considerations, especially concerning Fig. 3, is that across the subfigures or cases "a, b, c" the energy consumption increases linearly, due to the chosen scale of comparable route lengths and the same production-logistics sequence, summarised in the sentence "a product is delivered to the customer via supply and production", which comprises 3, 4(3), and 4 nodes for cases "a" to "c". Hence the complexity in "b" and "c" is increased compared to "a", but "c" can be regarded as a unidirectional and bidirectional case versus "b", a bidirectional one, as it is a unity on a higher level. There are two sorts of bifurcations: one towards cooperation and bidirectional communication (⇌), the other towards non-cooperation and unidirectional (⇀) communication. In their nesting, they become multidirectional.
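To make the linearity claim tangible, a toy calculation (explicitly not the Witness models of [26]) can count the routes per layout and apply the factor 2 for the return leg used in the "PC" equation; the unit route length and energy per length are hypothetical assumptions.

E_PER_LEN, LENGTH = 1.0, 1.0  # hypothetical unit energy per length and route length

def transport_energy(n_routes):
    return n_routes * 2 * LENGTH * E_PER_LEN  # factor 2: delivery plus return leg

for name, n in [("a: central (S->P, P->C)", 2),
                ("b: coupled decentral (S->P1, P1<->P2, P2->C)", 3),
                ("c: uncoupled decentral (S->P1, S->P2, P1->C, P2->C)", 4)]:
    print(name, transport_energy(n))  # the proxy grows linearly with the route count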

5 Conclusion

First, in Sect. 2, we developed four axioms and two principles related to the principle of uni-, bi- and multidirectionality, with recourse to the chaos-theoretic no-intersection theorem. According to Sect. 3, from epistemic knowledge gaining to processes related to social mass, fluidization increasingly occurs through coupling and an overall inertialisation, by gaining knowledge momentum, leading to an increasingly higher order of a system that is increasingly real-time connected, multidirectional, and conficient. In [25], we defined conficient as the property of being cooperative and efficient. According to Sect. 4, it seems that, as a first result, cooperation leads to more communication processes that subdivide the internal process of the manufacturer and hence allow for a flexible adaption according to the geometry. This has the advantage that the order increases, and with it the space of possibilities. Compared to the higher-level no-cooperation case "c", case "b" is relatively less conficient, i.e., less efficient and cooperative. Compared to the less complex case "a", it is, as a possibility, most efficient, understanding efficiency here as the ability to cope with complexity. This means that there is a range of optimality of cooperation and its "dual", concerning its meaning in condensed form: conficient, or cooperative and efficient, systems.
Overall, for both investigated cases, as for the general case, the no-intersection theorem predicts the unification of higher-order systems. This means that the unification is hence of lower order, as it is unidirectional, but it has the potential to increase in order,

with further bifurcation towards multidirectionality. As an outlook, the role of increasingly nested systems in developing conficient systems has to be investigated with respect to how they behave in a dynamic environment.
The limits of this research are that we used analogies concerning the form of dynamical systems, like the Lorenz equations, that are mathematically restricted to differential equation systems of a specific order. "What could be the relation to fractal systems, and are there major differences, or how do they relate?" These questions could be the starting point for valuable future investigations in this field. Other open research questions related to sustainability in logistics and production are: "Which chaos-theoretic examples give exemplary enumerable evidence for the above theses, intensifying the empirical as well as theoretical proof of the uni-, bi- and multidirectional potential for order increase in general open systems?"
Finally, an essential part of increasing the sustainability of the supplier-producer-customer triangle will be Artificial Intelligence (AI) in each field. The systemic reason is that it is a vital universal tool, that is, applicable scale-independently. Far more important, however, is the self-applicability of AI, meaning that it is a tool to make tools, or a tool-o in orgiton theory. As a by-product of this paper, argued with Principles 1-2, this will probably substantially improve conficiency and can hence be regarded as a promising research outlook, for which researchers are cordially invited to participate in the search and "re"-search and, above that, the "meta"-search for a positive and sustainable future of humanity.

References
1. Koestler, A.: The Ghost in the Machine. Anchor Press Ltd. (1967)
2. Tharumarajah, A., Wells, A.J., Nemes, L.: Comparison of the bionic, fractal and holonic
manufacturing system concepts. Int. J. Comput. Integr. Manuf. 9, 217–226 (1996)
3. Glansdorff, P., Prigogine, I.: Thermodynamic Theory of Structure, Stability and Fluctuations.
Wiley Interscience a division of John Wiley & Sons Ltd, Brüssel (1971)
4. Dürr, H.P., et al.: Selbstorganisation. Die Entstehung von Ordnung in Natur und Gesellschaft.
Piper Verlag, München (1986)
5. Hilborn, R.C.: Chaos and Nonlinear Dynamics - An Introduction for Scientists and Engineers.
Oxford University Press, New York (1994)
6. Götschl, J.: Self-organization: new foundations for a more uniform understanding of reality
(Original in German: ‘Selbstorganisation: Neue Grundlagen zu einem einheitlicheren Real-
itätsverständnis’). In: Vec, M., Hütt, M.T., Freund, A. (eds.) Self-organization - A system of
thought for nature and society (Original in German: ‘Selbstorganisation - Ein Denksystem
für Natur und Gesellschaft’), pp. 35–65. Böhlau Verlag, Köln (2006)
7. Einstein, A., von Smoluchowski, M.: Untersuchungen über die Theorie der Brownschen
Bewegung - Abhandlung über die Brownsche Bewegung und verwandte Erscheinungen.
Verlag Harri Deutsch, Frankfurt am Main (2001)
8. Mates, B.: Elementare Logik - Prädikatenlogik der ersten Stufe. Vandenhoeck und Ruprecht
in Göttingen (1978)
9. Heiden, B., Tonino-Heiden, B., Alieksieiev, V.: Artificial Life - Investigations about a Uni-
versal Osmotic Paradigm (UOP). In: Arai, K. (ed.) Intelligent Computing, LNNS, vol. 285,
pp. 595–605. Springer Nature (2021). https://doi.org/10.1007/978-3-030-80129-8_42
10. Villari, M., Fazio, M., Dustdar, S., Rana, O., Ranjan, R.: Osmotic computing: a new paradigm
for edge/cloud integration. IEEE Cloud Comput. 3, 76–83 (2016)

11. Heiden, B., Volk, M., Alieksieiev, V., Tonino-Heiden, B.: Framing Artificial Intelligence (AI)
Additive Manufacturing (AM). Procedia Computer Science, vol. 186, pp. 387–394. Elsevier
B.V. (2020). https://doi.org/10.1016/j.procs.2021.04.161
12. Moser, A.: Bioprozeßtechnik. Springer Vienna (1981)
13. Grossberg, S.: Conscious Mind, Resonant Brain: How Each Brain Makes a Mind. Oxford
University Press (2021)
14. Heiden, B., Tonino-Heiden, B., Wissounig, W., Nicolay, P., Roth, M., Walder, S., Mingxing,
X., Maat, W.: Orgiton Theory (unpublished) (2019)
15. Heiden, B., Tonino-Heiden, B.: Philosophical Studies - Special Orgiton The-
ory/Philosophische Untersuchungen - Spezielle Orgitontheorie (English and German Edition)
(unpublished) (2021)
16. Kálmán, R.E.: Contributions to the Theory of Optimal Control. Boletin de la Sociedad
Matematica Mexicana 5, 102–119 (1960)
17. Scharmer, O., Käufer, K.: Leading from the Emerging Future - From Ego-System To Eco-
System Economies - Applying Theory U to Transforming Business, Society, and Self. Berrett-
Koehler Publishers Inc., San Francisco (2013)
18. Heiden, B.: Wirtschaftliche Industrie 4.0 Entscheidungen - mit Beispielen - Praxis der
Wertschöpfung. AV Akademiker Verlag (2016)
19. Heiden, B., Tonino-Heiden, B.: Emergence and Solidification-Fluidisation. In: Arai, K. (ed.)
LNNS, vol. 296, pp. 845–855. Springer Nature Switzerland AG (2022). https://doi.org/10.
1007/978-3-030-82199-9_57
20. Götschl, J.: Zur Epistemologie der Selbstorganisation: Von Konvergenzen zu Korrelationen
zwischen Systemwissenschaften der Natur und Systemwissenschaften vom Menschen. In:
Fabisch, H., Fabisch, K., Kapfhammer, H.-P. (eds.) Die Welten von Psyche und Soma - Zur
Verbindung von Psychoanalyse und Neuropsychiatrie - Gedenksymposion für Frau Univer-
sitätsprofessor Dr.in med. Dr.-in phil. Margarete Minauf im Meerscheinschlössl in Graz am
25. November 2017, pp. 85–104. Verlag Dr. Kovac (2020)
21. Tarski, A.: The Semantic Conception of Truth and the Foundations of Semantics. Philos.
Phenomenol. Res. 4, 341–376 (1944)
22. Quine, W.v.O.: Wort und Gegenstand (Word and Object). Reclam (1993)
23. Luhmann, N.: Soziale Systeme. Suhrkamp Verlag AG (2018)
24. Farahani, R.Z., Abedian, M., Sharahi, S.: Dynamic Facility Location Problem. In: Zanjirani
Farahani, R., Hekmatfar, M. (eds.) Facility Location. Contributions to Management Science,
pp. 347–372. Physica, Heidelberg (2009). https://doi.org/10.1007/978-3-7908-2151-2_15
25. Heiden, B., Walder, S., Winterling, J., Perez, V., Alieksieiev, V., Tonino-Heiden, B.: Uni-
versal Language Artificial Intelligence (ULAI). In: Schulz, F. (ed.) Advances in Artificial
Intelligence Research. Nova Science Publishers, Incorporated (2020)
26. Heiden, B.: Support Material for the Paper 'System Ordering Process Based on Uni-, Bi- and
Multidirectionality – Theory and First Examples' (2022, this paper).
https://github.com/BernhardHeiden/Directionality-Principle. Accessed 3 Dec 2021
27. Heiden, B., Tonino-Heiden, B.: Key to artificial intelligence (AI). In: Arai, K., Kapoor, S.,
Bhatia, R. (eds.) IntelliSys 2020. AISC, vol. 1252, pp. 647–656. Springer, Cham (2021).
https://doi.org/10.1007/978-3-030-55190-2_49
28. Heiden, B., Leitner, U.: Additive manufacturing – a system theoretic approach. In: Drstvenšek,
I. (ed.) ICAT 2018, Maribor, pp. 136–139. Interesansa - zavod, Ljubljana (2018)
The Impact of Intellectual Property Protection
on China’s Import of Computer and Information
Service Trade: Empirical Research Based
on Panel Data of 34 Countries or Regions

Hui-ying Yang and Shi-kun Pang(B)

School of Economics, Harbin University of Commerce, Harbin 150028, China

Abstract. The 14th Five-Year Plan indicates the need to improve the scientific and technological innovation system and mechanism on the basis of a sound intellectual property protection and application system. Based on panel data for 34 countries or regions from 2007 to 2019, this article uses an extended trade gravity model to empirically study the impact of China's intellectual property protection on its imports of computer and information services. The results show that the improvement of China's intellectual property protection level has effectively restrained imitation by domestic consumers and significantly promotes the growth of computer and information service imports; the market expansion effect of intellectual property protection is greater than the market power effect.

Keywords: Intellectual property protection · Computer and information service trade · Extended trade gravity model

1 Introduction
The "14th Five-Year Plan" clearly points out the need to improve the intellectual property protection and application system, better protect and encourage high-value patents, and cultivate patent-intensive industries. Under the background of the "14th Five-Year Plan", the importance of intellectual property protection in China is self-evident. The intellectual property rights involved in the TRIPS agreement include industrial designs, patents, and integrated circuit layout designs; these are rights for high-tech products to be applied for, granted, and protected. As the share of high-tech products and services in international trade continues to increase, it becomes more and more important to identify the relationship between intellectual property protection and these non-traditional forms of trade. Trade in computer and information services is recognized by EBOPS (the Extended Balance of Payments Services Classification) as a type of trade in services.

Fund Project: This article is a phased research result of the National Social Science Fund Project
“China-Eurasian Economic Union FTA Creation Research from the Perspective of the ’Silk Road
Economic Belt’” (project number: 18JL094).


With the continuous improvement of residents' wages and the continuous expansion of middle-income groups, the demand for modern high-tech services is also increasing. From 2007 to 2019, China's annual imports of computer and information services increased from 2.208 billion to 26.86 billion dollars, and their share of total service trade imports rose year by year from 1.7% to 5.36%. Under the impact of the epidemic, however, China's service trade deficit decreased in 2020, including a decrease in imports of computer and information services. Yet the great domestic demand for such modern high-tech services remains in line with the development trend of the structure of the world's trade in services.
China's intellectual property protection system has been strengthened and improved year by year. Because of its technology-intensive characteristics, the computer and information service trade is destined to have an inseparable relationship with intellectual property protection. Against the background that China's imports of computer and information services decreased in 2020, leaving a larger gap in the supply of these services, can China continue to strengthen intellectual property protection so as to promote these imports and make up for the impact of the COVID-19 epidemic? This is worthy of in-depth study.
However, the similar growth trends of the two do not by themselves indicate that intellectual property protection significantly promotes the import of computer and information services. First, service providers deliver software processing, system integration, repair, and maintenance services across borders to China, a major imitating country, and this process is easy to imitate, leading to infringements of patents and even of integrated circuit layout designs. Although the intellectual property protection system is gradually being improved, whether it can dispel exporting countries' fear of the threat of imitation is still unknown. Second, after intellectual property protection reaches a certain level, will it increase the market expansion effect and expand the scale of imports? It also remains to be verified whether exporting countries create a market power effect by raising service prices to suppress imports. Based on the above considerations, this article aims to provide suggestions for improving China's intellectual property protection system and optimizing the import structure of trade in services by studying the impact of China's intellectual property protection on the import of computer and information services.

2 Empirical Research

In 2019, OECD countries and APEC countries or regions (excluding those that overlap with the OECD) accounted for 88.19% of the world's exports to China in this industry, a share that has been stable since 2015. These data show that developed countries account for a large proportion of the exports to China in the computer and information service industries. Therefore, this paper selects the representative panel data of 34 countries or regions¹ (31 OECD countries and 3 additional APEC countries or regions) from 2007 to 2019 and establishes an extended trade gravity model to conduct the empirical research of this paper.

2.1 Model Construction


As is well known, the classic trade gravity model was proposed by Tinbergen in 1962. He held that the law of universal gravitation can be extended to the field of international trade: the trade flow between two countries is directly proportional to their economic scale and inversely proportional to the distance between them. Since this paper selects panel data with 34 countries or regions as samples for the empirical analysis, the geographic distance between the exporting country and China is bound to be a factor affecting imports; hence, the empirical part extends the classic trade gravity model further. The original expression of the classic gravity model is:

Tij = C · GiGj / Dij    (1)

Here, i and j denote the two countries participating in international trade, T the trade flow between them, G a country's economic scale, D the distance between the countries participating in trade, and C a constant. To make the model more convenient for understanding and subsequent processing, the natural logarithm of both sides of the equation is generally taken to eliminate heteroscedasticity, and later scholars have also introduced other factors that affect trade flows, denoted by Σλkθk, so another expression of the classic gravity model is:

lnTij = β0 + β1 lnGi + β2 lnGj + β3 lnDij + Σλkθk + ε    (2)

Given the research focus of this article, we assume that the importing country's intellectual property protection has the expected impact on import trade. The core explanatory variable to be examined is therefore intellectual property protection (IPR); China's use of foreign direct investment (FDI) and the technological innovation level (TI) of the exporting countries, both expected to affect China's computer and information service imports, are also included. Taking the time factor t into account, the model equation becomes:

lnIMijt = β0 + β1 ln(GDPit·GDPjt) + β2 lnDistanceij + β3 lnIPRit + β4 lnFDIijt + β5 lnTIjt + εijt    (3)

1 These 34 countries include 31 OECD countries and 3 APEC countries or regions. OECD
countries include Australia, Austria, Belgium, Canada, Czech Republic, Denmark, Estonia,
Finland, France, Germany, Greece, Hungary, Italy, Ireland, Israel, Japan, South Korea, Lux-
embourg, the Netherlands, Norway, New Zealand, Poland, Portugal, Slovenia, Slovakia, Spain,
Sweden, Switzerland, Turkey, the United States and the United Kingdom. APEC countries or
regions include Hong Kong, Russia and Singapore.

2.2 Selection of Variables and Data Sources

IMijt represents the import trade volume of country i from country j in the computer and information service industry in year t; GDPit·GDPjt represents the economic development level and market capacity of countries i and j in year t, measured by the product of the two countries' gross domestic products; Distanceij represents the distance between China's capital and the capital of the exporting country; IPRit represents the level of intellectual property protection in China in year t; FDIijt represents China's use of foreign direct investment in year t; TIjt represents the technological innovation capability of the exporting country in year t. Although the number of patent applications can be used to measure technological innovation capability by eliminating the influence of government patent agencies and other human factors (Chen Lijing, Gu Guoda, 2011) [1], the number of patents granted better reflects the quality of technological innovation; Dai Ming, Chen Xiao, and Jiang Han (2017) also hold that China's technological level should be measured by the number of China's foreign patents granted [2]. Therefore, in line with the selected research objects, this article uses the number of computer-technology patents granted in each exporting country to measure its technological innovation capability.
The import data for computer and information services come from OECD Statistics; each country's GDP data come from the World Bank's statistical database, with GDP calculated in constant 2010 US dollars; the distances between China's capital and those of its trading partners come from the CEPII database. The level of intellectual property protection in China is measured on the basis of the GP index developed by Ginarte and Park (1997) [3]. Since the GP index itself is an indicator of legislative intensity and does not take into account the enforcement of intellectual property protection, we adjusted it accordingly. Following the research method of Han Yuxiong and Li Huaizu (2005) [4], who introduce "law enforcement intensity" on the basis of the GP index, we consider the influencing factors of enforcement intensity, construct an index system of the enforcement intensity of China's intellectual property protection, calculate the level of enforcement intensity, and finally calculate China's actual level of intellectual property protection based on the GP index. The enforcement system includes four indicators: legal influence factors, economic development level, social public awareness, and international society influence factors. Han Yuxiong and Li Huaizu (2005) hold that the different indicators carry the same weight in enforcement, so the four indicators are combined by simple averaging. The arithmetic average of the four indicators gives the level of enforcement of intellectual property protection over the years, expressed by F(t); the intensity of intellectual property protection legislation over the years is expressed by G(t) and is calculated through the extension of the GP index of Ginarte and Park (1997); IPR(t) expresses the actual level of intellectual property protection over the years, so the actual protection level can be expressed as:

IPR(t) = G(t) × F(t)    (4)
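As a small numerical sketch of Eq. (4), with purely hypothetical indicator values (the actual GP scores and enforcement indicators are not reproduced here): F(t) is the equal-weight arithmetic mean of the four enforcement indicators and multiplies the legislative index G(t).

def ipr_level(g_t, legal, economic, awareness, international):
    f_t = (legal + economic + awareness + international) / 4.0  # equal-weight average
    return g_t * f_t  # Eq. (4): IPR(t) = G(t) * F(t)

print(ipr_level(g_t=4.0, legal=0.8, economic=0.7, awareness=0.6, international=0.75))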



According to the above formula, China's actual level of intellectual property protection from 2007 to 2019 can be calculated. China's foreign direct investment (FDI) data come from the National Bureau of Statistics; the numbers of computer-technology patents granted in the various countries come from the WIPO database.

2.3 Analysis of Empirical Results


Based on the empirical model constructed above, using the panel data of China and 34
trading partner countries from 2007 to 2019, the software stata16 is used for stepwise
regression estimation.
First, perform descriptive statistics on each variable, and the results are shown in
Table 1.

Table 1. Descriptive statistics of variables

Variable         Observations   Mean    Std. error   Minimum   Maximum
lnIMijt          442            3.667   2.075        −2.877    7.814
lnGDPitGDPjt     442            12.6    0.621        11.02     14.32
lnDistanceij     442            8.803   0.489        6.862     9.32
lnIPRit          442            1.182   0.155        0.876     1.343
lnFDIijt         442            9.261   2.875        1.099     16.08
lnTIjt           442            5.038   2.356        0.111     10.69
Number of countries: N = 34

Since this article uses panel data for the sample countries, a Hausman test and an LM test must be conducted before estimating the model's parameters, in order to compare fixed effects, random effects, and pooled (mixed) regression. However, the distance between the two countries' capitals does not change over time, so the distance variable cannot be identified in a fixed-effects model. Therefore, this article conducts an LM test to determine whether the random-effects model or pooled regression is preferable. The LM test yields Prob > chibar2 = 0.0000, so the null hypothesis of pooled regression is rejected, indicating that random effects are preferable to pooled regression. Table 2 shows the stepwise regression estimation results based on the random-effects model.
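As an illustrative counterpart to the Stata 16 workflow, the full specification of column (3) could be estimated in Python with the linearmodels package; the file and column names below are hypothetical stand-ins for the paper's data set.

import pandas as pd
from linearmodels.panel import RandomEffects

# hypothetical panel indexed by (partner, year)
panel = pd.read_csv("ci_service_imports.csv").set_index(["partner", "year"])
exog = panel[["lnGDPprod", "lnDistance", "lnIPR", "lnFDI", "lnTI"]].assign(const=1.0)
re_res = RandomEffects(panel["lnIM"], exog).fit()
print(re_res)  # coefficients and z statistics comparable to Table 2, column (3)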

Table 2. Regression estimation results (dependent variable: lnIMijt)

Variable        (1)                  (2)                  (3)
lnGDPitGDPjt    2.588*** (9.07)      1.979*** (6.72)      1.367*** (3.59)
lnDistanceij    −0.565 (−0.99)       −0.391 (−0.90)       −0.144 (−0.31)
lnIPRit         2.956*** (6.03)      3.526*** (8.18)      3.597*** (8.87)
lnFDIijt                             0.095** (2.03)       0.111** (2.44)
lnTIjt                                                    0.194** (2.22)
Constant        −27.464*** (−4.51)   −22.870*** (−4.87)   −18.547*** (−3.61)
R²              0.4908               0.5809               0.6635

Note: *, **, *** indicate that the estimated coefficient is significant at the 10%, 5%, and 1% levels, respectively (* p < 0.1, ** p < 0.05, *** p < 0.01); Z statistics are in parentheses.

Finally, we analyze the above regression estimation results.
First, the regression coefficient β1 on the two countries' market scale and economic development level is significant at the 1% confidence level with a positive sign, indicating that the larger the market scale and economic development level of China and its developed-country trading partners, the more complete their industrial chains and the greater China's import demand from partner countries in the computer and information service industries.
Second, the regression coefficient β2 on the distance between the two countries is not significant and has a negative sign, indicating that distance still has a negative, though insignificant, impact on China's computer and information service imports. This is because most of the sample countries are developed countries: they are highly integrated into the global shipping network, their air transport routes are highly accessible, and domestic consumers can use the computer and information services imported by China through telecommunications, postal services, and computer networks. Distance costs have therefore gradually ceased to be an obstacle to bilateral emerging trade exchanges.
Third, the regression coefficient β3 on the core explanatory variable, China's
intellectual property protection level, is significant at the 1% level with a positive
sign. As computer and information service providers use modern science and technology
to design, produce, collect, process, store, transmit, retrieve, and use services, the
improvement in intellectual property protection has effectively restrained imitation by
domestic consumers. As China has strengthened intellectual property protection on
computer and information service imports, the market-expansion effect has exceeded the
market-power effect, which has promoted the import scale of computer and information
service trade.
Fourth, the regression coefficient β4 on China's use of foreign direct investment is
significant at the 5% level with a positive sign, indicating that the influence of FDI
on import trade is becoming increasingly apparent. By investing directly in China,
foreign companies have brought in a large amount of advanced technology, machinery,
equipment, information, and management experience directly related to production, sales,
and consumption. This also drives domestic demand for computer and information services
and increases China's actual purchasing power, promoting the growth of import trade in
this service industry.
Fifth, the regression coefficient β5 on the exporting country's technological innovation
capability is significant at the 5% level with a positive sign. On the one hand, if the
technology gap between the two countries is large, China, as a major imitator, will
further strengthen its firms' imitation ability; this would inhibit the transfer of
advanced technology into the country, constitute a trade barrier, and reduce China's
imports. On the other hand, under government intervention and other external factors,
Chinese companies will choose countries with a large technological gap to import products
and services far above domestic technology levels, allowing firms to conditionally apply
and promote technologies from those countries. A possible explanation is therefore that
the application and promotion of these technological innovations has a greater impact on
imports than the import-inhibiting rise in domestic imitation ability. At the same time,
the import of high-tech-intensive products also drives the import of corresponding
high-tech services. In other words, the improvement of exporting countries' technological
innovation capability is not an obstacle to China's computer and information service
imports but a facilitating factor.

3 Conclusions

This article uses 2007–2019 data on China, OECD countries, and APEC countries or regions
(excluding those that overlap with the OECD). An extended trade gravity model is used to
analyze the impact of intellectual property protection on imports of computer and
information services from quantitative and qualitative perspectives. The main conclusions
are as follows. First, as China imports computer and information services, the improvement
of China's intellectual property protection level has effectively restrained imitation by
domestic consumers and significantly promoted the growth of computer and information
service imports; the market-expansion effect of intellectual property protection is
greater than its market-power effect. Second, there is still considerable room for
improvement in China's intellectual property protection. Before exporters form a
market-power effect of intellectual property protection that restrains Chinese imports,
enhancing intellectual property protection can further promote imports of computer and
information services.

References
1. Chen, L., Gu, G.: The impact of technological innovation and intellectual property protection
on the structure of China's imported goods: an empirical analysis based on time series data
from 1986 to 2007. Int. Trade Issues 05, 14–21 (2011)
2. Dai, M., Chen, X., Jiang, H.: The interactive effect of exporting country’s technological level
and importing country’s intellectual property protection on export trade of exporting coun-
tries——based on the empirical study of China’s export trade. Tech. Econ. 36(05), 103–109
(2017)
3. Ginarte, J., Park, W.: Determinants of patent rights: a cross-national study. Res. Policy 26(3),
283–301 (1997)
4. Han, Y., Li, H.: Quantitative analysis on the level of protection of intellectual property rights
in China. Res. Sci. Sci. 03, 377–382 (2005)
5. Liu, C., Li, X., Xu, L.: The influence of regional intellectual capital on regional economic
development: evidence from China. Int. J. u- and e-Serv. Sci. Technol. 8(6), 89–102 (2015)
The Influence Mechanism of Business
Environment on the Allocation
of Entrepreneurship

Juan Li1(B) and Tian Zhang2


1 Harbin University of Commerce MBA (MPA) Center, Harbin, China
2 School of Economics, Harbin University of Commerce, Harbin, China

Abstract. The business environment has become an important manifestation of
economic competitiveness. It also provides the institutional protection a country
needs to realize innovation-led economic development. The quality of the business
environment affects the behavior of market entities, represented by entrepreneurs,
and guides the allocation direction of entrepreneurship. This paper explores the
impact mechanism of the business environment on entrepreneurship through literature
research and qualitative analysis. The study finds an intrinsic link between the
business environment, entrepreneurship, and economic growth: they are inseparable,
interact, and jointly determine the direction and quality of Chinese economic
development.

Keywords: Business environment · Entrepreneurship · Differential allocation ·


Influence mechanism

1 Introduction
China has achieved long-term, rapid, and stable economic growth in the early stage of
reform and opening up, relying on demographic dividends and resource advantages.
However, the disappearance of these dividends and the pressure of aging, the short-
age of resources, and the backwardness of technology have made Chinese economic
development unbalanced and insufficient. Traditional economic development methods
cannot address current economic development challenges. When the development con-
ditions change, Chinese economic development and the adjustment of economic struc-
tures can only rely on innovation. Because innovation can break through the economic
development bottleneck, crack economic development challenges, find new develop-
ment advantages and guide economic development [1]. Innovation will become the first
driving force to lead Chinese economic development. However, the implementation of
innovation is inseparable from policy support, market subject participation, and transfor-
mation of thinking. In addition, the quality of the business environment largely affects the
behavior of market participants represented by entrepreneurs and the allocation direction
of production factors, which in turn affects the economic efficiency of the entire society.
Special Project of Philosophy and Social Science in Heilongjiang Province (20JYH078).


Therefore, this article will analyze the influence mechanism of the business environment
on the allocation of entrepreneurship in economic development. And then, it will provide
solutions for the problems encountered in the process of economic operation.

2 Literature Review
2.1 Research on the Business Environment

The business environment is closely related to a country's economic growth. A good
institutional environment and relatively loose regulation play a significant role in
promoting economic growth, encouraging enterprises to invest more resources in research
and development and to enhance their innovation capabilities and market competitiveness.
Conversely, strict market supervision, unfair competition, difficulty in obtaining funds,
inefficient government administration, and other factors significantly dampen the
enthusiasm for starting new enterprises, reduce the number of enterprises in the market,
lower the employment rate, and greatly increase business and transaction costs.
Financing, labor quality, the efficiency of government approval, the legal environment,
and infrastructure are important aspects that affect the level of the business
environment and key factors in the healthy development of enterprises. Comparisons of
the business environment across countries or regions show that loose market access
rules, developed financial systems, sound property rights protection, and efficient
administrative services characterize a well-performing business environment. These
factors reduce obstacles in business operations and effectively promote the sustainable
development of the regional economy.

2.2 Research on the Allocation of Entrepreneurship

The allocation direction of entrepreneurship is the manifestation of the behavior of


microeconomic subjects, which affects the efficiency of resource allocation and the cre-
ation of social value. Schumpeter pointed out that the driving force of economic devel-
opment comes from innovation, and the creation of innovation stems from the “creative
destruction” of entrepreneurs [2]. Although entrepreneurship will promote economic
growth, differential allocation of entrepreneurship will affect the efficiency of economic
development. Baumol expanded the concept of entrepreneurship based on Schumpeter’s
theoretical model of innovation, pointing out that the impact of entrepreneurship on eco-
nomic development depends on the differential allocation of entrepreneurship in pro-
ductive, non-productive, and destructive activities [3]. Baumol advocates guiding the
allocation of entrepreneurship by changing the institutional environment to maximize
the role of entrepreneurship in promoting economic development. In addition, many
scholars have deeply explored the impact of the differential allocation of entrepreneurship
on economic development from the perspectives of legislation, government regulation, and
property rights protection.

3 Analysis of Impact Mechanism


The quality of the business environment affects the productivity and competitiveness of
a country or region. And the allocation direction of entrepreneurship determines the per-
formance of economic growth. The business environment that comprehensively reflects
the institutional environment of a country or region is a key factor affecting the allocation
of entrepreneurship. In order to fully reflect the impact of the business environment on
entrepreneurship, this section will comprehensively analyze the impact mechanism from
five aspects: market environment, legal environment, economic environment, service
environment, and social environment.

3.1 The Influence of Market Environment

The market environment reflects the process and results of marketization in a country
or region. Its quality is related to the direction of entrepreneurship, and it is a key factor
affecting the development of entrepreneurship.

The Impact of Competition. The competition mechanism is the core of the market
economy and an important means for market entities to realize their interests. A fair
and efficient market competition environment adjusts product prices through supply and
demand, rather than excessive human intervention to distort market prices and cause
resource mismatch. Fair competition promotes the efficient operation of the market,
effectively stimulates the vitality and creativity of market players, and actively attracts
them to actively participate in economic construction. In an environment of fair compe-
tition, entrepreneurs rely on their talents to configure production factors by market rules,
reduce unnecessary non-productive activities that consume social wealth, and devote
more energy to more creative and productive activities. During the activities, the effi-
ciency of economic operation will be improved, and the economy will achieve Pareto
optimality.

The Impact of Integrity. In the process of the market economy becoming increasingly
mature and market scale gradually expanding, the spirit of contract has gradually pene-
trated all aspects of economic life, regulating and guiding the behavior of market entities.
The quality of integrity management derived from the spirit of contract is a beneficial
weapon to help enterprises and entrepreneurs establish a good image and enhance their
competitiveness. An enterprise’s operation and sustainable development require that its
managers have the quality of integrity and products within the scope of laws and regula-
tions. The honest market environment reduces the financing barriers of enterprises to a
certain extent, enables the rational allocation of funds, promotes the flow of production
factors from surplus departments to shortage departments, improves resource alloca-
tion efficiency, reduces enterprise production costs and transaction costs, and promotes
healthy economic development.

The Impact of Government Regulation. The market environment of fair competition


and honest operation needs to be maintained by government departments. Fair supervi-
sion is the benchmark for China to promote the reform of “decentralization, regulation,
616 J. Li and T. Zhang

and service” and a source of motivation for stimulating market players’ vitality and cre-
ativity. It can regulate market players’ behavior and resolve law enforcement conflicts and
help establish good government-enterprise relationships. In addition, regulatory agencies
should also optimize and innovate regulatory models, expand the scope of supervision of
market entities and strengthen supervision by cooperating with other agencies to achieve
comprehensive supervision of cross-regional markets. At the same time, the regulatory
authorities should advocate to create a social atmosphere for honest law enforcement,
introduce relevant policies and standard systems to regulate entrepreneurial behavior,
and improve the efficiency of supervision and management.

Fig. 1. The influence mechanism of market environment

3.2 The Influence of Legal Environment

The construction of a legalized business environment can regulate and restrict the behav-
ior of market participants and attract more foreign investment and promote regional
economic development.

The Impact of Investor Protection. Investors provide financial support for the estab-
lishment and continuous operation of enterprises. While protecting enterprises and their
operators, the law must also protect investors’ legitimate rights and interests and main-
tain market fairness and justice. The financial support of small and medium investors
broadens the financing channels for small and micro enterprises and helps solve the
financing difficulties of enterprises. Improving the legal protection of small and medium
investors and enhancing capital providers' investment confidence give them an incentive
to continue providing financial support. At the same time, standardizing corporate
capital allocation behavior and improving capital use efficiency can constrain corporate
management behavior.

The Impact of Intellectual Property Protection. The protection of intellectual prop-


erty plays a significant role in promoting innovation. The new organizational struc-
ture, inventions, scientific research results, and the vigorous development of emerging
industries are the concentrated expression of entrepreneurial innovation spirit. Rele-
vant institutions need to formulate laws and regulations to protect the innovation rights
of entrepreneurs. In addition, they should also severely crack down on illegal acts that
infringe upon entrepreneurs' innovation rights. Only in this way can the entrepreneurial
initiative to innovate be further strengthened. It can be seen that innovation is con-


ducive to promoting the flow of factor resources to sectors with higher marginal returns,
giving full play to the intrinsic value of production factors and providing motivation for
regional economic development. At the same time, the protection of property rights is
also conducive to forming a social atmosphere that advocates and respects innovation.
Property protection also stimulates the innovation consciousness of the whole society,
enhances the innovation ability of the whole society, and guides entrepreneurship to
allocate productive activities that create social wealth.

The Impact of Law Enforcement and Justice. Law enforcement can regulate and
restrict the behavior of market entities. Justice is the last line of defense to safeguard social
fairness and justice. The efficiency of law enforcement and judicial transparency are
crucial to constructing a legalized business environment and to the allocation of entrepreneur-
ship. Judicial transparency is the key to realizing government credibility and reflects
whether social fairness and justice are maintained and respected. The open judicial
system fundamentally solves the problem of human rights protection, which not only
maintains the authority of the national judicial department but also effectively resolves
the conflicts between government and enterprises [4]. For enterprises and entrepreneurs,
improving the efficiency of law enforcement and enhancing judicial transparency reduce
the cost of litigation for enterprises and protect enterprises’ legal rights and interests.
It can also regulate and restrict entrepreneurs’ behavior and guide entrepreneurship to
flow into the productive activities that create social value.

Fig. 2. The influence mechanism of legal environment

3.3 The Influence of Economic Environment


The economic environment supports the activities of market entities and will have the
most direct impact on the development of enterprises.

The Impact of Openness. Since the reform and opening up, the transnational flow of
labor, capital, technology, and other production factors has greatly promoted China’s
economic growth. The influx of high-quality production factors enables China to enjoy
the technology spillover effect and promotes the development and innovation of China’s
technology. The improvement in technological level has raised the quality of Chinese
products, upgraded China's industrial structure, and gradually lifted China's position
in the global value chain. In addition to the free flow of production factors, which will
promote the development of a country's economy, the mature business management


models of developed countries will also positively impact the production and operation
of Chinese enterprises. Efficient management can improve the operating efficiency of the
enterprise, optimize the allocation direction of the enterprise’s resources and increase the
production capacity [5]. Apart from this, it can also significantly reduce the enterprise’s
operating and management costs.

The Impact of the Level of Economic Development. The level of foreign direct
investment and national income can intuitively reflect the economic development of
a country. And the level of a country’s economic development will directly affect the
allocation direction of entrepreneurship. When the level of national income increases,
people will save after satisfying their basic living needs. The conversion of savings
to investment will drive the improvement of a country’s investment level. To a certain
extent, the increase of idle capital will provide financial support for entrepreneurs and
reduce the difficulty of financing small and medium-sized enterprises [6]. In addition, it
can also reduce the possibility of rent-seeking for entrepreneurs due to insufficient funds
and optimize the allocation direction of entrepreneurship.

The Impact of Business Cycles. The business cycle is a comprehensive reflection of


various contradictions in economic activities. At present, China’s economy is in a critical
period. Whether it can continue to maintain rapid growth for a long time in the future is
related to realizing my country’s modernization. In addition, institutional innovation and
technological innovation are also indispensable and important factors for achieving sus-
tainable economic development. The huge domestic market, sufficient supply of labor
and material resources, and strong production capacity guarantee the further develop-
ment of China’s economy. In the context of innovation-driven economic development,
entrepreneurs should choose industries with vitality and creativity for deep cultivation
and follow the direction of economic development, rather than blindly starting a business
and wasting social resources.

Fig. 3. The influence mechanism of economic environment

3.4 The Influence of Service Environment


The service environment affects the entire process of the survival and development of
an enterprise. And it is the key to the sustainable, healthy, and stable development of an
enterprise.

The Impact of Social Security Services. A sound social security system is conducive
to narrowing the income gap and alleviating social conflicts. First of all, a sound med-
ical insurance system can protect the health of the labor force and ensure the driving
force needed for sustained economic development. Secondly, in the context of aging,
the improvement of the old-age insurance system helps to ensure the normal replace-
ment of labor and promote the rationalization of the employment structure. Finally, the
improvement of the housing provident fund system has effectively solved the problem
that hinders the free flow of labor factors and ensures the flow of labor factors to regions
with higher production efficiency.
The Impact of Government Services. The government service environment runs
through the entire development process of the enterprise. The digital government service
platform can greatly simplify the government approval process, lower the market access
threshold and reduce the waste of social resources caused by vicious competition. It can
also promote the free flow and aggregation of production factors and provide market
participants with a fairer business environment. While streamlining administration and
delegating power, the government must supervise the market strictly to maintain market
order and regulate the behavior of market entities by establishing a credit supervision
system, which ensures that market participants operate with integrity within the legal
boundaries [7].
The Impact of Financial Services. The financial service system plays an irreplace-
able role in mobilizing savings, allocating idle funds, and diversifying investment risks.
There are significant differences in the financing difficulties faced by companies of differ-
ent sizes. Compared with large companies, small enterprises have difficulties obtaining
financial support from large banks because of lacking reliable financial information and
mortgageable fixed assets. Therefore, small enterprises can only obtain financial sup-
port by relying on “soft information” such as the manager’s ability, personal qualities,
industry development prospects, and corporate profitability, forming a good expectation
for the capital supplier. It can be seen that only when entrepreneurs are committed to
productive activities can they solve the problem of lack of funds.

Fig. 4. The influence mechanism of service environment

3.5 The Influence of Social Environment


The social environment affects the values and behaviors of social groups. It also affects
the formation and development of entrepreneurship.

The Impact of Traditional Culture. Although the traditional culture represented by


Confucian culture has a positive effect on the development of the Chinese economy, its
historical limitations have seriously hindered the formation of social innovation con-
sciousness. First, the old concept of “emphasizing agriculture and restraining business”
hinders individuals from discovering their entrepreneurship. The pursuit of stability
and abandoning adventure will inhibit the scale of entrepreneurial groups, which is not
conducive to the diffusion and spread of entrepreneurship. Secondly, recognizing the
“patriarchal blood” makes the management style of “major leader” flood the modern
enterprises, which is not conducive to democratization and rational management. In
addition, the lack of attention to science and technology, which is divorced from reality,
is even more detrimental to the formation of social innovation consciousness.
The Impact of Education. In the context of the transformation of the Chinese eco-
nomic development model, the traditional test-oriented education method is not con-
ducive to the development of contemporary youth’s innovative consciousness. It will
also hinder the realization of China’s goal of driving economic growth through inno-
vation. In addition, China's relatively uniform talent training methods and relatively
backward educational concepts have made it difficult for China to respond to the global
demand for comprehensive and innovative talents. It can be seen that backward education
methods are not conducive to entrepreneurship production and hinder the formation and
development of the whole society’s innovation consciousness, which is a great waste of
entrepreneurial talents.
The Impact of Public Opinion Atmosphere. Social public opinion is a concentrated
reflection of group consciousness and has an impact on people’s ideas and behavior.
On the one hand, the promotion and encouragement of innovation and entrepreneurship
activities will guide entrepreneurship to allocate productive activities. The tolerance of
public opinion to entrepreneurs will also stimulate the enthusiasm of market players to
start businesses. On the other hand, public opinion can effectively supervise and regulate
the behavior of entrepreneurs and reduce the occurrence of non-productive activities that
waste social resources.

Fig. 5. The influence mechanism of social environment

4 Suggestions for Optimizing the Business Environment


In order to improve the efficiency of the allocation of entrepreneurship and guide the flow
of entrepreneurship to productive activities, China should also continue to deepen the
reform of “decentralization, regulation, and service” and further optimize the business
environment.

4.1 Establish a Market Environment for Fair Competition and Honest Operation
In order to establish a good market environment, the government and the market must
clarify the boundaries and perform their respective duties. Firstly, based on implement-
ing a negative list system for market access, the regulatory authorities should curb illegal
competition among market players, ensuring that market players participate in market
competition fairly and obtain equal access to production factors. Secondly, the gov-
ernment must establish a sound integrity mechanism to restrict entrepreneurs’ illegal
production behavior and reduce entrepreneurs’ non-productive activities. Finally, the
government must regulate the behavior of government and enterprise personnel and
save social resources.

4.2 Build a Legal Environment that Protects the Legitimate Rights of All Parties
The protection of the legitimate rights and interests of both investors and entrepreneurs
is the key to promoting the Chinese legalized business environment. Firstly, relevant
departments should formulate laws to provide institutional guarantees for investors and
entrepreneurs. Secondly, the supervisory department must standardize law enforcement
and improve the credibility of. Thirdly, the judiciary must uphold fair justice, improve
its adjudication capabilities, and fully protect investors and entrepreneurs’ legitimate
rights and interests.

4.3 Set up an Economic Environment for the Rational Allocation of Production


Factors
The rational allocation of production factors is the key to the improvement of economic
operation efficiency. Firstly, the government should guide production factors to emerging
industries and increase the ratio of factor input-output. Secondly, the government should
encourage enterprises to actively participate in the international market, which will help
them understand the needs of the international market and improve the efficiency of
factor allocation. Thirdly, China should further tap the domestic market and improve the
quality and competitiveness of domestic products.

4.4 Build a Comprehensive Service Environment that Guarantees


the Development of the Enterprise
The service environment reflects the efficiency of the government and the operating costs
of enterprises. Firstly, the government should reduce the operating costs of enterprises
through tax and fee reductions and encourage enterprises to use funds for research and
development, which will enhance their innovation capabilities. Secondly, the government
should encourage entrepreneurs to devote more energy to production and operation by
lowering the financing threshold and strengthening financial support for companies.
Thirdly, simplifying the government approval process and improving the efficiency of
government affairs could shorten the time to start a business.

4.5 Create an Open Social Environment that Encourages Innovation

The way of education and the atmosphere of public opinion affect the thoughts and
behaviors of social groups. In order to guide the deployment of entrepreneurship to
productive activities, firstly, China should change the existing education methods and
education concepts and focus on cultivating students’ exploration and innovation spirit.
Secondly, China should abandon the traditional education method and increase the flex-
ibility of education. Thirdly, the whole society should establish a good fault-tolerant
mechanism to reduce the psychological burden of entrepreneurs.

5 Conclusion

The allocation of entrepreneurship plays an important role in the economic development.


And it is the key to driving Chinese economic development with innovation. Only when a
flexible market and a sound system work together can innovation truly become the first
driving force leading the development of the Chinese economy. Although this article analyzes
the influence mechanism of business environment on the allocation of entrepreneurship,
there is still a lack of empirical demonstration, which is worthy of further research. By
studying, we can draw the following conclusions: the business environment will have a
comprehensive impact on entrepreneurship and guide the direction of its allocation.

Acknowledgment. In writing the thesis, my supervisor Juan Li helped me sort out the context of
the article, find logical loopholes and make the whole thesis more and more perfect. My parents
and friends have always encouraged me to go forward bravely and silently support me on my way
to school. I am very grateful for their love and tolerance for me, and I hope they can be happy
forever.

References
1. Zhuang, Z.: Southern imitation, entrepreneurship and long-term growth. Econ. Res. 62–70 +
94 (2003)
2. Schumpeter, J.A.: The Theory of Economic Development (1934)
3. Baumol, W.J.: Entrepreneurship: productive, unproductive, and destructive. J. Politics 98, 893–
921 (1990)
4. Zheng, F., Wang, Z., Wei, H.: Business rule of law environment index: evaluation system and
guangdong empirical. Guangdong Soc. Sci. (05), 214–223, 256 (2019)
5. Li, H., Li, X., Yao, X., Zhang, H., Zhang, J.: The impact of entrepreneurial and innovative
spirit on China’s economic growth. Econ. Res. 44(10), 99–108 (2009)
6. Zhou, C., Liu, X., Gu, Z.: Business environment and China’s foreign direct investment: from
the perspective of investment motivation. Int. Trade Issues (10), 143–152 (2017)
7. Liu, J., Guan, L.: Optimization of business environment, government functions and new growth
momentum of enterprise TFP- “Window lighting” or “Pro-upper and clearer”. Soft Sci. 34(04)
(2020)
The Transmission and Preventive Measures
of Internet Financial Risk

Na Zhao1,2 and Fengge Yao1(B)


1 School of Finance, Harbin University of Commerce, Harbin 150000, China
2 School of Economics and Management, Qiqihar University, Qiqihar 161006,

Heilongjiang, China

Abstract. With the continuous development of China’s economy, Internet tech-


nology has been integrated into daily work, study, and various industries. The
integration of Internet technology and the financial industry will bring about the
innovation and development of Internet finance. Internet finance is the application
of Internet technology in financial services. The improvement of Internet finan-
cial security is the basis for the healthy and orderly development of the financial
industry and an important part of the national informatization construction. How-
ever, there are also great risks in Internet finance. Starting from the development
of Internet finance, this paper analyzes the risk transmission path of third-party
payment, P2P, and crowdfunding in Internet finance. It analyzes the risk trans-
mission mechanism of Internet finance by using the bank run model and game
theory. Finally, it explores the prevention and control measures of liquidity risks
of Internet finance in China from the aspects of strengthening network security
supervision measures, perfecting laws and regulations, and perfecting the social
credit system.

Keywords: Internet finance · Liquidity risk · Bank run model · Preventive


measures

1 Introduction
Internet finance is an inevitable trend of economic informationization and a concrete
manifestation of financial modernization. Internet finance refers to integrating Internet
technology (such as computer technology, communication technology, network technol-
ogy, etc.) with financial services to realize the automation and digitization of financial
services, optimize the configuration of financial services, and improve the efficiency of
financial services.
The impact of Internet finance on financial development is two-sided. On the one
hand, Internet finance can promote economic development, promote financial develop-
ment, promote economic globalization, and improve the efficiency of financial services.
On the other hand, the development of Internet finance makes electronic technology
integrated into the financial industry, and the era of big data leads to a large amount of
information, which exacerbates financial risks. However, the development of Internet
finance has become an inevitable trend of economic development and a new engine for
the practice and development of the digital economy.


2 The Development of Internet Finance


The development of Internet finance is multi-level and leap-forward, and the development
time is short, but the speed is fast. The development of Internet finance can be roughly
divided into four stages (as shown in Fig. 1).

Fig. 1. The development process of Internet finance in China

The first stage is the initial stage of Internet finance (before 2005). At this stage,
Internet finance has not been established, which is mainly reflected in the attempt of
traditional financial institutions to connect with the Internet and try to move the traditional
financial business to the online operation. The emergence of Taobao in May 2003 has
provided a broader space for the development of Internet finance.
The second stage is the embryonic stage of Internet finance (2005–2012). At this
stage, traditional financial institutions continue to deepen their business cooperation
with the Internet. The emergence of third-party payment platforms such as Alipay and
P2P has made Internet finance enter a new stage of development. During this stage,
the People’s Bank of China began to issue third-party payment licenses, which laid a
foundation for the rapid development of the Internet.
The third stage is the rapid development of Internet finance (2013–2014). The year
2013 is known as the “first year of Internet finance”. In this stage, the combination of
the Internet and finance was further deepened. Many financial and structured financial
products appeared on the network platform on a large scale, laying a foundation for the
rapid development of the Internet. At the same time, in June 2013, Alipay and Tianhong
Fund jointly launched the "Yu'ebao" product, which opened up a new situation of Internet
finance and opened up a new market situation of low-risk financial products with higher
returns than deposits. At the same time, the rapid development of third-party payment,
P2P, crowdfunding platforms, etc., also makes the Internet finance industry ushered in
a boom of development.

The fourth stage is the regulatory development of Internet finance (from 2015 to
now). With the gradual expansion of Internet financial business, P2P platform collapses
also became frequent. Since July 2015, the state and local governments have successively
issued regulations for the Internet financial industry, which provide a legal basis for
the development of Internet finance and put an end to its chaotic era, opening a period
of rule-based development that protects investors' legitimate rights and interests.

3 Risk Transmission Path of Internet Finance


Liao Y (2015) believes that the representative products of Internet finance, such as
Yu’ebao, P2P, and third-party payment, impact the traditional banking business, but they
also contain opportunities and risks. The “three noughts” of Internet finance, such as no
entry threshold, industry standards, and regulatory institutions, also hidden huge credit
risks. We should strengthen supervision, pay risk reserves, and strengthen legislation [1].
Christian Grisse (2015) used recently developed econometric methods to estimate the
relationship between exchange rate returns and risk factors in augmented UIP regressions
[2]. Sarah Chan provides an integrated perspective in examining how financial repression
in China has led to economic imbalances that elevate financial risk, particularly amid
the ongoing US-China trade tussle and the Covid-19 pandemic [3]. Among the business
models of Internet finance, the most representative is third-party payment, credit platform
represented by P2P, and crowdfunding. This chapter analyzes the impact of the risk
transmission paths of the three modes on Internet finance.

3.1 Analysis of Third-Party Payment Risk Transmission Path


The transmission path of third-party payment to financial risks is shown in Fig. 2.

Fig. 2. The transmission path of third-party payment to financial risks

Third-party payment platforms in Internet finance have developed quickly. Compared with
the traditional financial sector, payment costs in the third-party payment market are
low; economies of scale have formed and marginal costs are lower. As a result, enterprises
enter and exit the third-party payment market relatively frequently, which weakens the
stability of the whole market and increases the probability of financial risk in the
third-party payment market. Meanwhile, with the rapid development of third-party payment,
the supporting systems, laws, and regulations are still imperfect. Under lax supervision,
platform funds may be misappropriated, resulting in fund shortages on the platform,
bank-run-like behavior, and Internet financial risk.

3.2 Analysis of P2P Risk Transmission Path

P2P risk transmission path to Internet finance is shown in Fig. 3.

Fig. 3. P2P risk transmission path to Internet finance

First, on a P2P trading platform, information asymmetry exists between the borrower and
the online credit platform and between lenders and borrowers. Since financial markets are
part of overall economic activity, asymmetric information can also cause market failure:
the party with the information advantage pursues the maximization of its own interest,
leading to adverse selection and moral hazard. This can bring run risk to the financial
market and cause Internet financial risk. Second, on P2P platforms, borrowers may default
and fail to repay loans on time; this credit risk undermines confidence across the whole
industry and thus causes Internet financial risk. Third, because some P2P platforms are
poorly operated, the money they lend cannot be recovered. Investors cannot meet their
demands when they want to withdraw money, and violent incidents have occurred on the
platforms, bringing risks to Internet finance.

3.3 Analysis of Risk Conduction Path of Crowdfunding

The risk transmission path of crowdfunding to Internet finance is shown in Fig. 4.


As a new form of Internet finance, crowdfunding is developing rapidly, but the
corresponding rules and regulations are not perfect. If the system is not perfect, both
sides of crowdfunding may violate the rules, and the risks will be spread among peers
or financial institutions, thus causing risks of Internet finance. In addition, the unsound
system will make the fund-raisers provide false information, or the platform will use the
raised money for other purposes, which will cause a credit crisis and thus bring risks to
Internet finance.

Fig. 4. The risk transmission path of crowdfunding to Internet finance

4 Internet Financial Risk Transmission Based on the Bank Run


Model
Based on past cases of risk transmission in Internet finance, we find that if, in the
process of financial informatization, only one financial institution is at risk and timely
measures are taken, there will not be much impact on the economy and society. However, if
timely measures are not taken, financial risks will be transmitted to other financial
enterprises and even the entire financial industry, and such risk transmission is bound
to cause serious consequences for the whole economy and society. In 1983, Diamond and Dybvig jointly
proposed the bank run model, demonstrating the vulnerability of banking institutions
and the possibility of bank run behavior from game theory. This part elaborates the risk
transmission mode of Internet finance by establishing the bank run model and studies the
possibility of bank run risk using game theory. It provides a policy basis for preventing
financial risks [4].

4.1 Study Hypotheses


H1: In the process of financial informatization, every investor in financial products is
rational and risk-averse, and rational agents pursue the maximization of investment
returns.
H2: The model has three periods, T = 0, 1, 2. An investor deposits one unit of money at
T = 0. If the investor withdraws for consumption at T = 1, he still receives the one unit
of capital; if he chooses to consume at T = 2, he receives R, with R > 1.
H3: Investors are of two types, but at T = 0 every investor is identical and does not yet
know which type he belongs to. At T = 1, some investors lose patience and bring their
consumption forward; we call them type I investors. The remaining investors stay patient,
delay consumption, and choose to consume at T = 2; we call them type II investors.

4.2 Model Construction


The utility function of the first type of investor in θ:

u(c1, c2; θ ) = u(c1) (1)

The utility function of the second type of investor in θ:

u(c1, c2; θ ) = ρu(c2) (2)

c1 and c2 represent the amounts withdrawn by type I and type II investors, θ represents
the state variable, and ρ represents the time preference. The function u(c) is twice
continuously differentiable, strictly increasing, and strictly concave. The total
investment utility function of all investors can be obtained from (1) and (2):

E[U(c1, c2; θ)] = pu(c1) + (1 − p)ρu(c2) (3)

where p is the proportion of type I investors.


Assuming that in the process of financial product trading the total amount of investment
in the market is 1, the investment budget constraint of the economic system is:

pc1 + (1 − p)c2/R = 1 (4)

Under the condition of pursuing the maximization of investment benefits, the


Lagrange function is constructed to solve the maximum solution of total investment
utility:
 
L = pu(c1) + (1 − p)ρu(c2) + λ[1 − pc1 − (1 − p)c2/R] (5)

Taking the partial derivatives of (5) with respect to c1, c2, and λ gives:

∂L/∂c1 = pu′(c1) − λp = 0 (6)

∂L/∂c2 = (1 − p)ρu′(c2) − λ(1 − p)/R = 0 (7)

∂L/∂λ = 1 − pc1 − (1 − p)c2/R = 0 (8)
Combining (6), (7), and (8), we get:

u′(c1*) = ρRu′(c2*) (9)
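As a quick consistency check on the derivation, a minimal SymPy sketch (symbol names are illustrative) recovers condition (9) from the Lagrangian (5):

```python
import sympy as sp

c1, c2, lam, p, rho, R = sp.symbols("c1 c2 lam p rho R", positive=True)
u = sp.Function("u")

# Lagrangian from Eq. (5)
L = p*u(c1) + (1 - p)*rho*u(c2) + lam*(1 - p*c1 - (1 - p)*c2/R)

# First-order conditions (6) and (7), each solved for the multiplier lambda
lam_from_c1 = sp.solve(sp.Eq(sp.diff(L, c1), 0), lam)[0]  # = u'(c1)
lam_from_c2 = sp.solve(sp.Eq(sp.diff(L, c2), 0), lam)[0]  # = rho*R*u'(c2)

# Equating the two expressions reproduces Eq. (9): u'(c1*) = rho*R*u'(c2*)
print(sp.Eq(lam_from_c1, lam_from_c2))
```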

This expression gives the condition that must be satisfied in equilibrium and is the
Pareto-optimal solution. Since u(c) is twice continuously differentiable, increasing, and
strictly concave, the optimum satisfies c1* > 1 and c2* < R: an investor who chooses to
consume at T = 1 receives more than his initial investment. Because of information
asymmetry between investors and the bank, the situation in which every investor consumes
early at T = 1 is also an equilibrium. However, such early consumption is likely to cause
a bank run and hence bank liquidity risk. The liquidity risk caused by bank runs can also
be studied with the game-theoretic methods of modern economics, as shown in Table 1.

Table 1. Returns of the multi-equilibrium model (cells show payoffs as: Investor B, Investor A)

                            Investor A: Not withdraw    Investor A: Withdraw
Investor B: Not withdraw    R, R                        0, C1
Investor B: Withdraw        C1, 0                       C1, C1

From Table 1 we can see that there are two Nash equilibrium strategies, (not withdraw,
not withdraw) and (withdraw, withdraw), with corresponding returns (R, R) and (C1, C1).
Since R > C1, the former equilibrium (not withdraw, not withdraw) is superior to the
latter (withdraw, withdraw) and is the best outcome: both investors choose not to
withdraw, and there is no run on the bank. However, because of information asymmetry,
both investors may choose to withdraw. Policymakers do not want this equilibrium to
happen, but since the two equilibria arise randomly, once the run equilibrium appears, a
run occurs in the market. If only one Internet institution is at risk, the effect of the
risk is limited; but if the risk is transmitted to other peers or the financial
industry, it can lead to the collapse of the entire financial market [5].
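The coexistence of the two equilibria can be verified mechanically. The sketch below uses illustrative numbers with R > C1 > 0 and checks best responses by brute force:

```python
import itertools

R, C1 = 1.5, 1.0  # illustrative values with R > C1 > 0

# payoff[(a, b)] = (payoff to A, payoff to B); 0 = stay, 1 = withdraw at T=1
payoff = {
    (0, 0): (R, R),
    (0, 1): (0.0, C1),
    (1, 0): (C1, 0.0),
    (1, 1): (C1, C1),
}

def is_nash(a, b):
    # neither investor can gain by unilaterally switching strategy
    best_a = all(payoff[(a, b)][0] >= payoff[(alt, b)][0] for alt in (0, 1))
    best_b = all(payoff[(a, b)][1] >= payoff[(a, alt)][1] for alt in (0, 1))
    return best_a and best_b

print([s for s in itertools.product((0, 1), repeat=2) if is_nash(*s)])
# -> [(0, 0), (1, 1)]: the no-run and run equilibria coexist
```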

5 Prevention Measures of Internet Financial Risks

Firstly, improve risk prevention and control capabilities and strengthen network security
supervision measures. Network security is the premise of the development of Internet
finance, but also an important support force for the development of inclusive finance.
Only by improving network security can we gain people's confidence and make them
genuinely accept Internet financial services. Only by meeting customer demand can the
development of Internet finance be continuously strengthened, and only where there is
demand is there progress. Customers will put forward practical suggestions
while accepting Internet finance. Financial practitioners can promote the development of
financial informatization according to customers’ needs, so as to promote the continuous
progress of Internet finance. Therefore, the development of Internet finance cannot be
separated from the care of network security, and the continuous improvement of network
security can promote the further development of inclusive finance.

Secondly, improve relevant laws and regulations. To stabilize the development of Internet
finance, the state has promulgated relevant laws. The publication of the Guidance on
Promoting the Healthy Development of Internet Finance in 2015 placed certain constraints
on the development of Internet finance and accurately classified financial forms such as
third-party payment, crowdfunding, and P2P. But documents issued as "guidance",
"recommendations", "measures", and the like have weak binding force, leaving loopholes in
Internet finance. At present, China lacks a complete legal and regulatory framework:
implementation details and supporting measures are missing, and China's financial
regulator lacks independence, with no independent agency for Internet financial
regulation. Internet finance is a double-edged sword: it has the advantages of low cost,
high efficiency, portability, and universality, but it also raises legal issues, such as
the common phenomena of buying and selling citizens' personal information and telecom
fraud. It is therefore imperative to refine the obligations of the telecommunications
industry regarding network fraud and to perfect the corresponding punishment mechanism
for telecom network fraud.
Thirdly, we will further improve the social credit system. Internet financial trading is
network-based, so it places higher demands on consumers' creditworthiness. Still, China's
social credit system lags behind the rapid development of Internet finance. If third-party
payment platforms have low entry barriers and relatively simple payment verification
methods, this will not only bring risks to subsequent management but also create
opportunities for criminals. Therefore, setting up a credit rating system and perfecting
the social credit system can reduce Internet financial risk and support effective
evaluation of the credit risks of Internet finance. Increasing the criminal and illegal
costs of Internet finance and strengthening the sharing of credit information among the
government, public security departments, and the market are conducive to creating a green
and safe operating environment for Internet finance.

6 Conclusion
According to the characteristics of Internet finance, this paper deduces the risk trans-
mission mechanism of Internet finance by using the bank run model and systematically
expounds the path and principle of risk transmission of Internet finance. Based on the
analysis of the development of Internet finance, the risk transmission path of the influence
of the three forms of Internet finance, such as third-party payment, P2P and crowdfund-
ing, on Internet finance is proposed, and the return of the multi-equilibrium model is
constructed by game theory. According to the characteristics of Internet risk transmis-
sion, this paper puts forward how to use it for risk management to improve the risk
governance ability and risk management efficiency of Internet finance.

Acknowledgments. This work was supported by the grants of 2017 National Social Science
Foundation of China (No. 17BJY119).

References
1. Liao, Y.: The supervision research on China’s internet financial development and risk——
based on P2P Platform, Yu’ebao and the third-party payment as an example. Econ. Manage.
29, 51–57 (2015)
2. Grisse, C., Nitschka, T.: On financial risk and the safe haven characteristics of Swiss franc
exchange rates. J. Empirical Financ. 32, 153–164 (2015)
3. Chan, S.: Financial repression and financial risk: the case of China. Post-Communist Econ.
33(1), 119–132 (2021)
4. Chuanmin, M.I., Runjie, X.U., Jing, T.: Analysis of internet financial spatial aggregation and
systematic risk prevention——based on t-SNE machine learning model. Collected Essays
Finance and Econ. (2019)
5. Rongda, C., Lean, Y., Chenglu, J.: The development process, development mode and future
challenges of internet Finance in China. J. Quant. Tech. Econ. 37(01), 3–22 (2020)
Emotion Analysis System Based on SKEP Model

Zhang Yanrong1,2(B) , Zhang Yuxuan1,2 , and Xie Yunxi1,2


1 Information Engineering, Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. In recent years, with the rise and development of e-commerce and
social networking, a large amount of subjective text has appeared on the Internet.
The sentiment analysis task is defined as the process of analyzing, processing,
summarizing, and reasoning over subjective texts that carry emotional color;
sentiment analysis technology is thus designed to deal with subjective text. This
paper proposes a sentiment analysis system based on the SKEP model to handle
network neologisms, ambiguous sentences, irony, and implicit emotional expression
in subjective texts. In addition, traditional sentiment analysis methods often
suffer from low accuracy, weak semantic representation, incomplete feature
extraction, and low contextual correlation. Through pre-training on large-scale
unsupervised e-commerce text, the model is fine-tuned on small-scale supervised
data. On the same data set, compared with the SKEP baseline, the F1 value of the
opinion extraction task and the accuracy of the sentence-level sentiment analysis
task were significantly improved, but the accuracy of evaluation-object-level
sentiment analysis did not show good results.

Keywords: Sentiment analysis · Subjective text · Viewpoint extraction ·
Sentence-level sentiment analysis · Evaluation-object-level sentiment analysis

1 Introduction

With the emergence of subjective text, the sense of participation of Internet users has
been enhanced, and the stickiness and retention of app users have also increased. The
task of sentiment analysis is defined as analyzing, processing, inducing, and reasoning
over subjective texts with emotional color: corresponding to the input text, it outputs
five dimensions, namely the described entity, attribute, emotion, opinion holder, and
time. At present, most emotion analysis technologies mainly solve the extraction of the
three dimensions of described entity, attribute, and emotion, while the two dimensions
of opinion holder and time are mainly handled directly by existing information
extraction techniques. According to the different emphases of the techniques, we divide
emotion analysis into three categories: opinion extraction, sentence-level emotion
classification, and more fine-grained evaluation object-level emotion classification.
Opinion extraction and sentence-level sentiment classification both extract a single
emotion dimension, and the only difference between them is the granularity


of the input text: one operates at the granularity of words, the other at the granularity
of sentences or documents. However, evaluation object-level emotion classification
must extract information along multiple dimensions such as entity, attribute, and
emotion, which is technically more difficult to realize.
We hope to apply sentiment analysis technology to e-commerce reviews to automatically
extract users' opinions and their polarity, such as fast logistics, authenticity,
cost-effectiveness, and ease of use. On the one hand, automated analysis of these
opinions can help users quickly understand the characteristics of a product and support
consumer decision-making; on the other hand, it can help businesses promptly discover
the advantages and disadvantages of their products and improve them.

2 Related Work

Because opinion extraction techniques are usually fairly basic, we can eventually
obtain a sentiment analysis result for each word and then use these results as a lexicon
for other sentiment classification tasks. Hence, many people refer to opinion extraction
techniques as building sentiment dictionaries. Common methods of constructing an
emotion dictionary include manual annotation and automatic methods [1]. The
advantage of manual annotation is its very high accuracy; the disadvantage is the large
labor cost required. There are also partially automatic methods, which are mainly
based on manually annotating prominent seed words and then automatically annotating
the emotion of other words. Constructing an emotion dictionary is an important link in
the task of emotion analysis. Yang and Zhang [2] argued that the emotion dictionary
can reflect the unstructured features of texts and adopted a partially automatic method
that uses the emotion dictionary to screen words in sentences. Li and Cui [3] proposed
that using an emotion dictionary for corpus emotion analysis is an important
application, underscoring the significance of the dictionary. Through in-depth research
on emotion dictionaries, Zargari et al. [4] proposed an emotion dictionary based on the
analysis of emotion components such as emotion words, negative words, and intensifier
words, which compensates for the heterogeneous influence of compound emotional
expressions across languages.
Sentence-level emotion analysis aims to output the overall emotional tendency of a
whole sentence or discourse. It is generally accepted that this line of study started with
Pang [5] and Turney [6]. Generally speaking, sentiment analysis at the sentence or
discourse level is treated as a typical text classification problem. In terms of specific
methods, it has gone through three stages: rule-based methods, traditional machine
learning, and deep learning methods [7]. The rule-based method requires experts to set
rules, which demands substantial labor costs.
Moreover, the rule-based method has no generalization ability, and its performance
is relatively weak [8]. Over time, with the development of traditional machine learning
technology, methods based on traditional machine learning gradually became
mainstream [9]. Traditional machine learning methods require the manual design of
some features, and the quality of feature design will greatly

affect the model’s effectiveness. With the emergence of deep learning methods in recent
years, a great advantage is that there is no need to define some features manually [10].
The evaluation object-level emotion analysis task, also called aspect-level emotion
analysis, is a more fine-grained capability: given a whole sentence or document, it
outputs sentiment analysis results for a specific evaluation object or dimension. It is
divided into the sentiment analysis task for a given evaluation object and the joint
extraction of evaluation objects and sentiments [11]. The former first requires the target
entity to be analyzed, together with the input text to be analyzed, and the model must
produce a judgment from these two inputs. The joint extraction task, by contrast, only
requires the user to input the text, and the model automatically identifies the described
object and its corresponding sentiment [12].

3 Method Research Based on the SKEP Pre-training Model


3.1 Analysis of Word-Level Emotion
Word-level sentiment analysis identifies the emotion corresponding directly to an input
word. For example, the input “too good” corresponds to a positive emotion, while “sad”
corresponds to a negative one. Since the word-level technique is relatively basic and
yields a sentiment result per word, the results of existing research on basic word-level
sentiment analysis can support other sentiment classification tasks in the form of a
dictionary. Therefore, many people refer to word-level emotion analysis techniques as
building an emotion dictionary [13].
Emotion dictionaries also differ in how they represent words. Generally speaking,
there are two ways of representation [14]. The first is a discrete representation, which
directly marks a word as positive, negative, or neutral. This method often has problems:
it cannot express the strength of a word's emotion. “Happy” is positive and “wonderful”
is positive, but “wonderful” expresses a stronger positive emotion, and a discrete
representation cannot distinguish them. Another problem is that, for two words of the
same polarity, we cannot compare their similarity. If we want to know whether the
emotional expression of “happy” is more similar to “pleased” or to “great”, the discrete
method cannot deal with it.
Because of these problems, scholars put forward continuous multidimensional
representation methods. Instead of categorizing a word into a single positive or negative
class, this approach represents it as a multidimensional real-valued vector, where each
dimension is given a different meaning. For example, in the commonly used VAD
model [15], the emotional information of a word is expressed as a three-dimensional
vector: the first dimension represents the positive/negative polarity of the feeling, the
second the strength of the emotion, and the third the subjectivity or objectivity of the
emotion. This method solves the problems above well, as shown in Table 1.

Table 1. Continuous multidimensional representation of the VAD model.

Sample      Coordinates (polarity, intensity, subjectivity)
Happy       (0.7, 0.5, 0.7)
Wonderful   (0.6, 0.9, 0.3)
Pleased     (0.7, 0.55, 0.7)

With the continuous multidimensional representation, we can see that “happy” and
“wonderful” differ greatly in the second dimension, which means “wonderful” expresses
a more intense emotion, while the vectors of “happy” and “pleased” are close to each
other; that is, “happy” and “pleased” are similar in emotional expression.
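As a minimal illustration (not part of the original paper), the VAD coordinates in Table 1 can be compared with cosine similarity; the word list and values below are simply the illustrative entries of Table 1:

```python
import numpy as np

# Illustrative VAD vectors taken from Table 1:
# (polarity, intensity, subjectivity)
vad = {
    "happy":     np.array([0.7, 0.5,  0.7]),
    "wonderful": np.array([0.6, 0.9,  0.3]),
    "pleased":   np.array([0.7, 0.55, 0.7]),
}

def similarity(w1, w2):
    """Cosine similarity of two VAD vectors; closer to 1 = more similar."""
    a, b = vad[w1], vad[w2]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(similarity("happy", "pleased"))    # ~0.999: near-synonyms
print(similarity("happy", "wonderful"))  # ~0.868: differ in intensity
```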

3.2 Sentence or Discourse Level Emotion Analysis Research

Sentence or discourse level emotion analysis aims to output the overall emotional
tendency of a whole sentence or discourse. For example, for “What China has achieved
is really incredible”, we need the SKEP model to understand the overall emotion
expressed in this sentence and then output the corresponding sentiment classification
result, together with the positive and negative probabilities.
Generally speaking, sentiment analysis at the sentence or discourse level is treated
as a typical text classification problem. In terms of specific methods, it has gone through
three stages: rule-based methods, traditional machine learning, and deep learning
methods. The rule-based method requires experts to set rules, which not only consumes
a lot of labor but also generalizes weakly. Over time, with the development of traditional
machine learning technology, methods based on traditional machine learning gradually
became mainstream. First, the discrete input text is turned into a continuous
representation, which is the process of feature extraction and feature selection [16].
Consider “This store has great facilities. I'm very satisfied.” First, we manually define
the features that we think can express the emotion of a sentence; the word segmentation
result itself may be a useful emotional feature. We then use the feature name combined
with the feature value to represent the result of word segmentation.
Table 2. Feature extraction and feature selection samples.

Feature type        Feature sample
Word segmentation   I:1 Very:1 Satisfied:1
N-grams             Great_facilities:1
Emotional words     Pos_satisfied:1
Degree words        Degree_very:1

Another kind of feature is a two-word fragment, which extracts N-grams as features
useful for emotional classification, as shown in Table 2.
Similarly, emotion words and degree words can be used to construct features. After
all the features are constructed, the discrete input text can be expressed as a real-valued
vector, and the vector can then be fed to a classification model. Common models include
SVM, maximum entropy, Naive Bayes, logistic regression, etc. [17].
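The following sketch illustrates this traditional paradigm under simplifying assumptions: scikit-learn's CountVectorizer stands in for the manual unigram/bigram features of Table 2, and the toy reviews and labels are invented for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

reviews = ["great facilities very satisfied",
           "terrible service never again",
           "fast delivery very happy",
           "poor quality very disappointed"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# Feature extraction: word and 2-gram counts play the role of the
# manually designed "feature name : feature value" pairs in Table 2.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(reviews)

# A linear classifier (here logistic regression) maps the real-valued
# feature vector to a sentiment label.
clf = LogisticRegression().fit(X, labels)
print(clf.predict(vectorizer.transform(["very satisfied with the facilities"])))
```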
Besides feature extraction and feature selection, we also need to define a loss
function to guide model training. When the model has been trained for enough rounds
and performs well, we can deploy it and predict the emotion of new samples. See Fig. 1.

Fig. 1. Traditional machine learning paradigm.

Traditional machine learning methods require the manual design of some features,
and the quality of feature design will greatly affect the effectiveness of the model. With
the emergence of deep learning methods in recent years, compared with traditional
machine learning methods, a great advantage is that there is no need to define some
features manually. The input text only needs to be represented by embedding encoding,
and then the model can automatically learn and extract features [18].

3.3 Evaluation Object-Level Emotion Analysis Research


In the scenario of e-commerce product analysis, in addition to analyzing the emotional
polarity toward the whole product, sentiment analysis can be refined by taking specific
“aspects” of the product as the analysis subject. For example, in “This product has good
performance, but the price is too expensive”, the “performance” of the product is
positive, while the “price” is negative [19].
The evaluation object-level emotion analysis task can be divided into the sentiment
analysis task for a given evaluation object and the joint extraction of evaluation objects
and emotions. The most classical model for the sentiment analysis task of a given
evaluation object is the LSTM model based on the attention mechanism [20], as shown
in Fig. 2.

Fig. 2. LSTM model based on attention mechanism.

This model first represents each input word as a word vector and then feeds the
sequence into the LSTM model, so that the hidden vector of each position can be
obtained. The model introduces a new embedding vector Va for the currently given
evaluation object, and this vector is learned automatically during training. For example,
“Mi phones” is represented as Va; then attention is used to calculate the correlation
between Va and the hidden vectors of each position of the input text (H1, …, Hn),
yielding attention weights. The attention model can be understood as a method of
automatic correlation calculation: it learns [21] which words in the input text are
important for classifying the emotion, and the attention weight vector gives higher
weight to these words. After obtaining the attention weight vector, the model performs
a weighted sum of H over all positions based on these weights, producing the semantic
representation r of the current evaluation object, and r is then used for emotion
classification. The advantage of the model is that, by introducing the evaluation object
vector Va and the attention mechanism, it learns a semantic representation related to
each evaluation object, so using this representation for emotion classification works
better.
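A minimal PyTorch sketch of this idea is given below; it is an illustration of the attention computation described above, not the exact architecture of [20], and all layer sizes, the aspect vocabulary, and the random inputs are assumptions:

```python
import torch
import torch.nn as nn

class AspectAttentionLSTM(nn.Module):
    def __init__(self, vocab_size, n_aspects, emb=64, hid=64, n_classes=3):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb)
        self.aspect_emb = nn.Embedding(n_aspects, emb)  # the learned Va
        self.lstm = nn.LSTM(emb, hid, batch_first=True)
        self.score = nn.Linear(hid + emb, 1)            # attention scorer
        self.out = nn.Linear(hid, n_classes)

    def forward(self, tokens, aspect):
        H, _ = self.lstm(self.word_emb(tokens))          # (B, T, hid)
        va = self.aspect_emb(aspect).unsqueeze(1)        # (B, 1, emb)
        va = va.expand(-1, H.size(1), -1)                # repeat Va per position
        alpha = torch.softmax(self.score(torch.cat([H, va], -1)), dim=1)
        r = (alpha * H).sum(dim=1)                       # weighted sum of H -> r
        return self.out(r)                               # classify from r

model = AspectAttentionLSTM(vocab_size=1000, n_aspects=10)
logits = model(torch.randint(0, 1000, (2, 12)), torch.tensor([3, 7]))
print(logits.shape)  # torch.Size([2, 3])
```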
The joint extraction task of evaluation objects and sentiments is usually treated as a
typical information extraction problem, for which the bidirectional LSTM + CRF model
is the most classic approach [22]. First, each word is represented as an embedding; the
embedding vectors are then fed into the LSTM in sequence order, giving the hidden
vector at each position; on top of these hidden vectors, a CRF layer produces a sequence
label for each position. In this way, according to the CRF annotations, the extraction
results of the model can be output directly. The advantage of the model is that, by
introducing a sequence annotation task, the aspect and the emotion are extracted
together [23].
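A compact sketch of such a BiLSTM + CRF tagger follows, assuming the third-party pytorch-crf package supplies the CRF layer; the tag set size and all dimensions are illustrative, not those of [22]:

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size, n_tags, emb=64, hid=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hid, n_tags)  # emission scores per position
        self.crf = CRF(n_tags, batch_first=True)

    def loss(self, tokens, tags):
        H, _ = self.lstm(self.emb(tokens))
        return -self.crf(self.emit(H), tags)   # negative log-likelihood

    def decode(self, tokens):
        H, _ = self.lstm(self.emb(tokens))
        return self.crf.decode(self.emit(H))   # best tag sequence per sentence

model = BiLSTMCRF(vocab_size=1000, n_tags=5)
tokens = torch.randint(0, 1000, (2, 8))
tags = torch.randint(0, 5, (2, 8))
print(model.loss(tokens, tags).item(), model.decode(tokens))
```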

3.4 Research on the SKEP Pre-training Model Method


Introduction of SKEP Pre-training Model. The emergence of pre-training models in
recent years has defined a new paradigm for NLP tasks: pre-training on large-scale
unsupervised data and then fine-tuning on small-scale supervised data continues to
refresh the state of the art in NLP [24].
However, the existing general pre-training models are mostly trained on objective
texts such as news or wikis, and they pay more attention to entities or textual connectives
in factual text. Sentiment analysis, in contrast, focuses on emotions or opinions hidden
in subjective text. Based on this idea, by training with more subjective data, or by using
a known sentiment dictionary as prior emotional knowledge for enhanced training, the
model's performance on sentiment classification tasks can be continuously improved.
Therefore, this paper builds on the SKEP model [25] and fine-tunes it on small-scale
supervised data to obtain better results than the original SKEP model. The model is
shown in Fig. 3.

Fig. 3. SKEP model.

The method of this model is to continue training the pre-training model with a large
amount of subjective text data, such as the review “This product came really fast, and I
appreciated it”, and to mask the words considered related to sentiment in these sentences
so that the model learns to restore them automatically. For example, the word “fast” is
masked and the model must predict, in the pre-training stage, that the original position
should hold an emotional word; then the word “appreciated” is masked and the model
must predict which word occupies that position, or even whether the word there is a
positive or a negative emotional word. Through these simple and effective methods,
the SKEP model gains a better perception of multiple types of emotional knowledge
and achieves better results on downstream sentiment analysis tasks.
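The masking step can be sketched as follows; the tiny sentiment lexicon and whitespace tokenization are simplified stand-ins for SKEP's automatically mined sentiment knowledge, not the paper's actual procedure:

```python
# Toy lexicon standing in for mined sentiment knowledge (assumption).
SENTIMENT_LEXICON = {"fast", "appreciated", "great", "terrible"}

def mask_sentiment_words(text, mask_token="[MASK]"):
    """Replace sentiment words with [MASK]; record positions to predict."""
    tokens = text.lower().split()
    masked, targets = [], []
    for i, tok in enumerate(tokens):
        word = tok.strip(".,!?")
        if word in SENTIMENT_LEXICON:
            masked.append(mask_token)
            targets.append((i, word))  # (position, original word) to recover
        else:
            masked.append(tok)
    return masked, targets

masked, targets = mask_sentiment_words(
    "This product came really fast, and I appreciated it")
print(masked)   # ['this', 'product', 'came', 'really', '[MASK]', 'and', ...]
print(targets)  # [(4, 'fast'), (7, 'appreciated')]
```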
Objective Optimization Function. The objective optimization function L (Sentiment
Pre-training Objectives) of the emotional pre-training model SKEP consists of three
objective functions: the sentiment word objective (Lsw), the word polarity objective
(Lwp), and the aspect-sentiment pair objective (Lap), so L = Lsw + Lwp + Lap. The
loss of the sentiment word objective is calculated as shown in formulas (1) and (2):

ŷ_i = softmax(x̃_i W + b)    (1)

L_sw = − Σ_{i=1}^{n} m_i · y_i · log ŷ_i    (2)

where x̃_i is the output vector of the encoder in the transformer, and ŷ_i is the probability
distribution obtained after x̃_i passes through the output layer and then through softmax.
After inference yields a token for each position, the loss is not computed for every
token: only the positions of sentiment words contribute, and positions of non-sentiment
words do not participate in the calculation. This is precisely the role of m_i, which
screens which positions are sentiment words and which are not.
The word polarity objective L_wp is calculated similarly to L_sw. The difference is
that L_wp computes a polarity loss rather than a token loss. In fact, polarity can be
understood here as another type of token, except that there are only two classes: positive
and negative.
The aspect-sentiment pair objective function L_ap is calculated as shown in formulas
(3) and (4):

ŷ_a = sigmoid(x̃_1 W_ap + b_ap)    (3)

L_ap = − Σ_{a=1}^{A} y_a · log ŷ_a    (4)

Here, x̃_1 is the output vector of the [CLS] position, y_a is an aspect-sentiment pair
(an attribute word and emotion word pair), and ŷ_a is the probability estimate of y_a.
Note that a dictionary of aspect-sentiment pairs is assumed to exist in advance, i.e.,
each pair has a corresponding ID representation.
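As a hedged illustration of formulas (1) and (2), the sketch below computes L_sw with random tensors standing in for the transformer outputs x̃_i; all dimensions and the mask positions are assumptions:

```python
import torch
import torch.nn.functional as F

B, T, H, V = 2, 8, 16, 100        # batch, length, hidden size, vocab (assumed)
x = torch.randn(B, T, H)          # stand-in for encoder outputs x̃_i
W, b = torch.randn(H, V), torch.zeros(V)
y = torch.randint(0, V, (B, T))   # original token ids y_i
m = torch.zeros(B, T); m[:, 3] = 1  # m_i = 1 only at masked sentiment words

log_y_hat = F.log_softmax(x @ W + b, dim=-1)           # Eq. (1), in log space
token_nll = F.nll_loss(log_y_hat.view(-1, V), y.view(-1),
                       reduction="none").view(B, T)    # -y_i log ŷ_i per position
L_sw = (m * token_nll).sum()                           # Eq. (2): masked positions only
print(L_sw.item())
```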

4 Experimental Results and Analysis


4.1 Data Set

A total of seven data sets were collected and organized for the opinion extraction
task, the sentence-level emotion classification task, and the evaluation object-level
emotion classification task: COTE-BD, COTE-MFW, and COTE-DP of the Chinese
Academy of Sciences; ChnSentiCorp of the Chinese Academy of Sciences; NLPCC14-SC
of Soochow University; and SE-ABSA16_PHNS and SE-ABSA16_CAME of Harbin
Institute of Technology.

4.2 Experimental Results


F1-score, precision, and recall were used as the evaluation indexes in this experiment,
as shown in formulas (5), (6), and (7):

recall = a / b × 100%    (5)

precision = a / c × 100%    (6)

F1-score = (2 × precision × recall) / (precision + recall) × 100%    (7)

where a is the number of correctly identified evaluation units, b is the number of
evaluation units actually present, and c is the number of evaluation units identified by
the program.
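A direct transcription of formulas (5)-(7) into code, with the example counts invented for illustration:

```python
def prf(a, b, c):
    """a = correctly identified units, b = units actually present,
    c = units the program identified (Eqs. 5-7)."""
    recall = a / b
    precision = a / c
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# e.g. 80 correct out of 100 gold units, with 90 units predicted:
print(prf(a=80, b=100, c=90))  # (0.888..., 0.8, 0.842...)
```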
The experimental environment is shown in Table 3. The results are shown in Table 4.

Table 3. Experimental environment.

GPU Tesla V100


Video Mem 32 GB
CPU 4 Cores
RAM 32 GB
Disk 100 GB
Python version Python 3.7
Framework version PaddlePaddle 2.1.2

Table 4. Experimental results.

Data set name F1 value (This paper) F1 value (SKEP)


COTE-BD 0.8921 0.845
COTE-MFW 0.8736 0.879
COTE-DP 0.9045 0.863
Data set name Precision (This paper) Precision (SKEP)
ChnSentiCorp 0.9675 0.9650
NLPCC14-SC 0.8425 0.8353
SE-ABSA16_PHNS 0.6923 0.8291
SE-ABSA16_CAME 0.7194 0.9006

The SKEP model data were taken from its last update on June 30, 2020. On the basis
of the SKEP model, fine-tuning was performed on small-scale supervised data, adjusting
the batch size (batch_size), the number of training rounds (epochs), and other settings
to achieve the best training effect. The Chinese pre-trained model used is
skep_ernie_1.0_large_ch, which continues training the SKEP model on massive Chinese
data on the basis of the ernie_1.0_large_ch Chinese pre-trained model. The experimental
results in Table 4 show that the F1 values of the opinion extraction task improved
overall, with the COTE-DP value increasing by 4%; the accuracy of the sentence-level
emotion classification task changed only slightly, with ChnSentiCorp increasing by
0.25% and NLPCC14-SC by 0.72%; but on the evaluation object-level emotion
classification task, the optimization effect is far inferior to the SKEP model.

5 Conclusion
In recent years, a large number of studies have shown that pre-training models based on
large corpora can learn general language representations, which benefits downstream
NLP tasks and avoids training models from scratch. With the growth of computing
power, the emergence of the deep Transformer model, and improved training
techniques, large-corpus pre-training models have developed continuously from
shallow to deep. The SKEP model uses an unsupervised method to mine emotional
knowledge automatically and then uses this knowledge to construct pre-training targets
so that the machine can learn to understand emotional semantics.

Acknowledgments. Natural Science Foundation of Heilongjiang Province in 2021 Joint Guiding


Project (LH2021F036).

References
1. Zhong, J., Liu, W., Wang, S., Yang, H.: Text emotional analysis methods and application
overview. Data Anal. Knowl. Discov. 5(06), 1–13 (2021)
2. Yang, S., Zhang, N.: Text emotional analysis. Comput. Appl. 1–6 (2021)
3. Li, M., Cui, X.: Usefulness evaluation of user-generated content based on domain emotional
dictionary--take douban reading as an example. Intell. Theor. Pract. 1–15 (2021)
4. Zargari, H., Zahedi, M., Rahimi, M.: GINS: a global intensifier-based N-Gram sentiment
dictionary. J. Intell. Fuzzy Syst. 40(6), 11763–11776 (2021)
5. Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up: sentiment classification using machine
learning techniques. Empirical Methods Nat. Lang. Process. 79–86 (2002)
6. Turney, P.: Thumbs up or thumbs down semantic orientation applied to unsupervised classi-
fication of reviews. In: Proceedings of Annual Meeting of the Association for Computational
Linguistics, pp. 417–424 (2002)
7. Wang, T., Yang, W.: A review of text affective analysis methods. Comput. Eng. Appl. 57(12),
11–24 (2021)
8. Meng, J., Lv, Pin., Yu, Y., Zheng, Z.: Cross-disciplinary emotional analysis based on CNN.
Comput. Eng. Appl. 1–11 (2021)
9. Tang, L., Xiong, C., Wang, W., Zhou, Y., Zhao, Z.: An overview of short-text emotional
tendency analysis based on in-depth learning. Comput. Sci. Explor. 15(05), 794–811 (2021)
10. Liu, T., Feng, M.: Affective analysis based on in-depth learning. Inf. Commun. 01, 88–89
(2020)

11. Shen, Y., Zhao, X.: An overview of emotional analysis at different levels based on in-depth
learning. Inf. Technol. Stand. (Z1), 50–53+58 (2020)
12. Chen, P., Feng, L.: Overview of aspects extraction in emotional analysis. Comput. Appl.
38(S2), 84–88+96 (2018)
13. Wang, J., Zhang, Z.: Emotional analysis of short text on Weibo based on improved subject
model. Inf. Comput. (Theoretical Version) (06), 134–135+141 (2019)
14. Wang, Q., Wu, Z.: Research progress of view mining technology for public opinion
monitoring. New Industrialization 9(06), 74–77 (2019)
15. Zhang, X., Wang, D.L., et al.: Boosting contextual information for deep neural network
based voice activity detection. In: IEEE/ACM transactions on audio, speech, and language
processing (2016)
16. Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint (2014)
17. Qian, Q., Huang, M., Lei, J., et al.: Linguistically regularized LSTMs for sentiment classi-
fication. In: Proceedings of the 55th Annual Meeting of the Association for Computational
Linguistics (Volume 1: Long Papers) (2017)
18. Teng, Z., Vo D, T., Yue, Z.: Context-sensitive lexicon features for neural sentiment analysis. In:
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
(2016)
19. Li, X., Bing, L., Li, P., et al.: A unified model for opinion target extraction and target sentiment
prediction (2018)
20. Wang, Y., Huang, M., Zhu, X., et al.: Attention-based LSTM for aspect-level sentiment
classification. In: Proceedings of the 2016 Conference on Empirical Methods in Natural
Language Processing (2016)
21. Liu, F., Xu, M., Deng, X.: An emotional analysis combining attention mechanism and sentence
ordering. Comput. Eng. Appl. 56(13), 12–19 (2020)
22. Guan, P., Li, B., Lv, X., Zhou, J.: Bidirectional LSTM emotional analysis with enhanced
attention. Chin. J. Inf. 33(02), 105–111 (2019)
23. Tang, D., Qin, B., Feng, X., et al.: Effective LSTMs for target-dependent sentiment
classification. Comput. Sci. (2015)
24. Devlin, J., Chang, M W., Lee, K., et al.: BERT: pre-training of deep bidirectional transformers
for language understanding (2018)
25. Tian, H., Gao, C., Xiao, X., et al.: SKEP: sentiment knowledge enhanced pre-training for
sentiment analysis. In: Proceedings of the 58th Annual Meeting of the Association for
Computational Linguistics (2020)
Evaluation of Enterprise Development
Efficiency Based on AHP-DEA

Wenli Geng1,2(B) and Mengyu Gao1,2


1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. With the rapid development of China's economy and the accelerated
expansion of enterprise scale, it is necessary to conduct objective, quantitative
evaluation to determine whether the investment of enterprises obtains the best
efficiency and enhances their competitiveness. To overcome the lack of comparability
in traditional absolute evaluation methods and better evaluate the relative development
status of enterprises, the relative input-output efficiency should be evaluated across
different enterprises. A comprehensive development performance evaluation system
is constructed based on the analytic hierarchy process (AHP) model and the data
envelopment analysis (DEA) model. First, the AHP method is used to determine the
weight of each evaluation index at the same level and relative to the target level. The
indexes with larger weight values relative to the target level are selected as the indexes
of the DEA evaluation model, thus avoiding the randomness of DEA index selection.
Based on the annual report data of tourism companies between 2017 and 2019, the
AHP-DEA model was used to evaluate the static efficiency of enterprise development.
Then the DEA-Malmquist model was used with three years of panel data for dynamic
competitiveness analysis. Finally, the enterprises with effective and ineffective
development were pointed out, and the reasons for ineffective development were
analyzed, aiming to provide useful evidence for managers' decisions.

Keywords: Enterprises competitiveness · AHP · DEA · Evaluation model

1 Introduction
In order to obtain the largest market share, the scale of enterprises continues to expand, the
product system is increasingly improved, and the market order is continuously optimized
[1]. And with the development of enterprises, the competition among enterprises is
also intensifying. Since enterprise competitiveness is closely related to multiple factors
such as total assets, the number of products, operating income, marketing expenses,
profits, etc., simply expanding the scale to have more market share is not necessarily
beneficial for enterprise development. Therefore, effective evaluation of competitiveness
in enterprise development has great application value and can be widely used in other
industries.


Existing research mainly evaluates enterprise competitiveness from two aspects,
qualitative and quantitative. Qualitative analysis is mainly conducted from enterprise
culture and the connotation of enterprise competitiveness [2]. Quantitative research
mainly establishes evaluation index systems from internal and external aspects, such as
the external environment in which the enterprise operates and the enterprise's own
development, using hierarchical analysis, principal component analysis, neural
networks, fuzzy comprehensive evaluation, data envelopment analysis (DEA),
comprehensive index evaluation, and survey questionnaires. Many of these methods
select evaluation indexes with subjectivity and arbitrariness rather than scientific
objectivity, and there is no uniform standard for selecting enterprises. To overcome this
subjectivity and arbitrariness, a comprehensive and objective evaluation model of
enterprise competitiveness is constructed using two widely used multi-criteria methods,
AHP hierarchical analysis and DEA data envelopment analysis. The main contributions
of this paper include:

(1) Using AHP to establish the hierarchical structure of the enterprise, the elements of
    each level are described quantitatively so that the same-level weights and the
    weights relative to the target level can be calculated and the weight order of each
    index determined.
(2) The DEA evaluation indexes are determined, and the DEA-BCC model is used for
    a static analysis of the development performance of 10 listed tourism enterprises.
(3) The DEA-Malmquist model is used to dynamically analyze the development of the
    10 listed tourism enterprises over the three years from 2017 to 2019.

2 Classical Evaluation Model


2.1 AHP Level Analysis Model
The AHP hierarchical analysis model [3–5] decomposes a problem into different
elements by analyzing the factors contained in a complex problem and their
interconnections, and groups these elements into different levels, forming a multilevel
structure. At each level, a judgment matrix is established by comparing the elements of
that level pair by pair according to a specified criterion. By calculating the maximum
eigenvalue of the judgment matrix and the corresponding eigenvector, the weights of
the elements of that level with respect to that criterion are derived; on this basis, the
combined weights of the elements of each level for the overall goal are calculated [6].
When calculating the weights, the consistency of the judgment matrix must be tested
by calculating its consistency index. If the consistency index is less than 0.1, the
judgment matrix is valid; finally, the total ranking weights of the scheme layer relative
to the target layer are obtained, which provides the basis for selecting the optimal
scheme.
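As a minimal sketch of this calculation (the 3×3 judgment matrix below is illustrative, not the paper's actual matrix), the weights are the normalized principal eigenvector, and consistency is checked via CI and the standard random index RI:

```python
import numpy as np

# Illustrative pairwise judgment matrix (e.g. profitability vs scale vs solvency).
A = np.array([[1,   3,   5],
              [1/3, 1,   3],
              [1/5, 1/3, 1]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                      # normalized weight vector

n = A.shape[0]
CI = (eigvals[k].real - n) / (n - 1)          # consistency index
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}[n]  # standard RI table
print(w, "CR =", CI / RI)         # CR < 0.1 -> judgments acceptably consistent
```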

2.2 DEA Data Envelopment Model


DEA data envelopment analysis is an efficiency evaluation model [7, 8] widely used to
evaluate decision units of the same type with multiple input and output indexes. It has
unit invariance and does not assume a functional form: changes in the numerical units
of the input-output index variables do not affect the final efficiency values, and the
units of different indexes may be inconsistent. No prior assumptions are required for
the production function, which simplifies the complex production process; DEA can
thus overcome the influence of subjective factors and simplify the algorithm while
improving accuracy. Since DEA contains many models with different conditions and
emphases, the BCC model and the Malmquist model were chosen to analyze the
companies' competitiveness from static and dynamic quantitative aspects, respectively [9].

BCC Model. Relative to the DEA-CCR model based on constant returns to scale, first
proposed by A. Charnes, W.W. Cooper, and E. Rhodes in 1978, the BCC model [10]
adds the condition that the weight coefficients λ of the reference benchmark sum to 1,
so the analysis results account for pure technical efficiency as well as scale efficiency.
The BCC model focuses on analyzing the static efficiency values of multiple decision
units simultaneously. To measure the efficiency of a set of n decision making units
(DMUs) of the same kind, denoted {DMU_j}_{j=1}^n, each DMU has m input indexes,
denoted {x_i}_{i=1}^m, and s output indexes, denoted {y_r}_{r=1}^s. A non-Archimedean
infinitesimal constant is introduced, and s_r^+ and s_i^− are the slack variables for
outputs and inputs, respectively. When a firm reaches the Pareto optimal state, θ = 1,
s_r^+ = 0, and s_i^− = 0; this is a strongly efficient decision unit. When θ = 1 without
the slack variables being zero, the decision unit is weakly efficient.
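An output-oriented BCC score can be sketched as a linear program as below; the toy input/output data and the use of scipy's linprog are assumptions for illustration, not the paper's DEAP implementation:

```python
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0, 3.0, 4.0],    # m x n inputs  (rows: e.g. cost, expenses)
              [1.0, 2.0, 3.0]])
Y = np.array([[3.0, 5.0, 4.0]])   # s x n outputs (row: e.g. revenue)

def bcc_output_efficiency(k):
    """Maximize theta s.t. a convex combination (sum of lambdas = 1,
    the BCC/VRS condition) of observed DMUs dominates DMU k."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.zeros(1 + n); c[0] = -1.0            # variables: [theta, lambdas]
    A_ub = np.zeros((m + s, 1 + n)); b_ub = np.zeros(m + s)
    A_ub[:m, 1:] = X; b_ub[:m] = X[:, k]        # sum λ_j x_j <= x_k
    A_ub[m:, 0] = Y[:, k]; A_ub[m:, 1:] = -Y    # theta*y_k <= sum λ_j y_j
    A_eq = np.zeros((1, 1 + n)); A_eq[0, 1:] = 1.0   # sum λ_j = 1 (VRS)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (1 + n))
    return 1.0 / res.x[0]          # efficiency in (0, 1]; 1 = on the frontier

print([round(bcc_output_efficiency(k), 3) for k in range(3)])  # [1.0, 1.0, 0.8]
```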

Malmquist Index Model. The Malmquist index model was proposed to address the
limitation of the BCC model to same-period (cross-sectional) data; it is a dynamic
efficiency analysis model for multiple decision units over a time horizon. Färe et al.
[11] combined the Malmquist index with DEA in 1994 and proposed the total factor
productivity (TFP) model; TFP refers to the ratio of the value of an economic system's
output to the value of all its inputs, reflecting the combined productivity of all factors
of production. This model decomposes the Malmquist index into the change in
comprehensive technical efficiency (EC) and the change in technological progress
(TC), where EC can be further divided into the change in pure technical efficiency
(PEC) and the change in scale efficiency (SEC).

3 AHP-DEA Enterprise Competitiveness Evaluation Model

3.1 AHP Model to Determine Index Weights

Hierarchical Element Selection for AHP. The competitiveness of enterprise
development reflects the growth status of the enterprise, and the evaluation of enterprise
competitiveness can be measured from three aspects: the enterprise's economic scale,
its profitability, and its solvency.

1) The enterprise’s economic scale can determine the enterprise’s position in the com-
petition of the same industry, as well as the enterprise’s ability to resist risks, and
the corresponding indexes are selected as follows.

Total assets: The total number of assets owned by the enterprise, including current
assets, long-term assets, fixed assets, intangible and deferred assets, and other long-term
assets, etc.
Total Products Produced: The type and number of products.
Owner’s equity: It is the interest of the enterprise’s investors in the enterprise’s net
assets and contains the profit that the owner can make according to the number of assets
provided.

2) Profitability of the enterprise is the driving force for the development of the enter-
prise, mainly including operating income, operating costs, marketing expenses and
net profit.

Operating income: reflects the enterprise's income and is the main support for
maintaining its development. From the size of operating income, one can observe the
operation of the enterprise and its occupation of the industry market.
Operating cost: the cost of business set against turnover. The lower the cost, the
more likely the enterprise is to be profitable; operating costs therefore effectively reflect
the efficiency of business management.
Marketing costs: the various expenses related to marketing activities. In order to
attract new customers and increase market share, enterprises need to run promotional
activities or invest in advertising. Marketing expenses directly affect the profit of the
enterprise, so the enterprise should also control the marketing amount.
Net profit: total profit minus income tax; an important index reflecting the business
efficiency of the enterprise.

3) Enterprise solvency is an important basis for sustainable development of enterprises,


mainly including current ratio and gearing ratio.

The current ratio, the ratio of current assets to current liabilities, is a simple estimate
of short-term debt-servicing capacity.
The gearing (asset-liability) ratio measures an enterprise's ability to use creditors'
funds to carry out business activities. It is an important index of the level of indebtedness
and the degree of risk.
In summary, the overall framework of the AHP model is shown in Fig. 1.

Fig. 1. AHP model for enterprise competitiveness

Determination of AHP Index Weights. By comparing and analyzing the relative


importance of each evaluation index, establishing a hierarchical analysis matrix, and
using AHP software to calculate, the weight values of each evaluation index can be
obtained as shown in Table 1.

Table 1. The weight of each index at the same level and relative to the target level

First level index Second level index Weight of same level Weight for target level
Competitiveness Enterprise scale 0.2790 0.2790
Profitability 0.6491 0.6491
Solvency 0.0719 0.0719
Enterprise scale Total assets 0.0497 0.1782
Total products 0.0196 0.0704
Owner’s equity 0.2096 0.7514
Profitability Operating income 0.1608 0.2477
Net profit 0.3595 0.5538
Operating cost 0.0817 0.1259
Operating expenses 0.0472 0.0727
Solvency Current ratio 0.0629 0.8750
Asset-liability ratio 0.0090 0.1250

It can be seen from Table 1 that among the first-level evaluation indexes, the lowest
weight is given to the solvency of the enterprise, while the highest weight is given to
profitability. Among the second-level indexes, the weight order for the enterprise's
economic scale is owner's equity > total assets > total products produced, and the
weight order for profitability is net profit > operating income > operating cost >
marketing expense.

3.2 Determination of DEA Data Envelopment Analysis Model Indexes

To determine the input and output indexes of the DEA model, the weight values of each
evaluation index are calculated by AHP. The evaluation indexes affecting enterprise
competitiveness are mainly selected from the enterprise's economic scale and
profitability, and the indexes with larger weight values are chosen, which can
comprehensively and accurately reflect the operating conditions and development
trends of these enterprises. Among them, “operating revenue (billion)”, “owner's equity
(billion)”, and “net profit (billion)” are used as the three output indexes, and “operating
costs (billion)” and “operating expenses (billion)” are used as the two input indexes.

4 Empirical Analysis of AHP-DEA Model in the Evaluation


of Tourism Enterprises
4.1 Selection of Decision Units

Based on the list of top-ranked tourism enterprises selected from public data, a total of
10 listed tourism enterprises, namely Ctrip, China National Travel Service, China Youth
Travel Service, Tempus International, Zongxin Travel, Xi'an Tourism, Tibet Tourism,
Huangshan Tourism, Qujiang Cultural Tourism, and Yunnan Tourism Co., were
selected from the Shenzhen and Shanghai markets as research subjects. These
enterprises were selected considering factors such as tourism market ranking, tourism
characteristics, and e-commerce applications, so that representative enterprises reflect
the current development status of the tourism industry.

4.2 Static Efficiency Analysis

Because the size of the tourism market is still expanding, the current market is always
in a state of growth, so the BCC model from the output perspective is selected, i.e.,
inputs are kept constant and output is maximized. The specific efficiencies of the 10
enterprises from 2017 to 2019 were calculated with the DEAP2.1 software, as shown
in Fig. 2.

Fig. 2. The efficiency chart of variable scale return and constant scale return of 10 tourism
enterprises from 2017 to 2019

From Fig. 2, we can see that under the assumption of constant returns to scale (CRS),
each firm's overall development efficiency is low, and most firms are inefficient and far
from the efficient frontier. Thus variable returns to scale (VRS) should be chosen as
the premise.
Analyzing the efficiency of each enterprise from 2017 to 2019: in 2017, eight
enterprises were effective in comprehensive technical efficiency and two were
ineffective; in 2018, only four enterprises were effective and three were ineffective; in
2019, only four enterprises were scale-efficient, while eight were purely technically
efficient. Overall, the comprehensive technical efficiency value under variable returns
to scale decreased from 0.891 in 2017 to 0.878 in 2019, indicating that overall firm
efficiency is decreasing. Pure technical efficiency also decreased, from 0.977 and 0.962
to 0.954, while scale efficiency was effective overall, rising from 0.912 and 0.915 to
0.922, although the increase is small. The decreasing pure technical efficiency of travel
firms indicates that their technical input efficiency is low. From the figure, Ctrip is a
typical example: it was in an effective state in all three years. As a typical representative
of tourism e-commerce focusing on the application and development of advanced
technology, it has been in the leading position in technical efficiency; although its
returns to scale remain unchanged, its development mainly depends on technological
progress.

4.3 Dynamic Efficiency Analysis

The development and operation of an enterprise is a long-term process, which requires


analyzing the efficiency of the enterprise’s operation from a dynamic perspective. The
Malmquist index can reflect the state changes of different enterprises and the state
changes at different stages. Input index and output index data of 10 enterprises from
2017–2019 are inputted into the Malmquist model, and the 3-year efficiency changes of
each enterprise can be synthesized as shown in Fig. 3.

Fig. 3. Chart of total factor productivity of 10 tourism enterprises from 2017 to 2019

(1) Figure 3 shows the dynamic trends of comprehensive efficiency change (effch),
    technological change (techch), pure technical change (pech), scale change (sech),
    and total factor productivity, i.e., the productivity index (tfpch), for the three years
    2017–2019. Total factor productivity change is determined by comprehensive
    technical efficiency change (i.e., resource allocation efficiency) and technological
    progress, while pure technical change and scale change determine comprehensive
    technical efficiency change. Where the productivity index is greater than 1, firm
    productivity rises; where it is less than 1, firm productivity falls.
(2) Stage-by-stage trend analysis of tourism enterprises’ development efficiency during
the three years. Through the Malmquist model, it is also possible to obtain efficiency
changes among different years for all enterprises, as shown in Table 2.

Table 2. Table of efficiency changes from 2017 to 2019

Year effch techch pech sech tfpch


2017/2018 0.992 1.235 0.982 1.009 1.225
2018/2019 0.999 0.827 0.990 1.009 0.826
Mean 0.995 1.011 0.986 1.009 1.006

The table shows that, relative to 2017, the 2018 production index, i.e., total factor
productivity, is greater than 1 and shows an upward trend, above the average. This is
mainly due to the technological progress index of 1.235, which is 23.5% higher than
the 2017 efficiency value. The decline in pure technical efficiency and the increase in
scale efficiency led to a comprehensive efficiency value of 0.992, 0.8% lower than in
2017. One of the main factors is that 2018 was the opening year of the integrated
development of culture and tourism: people's enthusiasm for tourism was high and the
business volume of enterprises increased, so each tourism enterprise increased its
investment in developing new tourism products and expanding enterprise scale, and
tourism revenue rose from 5.40 trillion in 2017 to 5.97 trillion in 2018. The main source
of revenue is the first-tier cities in the north, but second-tier and new first-tier cities
have also become a focus of outbound travel; thus, the national tourism situation is
better and the efficiency of tourism scale is improved.

5 Conclusion

In order to strengthen quantitative, scientific research on the competitiveness of
enterprise development, the AHP method is used to analyze the weights of multiple
evaluation indexes and determine the hierarchical structure of the index system. Taking
the weight of each sub-index relative to the overall target as the basis, the indexes with
higher weights, which have an important influence on enterprise competitiveness, are
selected as the final indexes for the subsequent DEA model, thus avoiding the
arbitrariness of the DEA method in index determination. The financial statements of
listed tourism companies from 2017 to 2019 were taken as the research objects to
increase the credibility of data collection, and the AHP-DEA tourism enterprise
competitiveness evaluation model was established. Using the Deap2.1 software, the
yearly performance changes of each enterprise were analyzed statically with the
DEA-BCC model; and through the DEA-Malmquist productivity index model, the
competitiveness of tourism enterprises was analyzed dynamically across years from five
perspectives: total factor productivity, technical progress change, comprehensive
technical efficiency, pure technical efficiency, and scale efficiency. The example shows
that the AHP-DEA model can not only overcome the subjectivity of using the AHP
model alone for competitiveness evaluation and the arbitrariness of index selection in
the DEA evaluation model, but also determine the ranking of enterprise development,
judge the relative effectiveness or ineffectiveness of enterprise development from the
perspective of total efficiency, analyze the reasons for ineffectiveness, and provide
improvement measures for ineffectively developing enterprises. Therefore, the relative
competitiveness of enterprises can be evaluated more comprehensively.

References
1. Zhang, J.T., Wang, Y., Liu, L.G.: Model system construction of intelligent tourism application
in the context of big data. Enterp. Econ. 36(5), 116–123 (2017)
2. Jiang, A.Y., Li, Z.R.: Research on credit evaluation of science and technology-based SMEs.
Dev. Res. (6), 60–64 (2014)
3. Tian, H.D., Shen, W.H., T, Y.B., Gan, G.J.: Research on the development path of forest
recreation based on hierarchical analysis--take Guangxi Cat’s Hill National Nature Reserve
as an example. Forestry Econ. 1–12 (2020)
4. Li, Y.: Research on risk evaluation of PPP projects based on F-AHP evaluation method. J.
Hunan Acad. Arts Sci. (Nat. Sci. Ed.) 32(04), 69–74 (2020)
5. Li, J.Y.: Exploring the decision making problem of airport cab drivers based on hierarchical
analysis. Sci. Technol. Innov. (34), 14–15 (2020)

6. Xu, X., Peng, G.Y.: Research on efficiency evaluation based on DEA-AHP/GRA - an example
of logistics enterprises in Chang Zhu Tan. Sci. Technol. Manag. Res. (35), 66–70 (2015)
7. Zhou, X.J., Chen, X.X.: A prediction method based on big data fusion DEA and RBF. Stat.
Decis. Mak. (22), 36–39 (2020)
8. Ma, S.Y., Ma, Z.X., Zhang, J., Qiao, J.M.: Basic generalized stochastic data envelopment
analysis method. Oper. Res. Manag. 29(11), 138–143 (2020)
9. Li, X.J., Shao, X.: Research on the effectiveness of road maintenance investment decision
based on optimization DEA model. J. Eng. Manag. 34(05), 81–85 (2020)
10. Zhai, F., Gao, S., Wu, L.X.: Research on financing efficiency of listed pharmaceutical com-
panies in Jiangsu province–an empirical analysis based on DEA-BCC model and Malmquist
index. J. Anhui Coll. Commer. Technol. (Soc. Sci. Ed.) 19(03), 41–46 (2020)
11. Färe, R., Grosskopf, S., Norris, M., Zhang, Z.: Productivity growth, technical progress, and
efficiency change in industrialized countries. Am. Econ. Rev. 84(1), 66–83 (1994)
Innovation Efficiency of Electronic
and Communication Equipment Industry
in China

Yi Song, Huolong Bi, and Yi Qu(B)

Harbin Vocational College of Science and Technology, Harbin 150300, Heilongjiang, China
quy@hrbcu.edu.cn

Abstract. The electronic and communication equipment manufacturing industry
plays an irreplaceable role in China's continued international competitiveness and its
transformation and development. This paper selects panel data of the electronic and
communication equipment manufacturing industry in 20 Chinese provinces from 2009
to 2018. With reasonable input-output indicators, the DDF-GM model is used to
measure the innovation efficiency of the electronic and communication equipment
industry (EECE). The following conclusions are drawn: (1) From 2009 to 2018, the
EECE showed positive growth in most years, and overall growth is positive. (2) From
2009 to 2018, the EECE was highest in the central region, followed by the western
region, and lowest in the eastern region. (3) The decomposition of EECE is driven by
efficiency and technology, and the contribution of technology is greater than that of
efficiency. Based on the empirical results, this paper finally puts forward policy
suggestions to improve EECE.

Keywords: Innovation efficiency · Electronic and communication equipment
industry · DDF-GM model · Malmquist index

1 Introduction

The electronic and communication equipment manufacturing industry occupies a very
important position in China. According to the China Statistics Yearbook on High
Technology Industry, from 2000 to 2018, the number of China's electronic and
communication equipment manufacturing enterprises increased from 3,996 to 14,634;
the average number of employees increased from 1,739,147 to 8,142,256; and the main
business income increased from 587.45 billion yuan to 7,830.99 billion yuan, an average
annual growth of 18.85%, with this proportion increasing year by year. The industry
provides technical support and plays a leading role in related emerging industries, and
it has made great contributions to accelerating economic growth and creating jobs,
becoming a leading industry of China's economy. Therefore, it is necessary to study
the innovation efficiency of the electronic and communication equipment industry
(EECE).


J.A. Schumpeter, an Austrian-American economist, first put forward the concept of
“innovation”. He regarded innovation as the process of establishing a new production
function, believing that innovation introduces a new combination of production factors
and conditions into the production system. Most scholars believe that, although the
innovation efficiency of China's electronic and communication equipment
manufacturing industry has experienced a series of fluctuations, it is generally on the
rise. Sun (2014) used the DEA-Malmquist index to measure the operating efficiency of
ten LED manufacturing enterprises in Taiwan from 2003 to 2009, finding that the
innovation efficiency of both upstream and downstream enterprises improved
accordingly [1]. Although research on the innovation efficiency of the high-tech
industry and the manufacturing industry is rich, research on the electronic and
communication equipment manufacturing industry is still scarce. Henisz and Zelner
(2001) analyzed the impact of the institutional environment on telecom industry
investment using cross-border panel data of 147 countries from 1960 to 1994; the
results show that the stronger a country's government control, the faster the growth
rate of telecom investment [2]. Hemmert (2004) analyzed the impact of institutional
factors on the innovation capability of the high-tech industry, dividing institutional
factors into four aspects with 27 indicators in total and empirically testing the
pharmaceutical and semiconductor industries of Germany and Japan [3]. Andonova
(2006), using cross-border panel data from 1960 to 2002, found that changes in the
institutional environment have a greater impact on wired information technology than
on wireless communication technology [4].
In terms of research methods, the DEA method is used in most efficiency
measurements. Wang and Huang (2007) used the DEA method to evaluate the
efficiency of the research and development (R&D) process in various countries; the
results show that more than 50% of countries are not fully effective in R&D activities,
and more than two thirds are in the stage of increasing returns to scale [5]. Sharma and
Thomas (2008) used the DEA method to evaluate the innovation efficiency of more
than 20 countries and found that the innovation resources of developing countries can
be used reasonably [6]. Lee et al. (2009) used the DEA method to measure and compare
the performance of national R&D programs and provided policy suggestions for
governments to effectively formulate and implement them [7]. Guan and Chen (2012),
using a two-stage DEA model, found that the overall efficiency of national innovation
mainly depends on the downstream commercialization efficiency [8]. However, DEA
can only measure the static change of DMU efficiency, whereas the Malmquist index
can measure dynamic change. Therefore, this paper selects the Global-Malmquist (GM)
index model to analyze the EECE of 20 provinces in China. The innovations are as
follows: first, the research subject is the electronic and communication equipment
manufacturing industry, on which there is little research at home and abroad, so the
subject is novel; second, based on the common frontier and regional frontiers, this paper
uses the DDF-GM model to evaluate the EECE of the three regions and, through a series
of spatial analyses, puts forward effective policy measures to improve EECE.

2 Methodology and Data


2.1 DDF Model
The production possibility set of a DMU is given by formula 1:

Q^t(a^t) = { b^t : a^t can produce b^t }    (1)

The DDF model can be defined as formula 2:

D⃗_0(a, b; g_b) = max{ β : (b + β·g_b) ∈ Q(a) }    (2)
According to O’Donnell et al. (2008) [9], the DDF models under the common frontier
and the regional frontier can be defined as formulas 3 and 4, respectively:

D⃗^G(a^t, b^t; b^t) = max β^G
s.t.  Σ_{t=1}^{T} Σ_{k=1}^{K_G} α_k^t · a_{kp}^t ≤ a_p^t,  p = 1, …, P;
      Σ_{t=1}^{T} Σ_{k=1}^{K_G} α_k^t · b_{kq}^t ≥ (1 + β^G)·b_q^t,  q = 1, …, Q;
      α_k^t ≥ 0,  k = 1, …, K_G;  t = 1, …, T    (3)

D⃗^R(a^t, b^t; b^t) = max β^R
s.t.  Σ_{t=1}^{T} Σ_{k=1}^{K_R} μ_k^t · a_{kp}^t ≤ a_p^t,  p = 1, …, P;
      Σ_{t=1}^{T} Σ_{k=1}^{K_R} μ_k^t · b_{kq}^t ≥ (1 + β^R)·b_q^t,  q = 1, …, Q;
      μ_k^t ≥ 0,  k = 1, …, K_R;  t = 1, …, T    (4)

2.2 Global-Malmquist Index


The Malmquist index calculation method includes the global frontier constructed from
all periods, which can be defined as formula 5:

Q^G(a) = Q^1(a^1) ∪ Q^2(a^2) ∪ … ∪ Q^T(a^T)    (5)

The Global-Malmquist index can be defined as formula 6:

GM^t_{t−1} = Score^G(a^t, b^t) / Score^G(a^{t−1}, b^{t−1})    (6)

According to Wang et al. (2013) [10], the GM index can be expressed as formula 7:

GM^t_{t−1} = [ ((1 − D^{t−1}(a^t, b^t; b^t)) / (1 − D^{t−1}(a^{t−1}, b^{t−1}; b^{t−1})))
             × ((1 − D^t(a^t, b^t; b^t)) / (1 − D^t(a^{t−1}, b^{t−1}; b^{t−1}))) ]^{1/2}
           = ((1 − D^t(a^t, b^t; b^t)) / (1 − D^{t−1}(a^{t−1}, b^{t−1}; b^{t−1})))
             × [ ((1 − D^{t−1}(a^t, b^t; b^t)) / (1 − D^t(a^t, b^t; b^t)))
             × ((1 − D^{t−1}(a^{t−1}, b^{t−1}; b^{t−1})) / (1 − D^t(a^{t−1}, b^{t−1}; b^{t−1}))) ]^{1/2}
           = TC^t_{t−1} × EC^t_{t−1}    (7)
where TC represents technological change and EC represents efficiency change. The GM index is recorded as MI, and EECE is represented by the value of MI. Therefore, EECE can be expressed under the common frontier as formula 8, and its regional-frontier analogue (formula 9) is obtained in the same way:
GMI_{t−1}^t = √{ [(1 − D_g^{t−1}(a^t, b^t; b^t)) / (1 − D_g^{t−1}(a^{t−1}, b^{t−1}; b^{t−1}))] × [(1 − D_g^t(a^t, b^t; b^t)) / (1 − D_g^t(a^{t−1}, b^{t−1}; b^{t−1}))] }    (8)
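Given the four DDF values that formula (7) requires, the index and its decomposition reduce to a few ratios. The sketch below assumes the geometric-mean decomposition written above (Score = 1 − D throughout); the function and argument names are illustrative, and each D value can be obtained from a solver such as the ddf_score sketch in Sect. 2.1.

    import math

    def gm_decomposition(d_prev_old, d_prev_new, d_curr_old, d_curr_new):
        # d_<s>_<r> = D^s(a^r, b^r; b^r): DDF of the period-r observation
        # against the period-s frontier ('prev' = t-1, 'curr' = t).
        ec = (1 - d_curr_new) / (1 - d_prev_old)           # efficiency change
        tc = math.sqrt(((1 - d_prev_new) / (1 - d_curr_new)) *
                       ((1 - d_prev_old) / (1 - d_curr_old)))  # technical change
        return ec * tc, ec, tc                             # GM = EC * TC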

2.3 Data and Sample

Referring to the existing literature, this paper constructs the input-output index system
as follows.
This paper selects internal R&D expenditure, the full-time equivalent of R&D personnel, and new product development expenditure of the high-tech industry as input variables. They represent human investment, science and technology financial investment, and production-stage investment respectively. New product sales revenue and patent applications are the output variables, representing economic benefit output and R&D output. All indicators are taken from the China Statistics Yearbook on High Technology Industry.
According to the availability of existing data, 20 provinces in China are selected
as research samples, which are Guangdong, Guangxi, Guizhou, Hebei, Anhui, Beijing,
Fujian, Gansu, Henan, Hubei, Hunan, Jiangsu, Sichuan, Tianjin, Zhejiang, Chongqing,
Jiangxi, Shandong, Shaanxi and Shanghai. They are divided into three regions: the east-
ern region (Hebei, Shandong, Jiangsu, Beijing, Fujian, Guangdong, Zhejiang, Shanghai,
Tianjin), the central region (Hunan, Jiangxi, Anhui, Henan, Hubei), and the western
region (Shaanxi, Sichuan, Chongqing, Gansu, Guangxi, Guizhou).

3 Empirical Research

3.1 Temporal Analysis of EECE

As can be seen from Fig. 1, under the common frontier, the average EECE in the past
ten years has increased by 6.47%. In 2012, the value of EECE (1.2097) was the highest; a value greater than 1 indicates positive growth. In 2014, the value of EECE (0.9520) was the lowest; a value less than 1 indicates negative growth. Most years show positive growth; only 2014 (0.9520) and 2015 (0.9692) fall below 1. Since the “Western Development” strategy in 2016,
technical support and preferential policies have narrowed the gap between the western
region and advanced regions. In terms of different regions, the values of EECE of 2010
(0.9237), 2013 (0.9461) and 2015 (0.8750) in the eastern region were less than 1, only
2014 (0.8843) in the central region was less than 1, and 2011 (0.9497), 2013 (0.9932)
and 2014 (0.8032) in the western region were less than 1. In terms of decomposition
efficiency, the contribution of TC at the national level is greater than that of EC. In terms
of regions, the growth of EECE in eastern region and western region is mainly supported
by TC. It shows that the eastern and western regions attach importance to the cultivation
of independent innovation ability. However, the growth of EECE in the central region
depends on the support of EC, indicating that the central region pays more attention to management improvement.

Fig. 1. China’s EECE and its decomposition index under meta frontier.

As can be seen from Fig. 2, under the regional frontier, the average EECE in the
past ten years has increased by 3.66%. In 2016, the value of EECE (1.1952) was the
highest, indicating positive growth (greater than 1), while in 2014, the value of EECE (0.9442) was the lowest, indicating negative growth. Most years show positive growth; only
2010 (0.9822) and 2017 (0.9992) are less than 1. In terms of different regions, most of
the years in the eastern region are negative growth, but the overall average growth rate
in this decade is 1.94%, only in 2011 (1.0192), 2012 (1.1791), 2014 (1.0826) and 2016
(1.1895) the EECE is greater than 1. In the central region, only 2014 (0.8400) was less
than 1, while in the western region, 2010 (0.9660) and 2014 (0.8235) were less than 1.
From the perspective of decomposition efficiency, the contribution of TC is also greater
than that of EC at the national level. In terms of regions, the growth of EECE in the three
regions is mainly supported by TC.
Generally speaking, under both the meta-frontier and the regional frontier, the contribution of technology is greater than that of efficiency. Under the meta-frontier, EC and TC were 1.0290 and 1.0445 respectively, and under the regional frontier they were 1.0073 and 1.0408.

Fig. 2. China’s EECE and its decomposition index under regional frontier.

3.2 Spatial Analysis of EECE

As shown in Fig. 3, under the common frontier, the EECE is highest in Hubei (1.1467) and Jiangxi (1.1668). Both provinces are located in the central region, which makes
the overall EECE of the central region higher than the other two regions. The values
of EECE of Gansu, Shaanxi, Beijing, Zhejiang, Fujian, Guangdong and Guizhou are
relatively low. EECE in all three regions is growing positively. Innovation in the eastern and western regions is technology-oriented, while in the central region it is efficiency-oriented. The EECE and its decomposition indexes in the eastern region were
1.0386, 1.0065 and 1.0360 respectively, those in central region were 1.1212, 1.0787
and 1.0497 respectively, and those in western region were 1.0569, 1.0214 and 1.0529
respectively. The EECE and its decomposition index of the three regions are all greater
than 1, showing a positive growth, indicating that the development of EECE in China is
good at present and has great potential in the future.

Fig. 3. Average EECE in different province and regions under the meta-frontier.

As shown in Fig. 4, under the regional frontier, the EECE is also highest in Hubei (1.1609) and Jiangxi (1.1408). Both provinces are located in the central region, which
makes the overall EECE of the central region higher than the other two regions. The
values of EECE in Sichuan, Guangxi and Shanghai are relatively low. Innovation and progress in all three regions are technology-oriented. The EECE and its decomposition
indexes in the eastern region were 1.0194, 1.0106 and 1.0173, respectively, those in the
central region were 1.0816, 1.0119 and 1.0772 respectively, and those in the western
region were 1.0248, 0.9985 and 1.0458 respectively. The EC of the western region is
less than 1, and the TC of the western region is less than that of the central region, which
leads to the lower overall EECE of the western region.

Fig. 4. Average EECE in different province and regions under the regional frontier.

Overall, the growth of EECE is positive. Whether under the common frontier
or regional frontier, Hubei and Jiangxi have the highest EECE, and the EECE in the
central region is higher than that in the eastern and western regions. The central region is
located in the middle of China. Its geographical position is superior. It can not only rely
on the advanced technical level of the eastern region, but also rely on the rich human
and geographical resources of the western region. Therefore, its development has more
potential. It is necessary to strengthen regional cooperation, attach importance to the construction of R&D teams, and deepen cooperation in R&D, marketing, and joint production.

4 Conclusions and Policy Suggestions

Based on the above analysis, we conclude that: (1) From 2009 to 2018, the EECE showed
positive growth in most years, and the overall growth is positive. The EECE growth rate
was 6.47% in the common frontier, and 3.66% in the regional frontier. (2) The EECE
in 2009–2018 was the highest in central region, followed by the western region and
lowest in the eastern region. Under the two frontiers, the EECE in the eastern region
increased by 3.86% and 1.94% respectively, that in the central region by 12.12% and 8.16%, and that in the western region by 5.69% and 2.48%. (3) The decomposition of EECE is driven by efficiency and
technology, and the contribution of technology is greater than that of efficiency.
In order to improve the overall innovation efficiency, we need to take some measures
to adjust. First, we should strengthen the government’s financial, human and material
investment in innovation and R&D, specify incentive policies, and increase investment
in innovation resources. We should also attach importance to technology introduction
and absorption. At the same time, we should promote the transformation of science and
technology into productivity. Second, it is necessary to strengthen technical cooperation
and exchanges among regions in the electronic and communication equipment manu-
facturing industry. Technological innovation and development have a synergistic effect,
especially the dissemination of advanced technology and ideas, which can accelerate
the division of labor and cooperation among regions and improve the competitiveness
of regions. The third is to enrich the innovation efficiency evaluation system of the
electronic and communication equipment manufacturing industry. Innovation activity is a dynamic development process, and the utilization degree and output of innovation
resources need to be evaluated and monitored in time. Only in this way can govern-
ment departments make corresponding input-output decisions and effectively improve
the efficiency of resource allocation and utilization.

Acknowledgments. We thank the funding sponsored by the General Project of Teaching Reform
of Higher Vocational Education in Heilongjiang (SJGZY2019172) and Key Entrusted Project of
Teaching Reform of Higher Vocational Education in Heilongjiang (SJGZZ2019023).

References
1. Sun, C.C.: Assessing the relative efficiency and productivity growth of the Taiwan LED
industry: DEA and Malmquist index application. Math. Probl. Eng. 4, 1–13 (2014)
2. Henisz, W.J., Zelner, B.A.: The institutional environment for telecommunications investment.
J. Econ. Manag. Strategy 10(1), 123–147 (2001)
3. Hemmert, M.: The influence of institutional factors on the technology acquisition performance
of high-tech firms: survey results from Germany and Japan. Res. Policy 33, 1019–1039 (2004)
4. Andonova, V.: Mobile phones, the internet and the institutional environment. Telecommuni-
cations Policy 30, 29–45 (2006)
5. Wang, E.C., Huang, W.: Relative efficiency of R&D activities: a cross-country study
accounting for environmental factors in the DEA approach. Res. Policy 36(2), 260–273 (2007)
6. Sharma, S., Thomas, V.J.: Inter-country R&D efficiency analysis: an application of data
envelopment analysis. Scientometrics 76(3), 483–501 (2008)
7. Lee, H., Park, Y., Choi, H.: Comparative evaluation of performance of national R&D programs
with heterogeneous objectives: a DEA approach. Eur. J. Oper. Res. 196(3), 847–855 (2009)
8. Guan, J., Chen, K.: Modeling the relative efficiency of national innovation systems. Res.
Policy 41(1), 102–115 (2012)
9. O’Donnell, C.J., Rao, D.S., Battese, G.E.: Metafrontier frameworks for the study of firm-level
efficiencies and technology ratios. Empirical Econ. 34, 231–255 (2008)
10. Wang, Q.W., Zhao, Z.Y., Zhou, P., Zhou, D.Q.: Energy efficiency and production technology
heterogeneity in China: a meta-frontier DEA approach. Econ. Model. 35, 283–289 (2013)
Internet Financial Regulation Based
on Evolutionary Game

Shu-bo Jiang(B) and Yao-yao Huang

Harbin University of Commerce, Harbin 150028, China

Abstract. With the rapid development of Internet technology, the integration of


the Internet and the financial industry has gradually deepened. Internet finance has
developed for many years in China, but there are deficiencies in the supervision of
Internet finance. This paper builds an evolutionary game model of Internet financial enterprises, regulators, and financial consumers, and studies the relationships among them using replicator dynamic equations. The research shows that regulators, consumers, and enterprises reach a steady state after a series of evolutionary adjustments. On the basis of the
theoretical model, this paper tries to make some suggestions to regulators.

Keywords: Internet finance · Evolutionary game · Financial regulation

1 Introduction
With the development of Internet technology in China, the integration of the Internet
and traditional industries is accelerating. At present, Internet finance has been a new
financial operation mode formed by the deep integration of the Internet and finance. It
has become an important part of China’s financial system [1]. By the end of 2020, the
market size of Internet financial management has reached 25.86 trillion yuan, and it has
generated 993.25 billion yuan of profits for investors. In 2020, the Internet index of the
financial industry was about 30.2, much higher than that of other industries. At present, the development of Internet finance shows diversified competition, and the transaction scale continues to expand. Due to the rapid development of Internet finance, the corresponding regulatory system is not yet comprehensive.
There are many problems, such as fuzzy supervisory responsibility attribution, lack of
market access criteria, backward regulatory means, and lagging regulatory laws. The
scale of Internet finance continues to expand. Since 2016, the People’s Bank of China,
China Banking Regulatory Commission, China Securities Regulatory Commission, and
China Insurance Regulatory Commission have decided to jointly regulate the Internet
finance industry and issued a series of regulations. However, these regulations are less
effective because they do not form a complete regulatory system. Based on the current
situation of China’s Internet financial regulatory agency, this paper uses the dynamic
evolutionary game model to analyze the evolution of Internet financial regulation and
tries to provide practical suggestions for the formulation and improvement of relevant
policies.


2 Theoretical Analysis of Internet Financial Supervision


In the internet finance industry, the relationship between supervision and regulated
objects is similar to supply and demand. The demand side is the regulatory object of
Internet finance, including Internet finance companies and investors. In terms of reg-
ulatory requirements, there are obvious differences between the two sides. Enterprises
are committed to obtaining high profits and long-term healthy development, which puts
forward a relatively macro-level requirement to the regulatory side. Supervisory authorities strive to create a sound market environment conducive to companies' survival and long-term development [2]. Because of the asymmetry of information
and the relatively incomplete professional knowledge, the investors hope that the regu-
latory authorities will issue relevant policies to reduce information asymmetry. It could
avoid the related moral hazard and adverse selection and protect the interests of investors.
Based on the demand-side perspective, Internet financial supervision is necessary. When
the social benefit of supervision is greater than the cost of supervision, the government
will maintain supervision as the supplier of supervision. Like the traditional economic
market, Internet finance also has an imbalance between regulatory supply and demand.
Inadequate government supervision will lead to insufficient social supply. Strengthening
government’s supervision will bring huge social benefits, so the government wants to
carry out regulatory reforms and institutional innovations. If the government controls
market activities too much, it will lead to a decline in market vitality. At this time,
moderate deregulation will release the vitality of the industry.
Appropriate supervision will create certain economic benefits. Like other industries,
there is also a contradiction between fairness and efficiency in the Internet financial
industry. The government is prudently maintaining the order of the industry and creat-
ing new industry regulations, and it also needs to give enterprises some freedom within
a certain range. The core of the supervision work of government agencies should be
placed before and during the event, which can curb misconduct from the source. Reg-
ulators can raise the barriers to entry for the industry and check and guide companies
trying to enter the industry. It can enable enterprises to have basic legal literacy and
ethics. The regulators should give appropriate freedom to specific management activ-
ities within enterprises to actively carry out business innovation activities. Different
companies always have different operating characteristics, even if they belong to one
industry. Therefore, the supervision work of the supervisory authority must uphold the
principle of differentiation.

3 Evolutionary Game Model of Internet Financial Regulation


3.1 The Basic Assumptions
The basic assumptions of the game model are as follows (1) The subject of the game
has bounded rationality; (2) The game between the subjects is a process of learning and strategy adjustment, which is long-term and continuous; (3)
Information asymmetry between the two sides of the game; (4) Regulators have two
optional strategies: regulating or not regulating the Internet financial industry, which
depending on the cost of regulation and the loss of society [3]. In the case of regulation,
the players can adopt strategies according to the form of regulation by the regulatory
authorities. If Internet finance companies strictly abide by the regulatory rules, they will
lose the extra benefits of law violations. Conversely, if companies do not comply with
regulations, they may be subject to high penalties while obtaining high benefits.

3.2 Analysis of Game-Agent


In economics, market subjects are mainly composed of government, enterprises, and
consumers. Therefore, we divide the subjects into internet finance regulatory agencies
(government), internet finance enterprises, and consumers. The first player is the internet finance regulator. Regulators are the maintainers of the internet financial market's
order. The core of their work is to build a well-ordered internet financial market and
protect the rights of consumers. They protect consumers from fraud and other unfair
treatment using laws and regulations. At the same time, it avoids systemic financial risks
and big changes in the internet finance industry. The second player is internet finan-
cial enterprises. Internet financial enterprises mainly refer to those enterprises that use
internet technology to carry out financial activities [4], such as third-party payment,
Yu’ebao financial management, P2P network lending platform, etc. This paper mainly
analyzes the game of Internet financial products. The third player is consumers. Internet
consumers mainly refer to natural persons and legal persons who use internet platforms
to conduct financial transactions or consume [5]. This paper mainly uses natural persons
and legal persons who invest on internet wealth management platforms as examples to
conduct game analysis.

3.3 Evolutional Game Model Construction


The Dynamic Evolutionary Game Between Enterprises and Regulators. Game
model construction. To build the dynamic evolutionary game model, the strategy choice
of financial regulatory institutions is to regulate or not to regulate, and the strategic
choice of Internet financial enterprises is to comply with the law or not. We can get four
different game combinations, as shown in Table 1.

Table 1. The game between regulators and companies

Game subject         Financial firms: Illegal     Financial firms: Legal
Regulatory body
  Regulation         Illegal, Regulation          Legal, Regulation
  No regulation      Illegal, No regulation       Legal, No regulation

It is assumed that the Internet financial management platform can obtain normal
returns (α) through legal operation and excess returns (β) through normal financial
innovation. Additional income (θ ) could be obtained through illegal means. If the reg-
ulatory agency regulates, illegal enterprises will pay fines (M). As far as regulators
are concerned, the cost of supervision of financial enterprises is C, the overall benefit of
society is H, and the negative impact and externality caused by illegal behavior in the Internet financial market is u. Based on the above assumptions, this paper presents
the payoff matrix of both sides in the game, as shown in Table 2.

Table 2. The revenue game between regulators and enterprises

Game subject         Financial firms: Illegal     Financial firms: Legal
Regulatory body
  Regulation         H − C + M, α + θ − M         H − C, α + β
  No regulation      −u, α + θ                    0, α + β

The probability of supervision by regulators is x, and the probability of illegal operation of financial enterprises is y. We can obtain the expected returns of the regulatory agency under different decisions (E_1a and E_1b) and the average return (E_1):

E_1a = y(H − C + M) + (1 − y)(H − C) = My + H − C
E_1b = −uy + 0 × (1 − y) = −uy
E_1 = xE_1a + (1 − x)E_1b = xy(M + u) − uy + x(H − C) [6].

The above equation shows that no matter what decisions the regulatory authority
makes, its expected revenue can be expressed as a function of x, so we can construct the evolutionary equation of the regulatory agency's behavior:

F(x) = dx/dt = x(E_1a − E_1) = x(1 − x)(My + uy + H − C)    (1)

Similarly, the expected returns (E_2j and E_2g) and the mean value (E_2) of financial enterprises under different decisions can be obtained:

E_2j = x(α + θ − M) + (1 − x)(α + θ)
E_2g = x(α + β) + (1 − x)(α + β) = α + β
E_2 = yE_2j + (1 − y)E_2g = −xyM + y(θ − β) + α + β

The evolutionary equation of financial enterprises can be obtained as

F(y) = dy/dt = y(E_2j − E_2) = y(y − 1)(xM + β − θ)    (2)
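Before the formal stability analysis, a short numerical sketch makes the joint dynamics of Eqs. (1) and (2) concrete. All parameter values below are illustrative assumptions, not values from the paper; depending on them, the trajectory may settle on a boundary strategy or cycle around the interior thresholds identified in the following analysis.

    # Euler simulation of the replicator equations (1)-(2); all parameter
    # values are assumed for illustration only.
    M, u, H, C = 5.0, 2.0, 3.0, 4.0    # fine, externality, social benefit, cost
    theta, beta = 4.0, 1.0             # illegal extra income, innovation income
    x, y, dt = 0.5, 0.5, 0.01          # P(regulate), P(illegal), step size

    for _ in range(20000):
        dx = x * (1 - x) * (M * y + u * y + H - C)   # regulator, Eq. (1)
        dy = y * (y - 1) * (x * M + beta - theta)    # enterprise, Eq. (2)
        x = min(max(x + dt * dx, 0.0), 1.0)
        y = min(max(y + dt * dy, 0.0), 1.0)

    print(f"state after simulation: x = {x:.3f}, y = {y:.3f}")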

Analysis of the Replication Dynamic Equation of the Supervision Subject. According to Eq. (1), when dx/dt = 0, market supervision reaches a stable state, and the steady-state equilibrium points are (0,0), (0,1), (1,0) and (1,1). A stable state resists interference when dF(x)/dx < 0. When y = (C − H)/(M + u), F(x) = 0 holds for any x, and all equilibrium points are stable. If y ≠ (C − H)/(M + u), setting F(x) = 0 gives x = 0 and x = 1 as the two equilibrium points. Whether the equation evolves to a stable-strategy equilibrium depends heavily on (C − H)/(M + u) and on the relationship between y and (C − H)/(M + u) [7]. We can get the following results:

When (C − H)/(M + u) > 1, y < (C − H)/(M + u) always holds, so dF(x)/dx|x=0 = My + uy + H − C < 0 and dF(x)/dx|x=1 = −(My + uy + H − C) > 0; therefore x = 0 is a stable evolution strategy. When 0 < (C − H)/(M + u) < 1, we should compare y with (C − H)/(M + u): when 0 < y < (C − H)/(M + u), dF(x)/dx|x=0 < 0 and dF(x)/dx|x=1 > 0, so x = 0 is a stable evolution strategy; when 1 > y > (C − H)/(M + u), dF(x)/dx|x=0 > 0 and dF(x)/dx|x=1 < 0, so x = 1 is a stable evolution strategy. When (C − H)/(M + u) < 0, y > (C − H)/(M + u) always holds, so dF(x)/dx|x=0 > 0 and dF(x)/dx|x=1 < 0, and x = 1 is a stable evolution strategy.

Analysis of the Replication Dynamic Equation of Financial Firms. According to Eq. (2), when dy/dt = 0, the strategy choice of financial enterprises reaches a stable state, with four equilibrium points (0,0), (0,1), (1,0) and (1,1). A stable state resists interference when dF(y)/dy < 0. When x = (θ − β)/M, F(y) = 0 holds for any y, and all equilibrium points are stable. If x ≠ (θ − β)/M, setting F(y) = 0 gives y = 0 and y = 1 as the two equilibrium points of the equation. Whether the equation evolves to a stable-strategy equilibrium depends greatly on the magnitude of (θ − β)/M and on the relationship between x and (θ − β)/M. When (θ − β)/M > 1, x < (θ − β)/M always holds, and we can deduce: dF(y)/dy|y=0 = −Mx + θ − β > 0 and dF(y)/dy|y=1 = −(−Mx + θ − β) < 0; therefore y = 1 is a stable evolutionary strategy. When 0 < (θ − β)/M < 1, we should analyze the relationship between x and (θ − β)/M: when x < (θ − β)/M < 1, dF(y)/dy|y=0 > 0 and dF(y)/dy|y=1 < 0, so y = 1 is a stable evolutionary strategy; when 0 < (θ − β)/M < x, dF(y)/dy|y=0 < 0 and dF(y)/dy|y=1 > 0, so y = 0 is a stable evolutionary strategy. If (θ − β)/M < 0, then x > (θ − β)/M always holds, so dF(y)/dy|y=0 < 0 and dF(y)/dy|y=1 > 0, and y = 0 is a stable evolutionary strategy.

Dynamic Evolutionary Game Between Enterprises and Financial Consumers.


Game model construction. There is a game between financial enterprises and consumers,
and the players adjust their behaviors according to changes in each other's strategies. There is a certain information asymmetry within the internet finance industry. Financial companies can choose to tell consumers the truth or to cheat them. Consumers, as the subjects of project investment and consumption, have the right to choose whether to buy or not, thus
forming the following game matrix (Table 3).

Assume that a financial enterprise's revenue is a function of its degree of truthful disclosure (D), its degree of deception (F), and its own ability (H), while its cost is a function of the degree of truthful disclosure (D), the degree of deception (F), and the degree of consumer supervision (K); its profit function can thus be expressed as G(D, F, H) − Z(D, F, K). If the financial enterprise truthfully discloses the information, consumer choice can temporarily be ignored, and the return function is G(D) − Z(D). If the financial enterprise deceives the consumer, the

Table 3. The regulatory game between consumers and enterprises

Game subject         Consumer: Supervision           Consumer: No supervision
Financial firms
  Representation     Representation, Supervision     Representation, No supervision
  Deceive            Deceive, Supervision            Deceive, No supervision

consumer will supervise it, and the income function is G(F) − Z(F) − PF², where P is the punishment intensity for fraud. If the financial enterprise cheats the consumer and the consumer does not supervise, the income function is G(F) − Z(F).
Let’s start to analyse the state of the consumer. If consumers give up supervision,
whether Internet financial enterprises choose to tell real information or not, their return
fuction is G(ω) (ω is the depth of consumer supervision). If the consumer gives up
supervision and the firm tells the truth, the consumer still gets G(ω). If consumers
choose regulation and financial firms commit fraud, the consumer income is G(ω) −
Z(K) + PF 2 . If the consumer chooses to supervise and the financial enterprise tell the
true information of the project, the consumer benefit is G(ω) − Z(K). We can get the
following matrix (Table 4):

Table 4. The revenue game between consumers and enterprises

Game subject         Consumer: Supervision                      Consumer: No supervision
Financial firms
  Representation     G(D) − Z(D), G(ω) − Z(K)                   G(D) − Z(D), G(ω)
  Deceive            G(F) − Z(F) − PF², G(ω) − Z(K) + PF²       G(F) − Z(F), G(ω)

The probability that financial enterprises truthfully disclose the true information of the project is x, and the probability that consumers choose to supervise is y. Thus, the earnings of financial enterprises under different strategies (E_1x and E_1^(1−x)) and their mean value AE_1 are:

E_1x = y(G(D) − Z(D)) + (1 − y)(G(D) − Z(D)) = G(D) − Z(D)
E_1^(1−x) = y(G(F) − Z(F) − PF²) + (1 − y)(G(F) − Z(F)) = G(F) − Z(F) − yPF²
AE_1 = x(G(D) − Z(D)) + (1 − x)(G(F) − Z(F) − yPF²)

We could derive the revenue functions corresponding to whether consumers make supervisory choices (E_2x and E_2^(1−x)), and the mean value (AE_2):

E_2x = x(G(ω) − Z(K)) + (1 − x)(G(ω) − Z(K) + PF²) = G(ω) − Z(K) + PF² − xPF²



E_2^(1−x) = xG(ω) + (1 − x)G(ω) = G(ω)
AE_2 = yE_2x + (1 − y)E_2^(1−x) = y(G(ω) − Z(K) − xPF²) + PF²

Therefore, the replication dynamic equations of Internet financial enterprises and consumers are as follows:

F(x) = dx/dt = x(E_1x − AE_1) = x(1 − x)(G(D) − Z(D) − G(F) + Z(F) + yPF²)    (3)

F(y) = dy/dt = y(E_2x − AE_2) = y(1 − y)(G(ω) − Z(K) − xPF²)    (4)
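The same Euler scheme sketched for Eqs. (1)-(2) applies to Eqs. (3)-(4). In the sketch below, the payoff terms are collapsed into assumed constants (hypothetical values, not from the paper); the thresholds y = [Z(D) − G(D) + G(F) − Z(F)]/PF² and x = [G(ω) − Z(K)]/PF² of the following analysis then govern the dynamics.

    # Euler sketch of equations (3)-(4); payoff terms are assumed constants.
    GD_ZD = 3.0     # G(D) - Z(D): net gain of truthful disclosure
    GF_ZF = 4.0     # G(F) - Z(F): net gain of deception before the penalty
    GW_ZK = 0.5     # G(w) - Z(K): consumer's net term from supervising
    PF2 = 2.0       # P * F^2: penalty term
    x, y, dt = 0.5, 0.5, 0.01   # P(truthful), P(supervise), step size

    for _ in range(20000):
        dx = x * (1 - x) * (GD_ZD - GF_ZF + y * PF2)   # firm, Eq. (3)
        dy = y * (1 - y) * (GW_ZK - x * PF2)           # consumer, Eq. (4)
        x = min(max(x + dt * dx, 0.0), 1.0)
        y = min(max(y + dt * dy, 0.0), 1.0)

    print(f"final state: P(truthful) = {x:.3f}, P(supervise) = {y:.3f}")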

Analysis of Evolutionary Game. If y = [Z(D) − G(D) + G(F) − Z(F)]/PF², then F(x) = 0 and all four equilibrium points of the equation can reach a stable state. If y ≠ [Z(D) − G(D) + G(F) − Z(F)]/PF², then x = 0, 1 are the possible steady-state points. The stable strategy is determined from the two Eqs. (3) and (4). In the case of an unsupervised consumer, if G(F) − Z(F) < G(D) − Z(D), then y > [Z(D) − G(D) + G(F) − Z(F)]/PF² always holds, so x = 1 is a stable evolution strategy: financial enterprises will truthfully disclose the real information of the project. Otherwise, they will choose to conceal key information. If x = [G(ω) − Z(K)]/PF², that is F(y) = 0, then all values of y are in a stable state. If x ≠ [G(ω) − Z(K)]/PF², then y = 0, 1 may reach a stable state. When x > [G(ω) − Z(K)]/PF², the expected benefit of supervision cannot cover its cost, so y = 0 is a stable evolutionary strategy.

Analysis of Influencing Factors of Internet Financial Regulation Game

– Factors influencing the strategic choice of regulatory agencies. The first factor is
the regulatory willingness of regulators. The probability of regulation is positively correlated with the negative social effects that tolerating misconduct would bring to the industry. Regulation can speed up the prosperity of the Internet financial industry and reduce the negative impact of enterprises' illegal activities. If the supervisory authority's willingness to supervise decreases, the systemic risks in the industry will increase. The second factor is the comprehensive social benefits brought by
regulation. The greater the effect of the regulatory authority after disposing of illegal
enterprises, the greater the possibility of the regulatory agency to carry out regulatory
work.
– Factors Influencing Internet Financial Enterprises and Consumers’ Choice
Strategies

The first factor is illegal earnings. If, other factors held constant, the benefits brought by illegal (cheating) behavior exceed the costs, companies will still consider violating the law the best choice, even if regulatory authorities and consumers conduct strict supervision.
The second factor is the cost of deception. When companies choose whether to cheat
or not, they must consider the cost of the cheating. A company’s willingness to break
the law is inversely proportional to its costs. A single consumer’s ability to participate
in the game of the Internet finance industry is very limited, and the supervision from
consumers must rely on the entire investment group. Therefore, consumers should pay
more attention to their screening ability and cooperate with government departments to
conduct effective public supervision.

4 Suggestions About Improving Supervision of Internet Finance


Firstly, internet finance companies must do a good job in industry self-discipline and
information disclosure. For the Internet financial industry to achieve vigorous long-term
development, it is not enough to rely solely on the supervision of external institutions,
and its internal self-control is more important. The internet financial industry must form
its industry associations. While ensuring industry development and fair competition, the
associations must issue industry regulations that have a certain influence and are widely
binding. Enterprises must consciously abide by the regulations. At the same time, the
industry must implement an information disclosure system to reduce disputes arising
from information asymmetry. Industry associations should issue relevant information
disclosure regulations and information review standards to ensure the immediacy and
authenticity of information disclosure.
Secondly, regulatory authorities should improve regulation. As the maintainers of
the Internet finance system, the regulators should simplify and refine the regulatory
process, improve the regulatory indicators, and pay attention to the risk prevention and
decision-making level [8]. Regulators should raise the entry threshold of the Internet financial industry and include shareholders, businesses, and institutions in the scope of entry review. Besides strictly insisting on prudence, they should take technological facilities and entrepreneur-
ship guarantee measures as important conditions. The Internet financial industry is differ-
ent from other traditional industries. By taking different regulatory measures for different
enterprises and setting a reasonable legal red line, the regulators could resolve the possi-
ble risks in time [9]. Internet finance companies must establish a risk control system in
the process of operations. The business development is also procedurally managed, and
risks that may exist in the business should be guarded against. After the risk emerges,
the enterprise should report the risk to the supervisory department timely and properly
handle the risk crisis.
Last but not least, consumers should improve the consumption and supervision
awareness of Internet finance. Consumers should consider the current background of
the times and their consumption levels, choosing products reasonably. They also need to
update consumption concepts and investment models promptly. And consumers should
cultivate basic financial literacy and financial management concepts that conform to
the characteristics of the current era. Consumers should use new media platforms and public opinion to promote active and healthy Internet financial products, and expose poor financial products and institutions in a timely manner to ensure the long-term sound development of the industry.

5 Conclusion
Based on evolutionary game theory, this paper constructs a dynamic evolutionary game
model among regulators, consumers, and internet financial companies. As far as business
and regulatory authorities are concerned, the initial game state has an influence on the
stable equilibrium point of repeated games in the process of asymmetric replicative
dynamic evolutionary games. According to the replicator equations of regulators and financial enterprises, when violations of the law can bring high profits, enterprises will break the law to obtain additional profits; otherwise, financial enterprises will comply with the legal provisions [10]. Similarly, if the market yields great social benefits even without supervision, regulators will give up supervising the market; on the contrary, they will carry out a series of supervision activities. As far as financial firms are concerned, the tendency to tell the true information of the project increases as the loss caused by fraud grows. The more serious
the phenomenon of corporate fraud and concealment, the higher the willingness of
consumers to supervise. Both enterprises’ and consumers’ decisions need a long-term
evolutionary game to reach a stable state.

References
1. Li, Z.: Thoughts on internet finance. Manag. World 4(7), 1–7 (2015)
2. Cao, D., Cao W., Wu, J.: A game study on financial innovation and regulation in the internet
era. J. Southeast Univ. (Philos. Soc. Sci. Ed.) 16(4), 59–64+135 (2014)
3. Smith, J.: Evolution and Game Theory. Shanghai Fudan University Press (2008)
4. Liu, Y., Xu, C., Yu, P.: Internet finance: origin, risk, and regulation. Soc. Sci. Res. 3, 28–33
(2014)
5. Huang, Y., Xu, H.: On protection of rights and interests of financial consumers in P2P online
lending. Hebei Law 34(9), 16–27 (2016)
6. Zhang, C., Liu, J.: Dynamic evolutionary game between internet financial innovation and
financial regulation. Guizhou Soc. Sci. 1, 151–159 (2020)
7. Wang, T., Qin, J.: Study on optimization of local financial regulatory framework in China: an
analysis based on dynamic evolutionary game model. Shanghai Econ. Res. 4, 14–22 (2016)
8. Song, Y., Xu, Y., Zhang, Z.: Soc. Sci. Res. 4(4), 25–31 (2018)
9. Su, Y., Rui, Z.: Development of internet finance and the emergence of government regulatory
system: based on the perspective of evolutionary game. Finance Account. Bull. 11, 19–22+129
(2015)
10. Liu, H.: On the dilemma of government regulation of internet finance and the way to solve
it. Legal Bus. Stud. 35(5), 58–69 (2018)
Research on Task Scheduling Method of Mobile
Delivery Cloud Computing Based on HPSO
Algorithm

Jianjun Li1,2,3 , Junjun Liu1,2,3 , Yu Yang1,2,3(B) , and Fangyuan Su1,2,3


1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Cultural Big Data Theory Application Research Center, Harbin 150028, China
3 Heilongjiang Key Laboratory of E-Commerce and Information Processing,

Harbin 150028, China

Abstract. To study the cloud computing task scheduling problem, and in view of the many route-choice questions arising in logistics transportation, this paper considers how to allocate tasks under time constraints, road clearance restrictions, road comfort requirements, and driving cost constraints, taking the shortest path as the optimization goal. The particle swarm optimization (PSO) algorithm is extended, and a cloud task scheduling method based on hybrid particle swarm optimization is proposed. Experimental results show that hybrid particle swarm optimization performs well in cloud computing task scheduling: it can shorten the distance to a certain extent, improve efficiency, enhance the applicability and scalability of the algorithm, avoid being trapped in local optima, and achieve a globally optimal effect.

Keywords: Cloud computing · Mixed particle swarm · Task assignment

1 Introduction
Particle swarm optimization treats massless particles as birds in flight with two properties: velocity, which describes how fast they move, and position, which describes where they are. Each particle searches for the optimal solution independently within a certain region of the space, records the best solution it has found as its individual extremum, and shares it with the other particles. The best individual extremum found so far is taken as the current global optimal solution, and the other particles adjust their current states according to this value. Advantages such as strong optimization ability and easy implementation have led PSO to be used to solve many practical problems [1]. It is a population-based search algorithm that uses the individuals in a population to search a region of the solution space. In this setting, the population is called a swarm and the individuals are regarded as particles. Each particle adjusts its current state based on the shared best value and its own velocity, and records the best position found during the adjustment.
With the continuous progress of contemporary society, more and more people buy
the goods they need through online shopping, which means that the number of orders is


increasing, so how to carry out distribution has become a problem that needs to be solved. For now, the vast majority of logistics in our country relies on road transportation, with rail and air transport used only in a few special cases. However, in this process, the number of trucks that can be put into use is limited, so how to use the limited resources to achieve the lowest cost and the fastest delivery is one of the goals all logistics companies need to achieve. Therefore, this article takes an in-depth look at cloud computing task scheduling. The main contents of this article are as follows:

(1) Firstly, it summarizes many algorithms of the swarm intelligence algorithm and
related research on cloud task scheduling and finds that task assignment is
inseparable from the swarm intelligence algorithm.
(2) The task scheduling problem of cloud computing was studied. Under time con-
straints, road gap, road accessibility, and driving cost, the task assignment model
was established with the optimal path as the optimization goal.
(3) Integrating the relevant PSO variants in swarm intelligence, an HPSO task scheduling
    method is proposed to find the most appropriate path under the relevant constraints.
(4) Through simulation experiments that verify the feasibility of the task scheduling
    method, the superiority of HPSO is demonstrated in terms of algorithm fitness and
    the shortest path obtained.

2 Related Work
Cloud task scheduling is mainly divided into single cloud task scheduling and cross-cloud
task scheduling, optimized through different methods.
In terms of task allocation, the research results of relevant scholars are as follows:
Entisar S et al. proposed a new resource allocation model for task scheduling based on
multi-objective optimization (MOO) and PSO algorithm because of the complexity of
task scheduling and the complexity of clients’ demands on execution time and through-
put. On this basis, a new multi-objective PSO is proposed. Execution time and wait time
are shortened, and throughput is improved [2]. Awad et al. proposed the LMBPSO task model under reliability and time constraints, saving cost and obtaining a feasible plan [3]. Ramezani et al. established a comprehensive multi-objective task optimization model with minimum time as the ultimate goal [4]. Masdari
et al. analyzed the cloud environment task and workflow scheduling scheme. They pro-
posed the particle swarm optimization model, which improved the efficiency and solved
the appropriate task and workflow scheduling [5]. For the most difficult job scheduling problems in parallel and distributed computing environments such as clusters, grids, and clouds, Kaur et al. used meta-heuristic algorithms such as the genetic algorithm and the ant colony algorithm to obtain approximately optimal solutions and effectively utilize computing resources [6].
As for the swarm intelligence algorithm, related scholars’ research results are as
follows: Yousif et al. came up with a new method based on the beetle antenna search (BSA) algorithm, in which swarm intelligence algorithms such as the bat algorithm solve an optimization module of variables with different relationships in different ranges [7]. Xu et al. explored the effects
of ribosome-targeted antibiotics on the metabolism of Pseudomonas aeruginosa and proposed the first genome-scale modeling method for this problem [8]. Mashwani
et al. proposed a hybrid swarm intelligence (HSI) algorithm that uses the bat algorithm (BA) and particle swarm optimization (PSO) to perform its search process, dealing with the recently designed benchmark functions of the special session [9]. Mortazavi et al. proposed a new optimization algorithm, the interactive search algorithm, which is strongly competitive with other well-established meta-heuristic algorithms [10]. Mathew et al. compared the efficiency of the ant colony algorithm and the particle swarm optimization algorithm under different path choices and addressed the shortest-path problem for both; in order to achieve more efficient path planning, a new hybrid particle swarm optimization algorithm
is proposed [11].
As for cloud task scheduling, the research results of relevant scholars are as follows:
Moon et al. proposed a slave-ants-based cloud task scheduling algorithm built on ant colony optimization, which addresses the global optimization problem by preventing slave ants, which are clustered by leading ants, from following long paths caused by pheromone error [12]. Based on an improved ant colony optimization algorithm for cloud control, Reddy et al. mainly aim to shorten job completion time to the greatest extent, realize the multi-objective task scheduling (MOTS) process, and improve the performance of the task scheduler by reducing the maximum completion time and heterogeneity [13]. Geng et al. developed a multi-objective cloud model with four objectives: minimizing time, minimizing cost, maximizing resource utilization, and load balancing; multi-objective optimization is carried out, and a hybrid-angle-based algorithm is proposed to solve the model [14].
Therefore, this article applies a swarm intelligence algorithm to cloud computing task scheduling. Considering the possibly unreasonable allocations that can occur in the process of task assignment, and working under certain time constraints, it takes efficient distribution and distance optimization as the optimization goals, proposes a hybrid particle swarm cloud computing task scheduling method, improves the efficiency of the algorithm, and realizes global optimization.

3 Model Construction

3.1 Description of Cloud Task Assignment Problem

Assume that logistics enterprises use road transportation and that M delivery vehicles deliver N pieces of goods to designated places. Assuming all restrictions are met, all designated locations are visited and the delivery vehicles eventually return to their point of departure. For the multi-task allocation of logistics, the following assumptions are made:

(1) From the departure of the delivery vehicles to the end of the task, the task will not be changed during the process. That is to say, the customer will not cancel
or modify the logistics information.

(2) When carrying out logistics distribution, all delivery vehicles should have no
ongoing or pending distribution tasks.
(3) All trucks will depart from the logistics company and return to the company after
    the tasks are completed.

Under the above assumptions, the logistics company receives logistics tasks and submits them to the cloud platform. The cloud platform classifies the different tasks, groups tasks of the same type, and then distributes them to delivery vehicle drivers. The drivers perform the tasks and pass the logistics information back to the cloud platform. The platform records the number of tasks each driver performs and feeds the information uploaded by the drivers back to the logistics company. See the figure below (Fig. 1):

Fig. 1. Cloud task assignment model

3.2 Establishment of Cloud Computing Task Scheduling Model

The optimization of the current logistics distribution problem mainly proceeds by shortening time and distance. In real life, special cases such as crowded roads, road construction, and road restrictions may appear. By adding four different constraint conditions (time constraints, road clearance constraints, road smoothness and comfort characteristics, and driving cost constraints), the accuracy of route planning in the logistics distribution process is improved and the shortest distribution path is found. The optimization objective function in the task scheduling process is as follows:

D = min( Σ_{a∈Z_M} Σ_{b∈Z_M} X_ab^t d_ab )    (1)

Constraints include:

T ≤ T_c ;  X_ab^t t_ab > t_c ;  X_ab^t zf_ab > zf_c ;  C = k Σ (1 + g_ab) X_ab^t d_ab < f_max    (2)

Equation (3) calculates the total travel time used in the time constraint:

T = Σ_{a∈Z_M} Σ_{b∈Z_M} d_ab / (v_ab × t_ab)    (3)

Among them, if the segment between a and b with d_ab ∈ M lies on the current path, then X_ab^t = 1; otherwise X_ab^t = 0. zf_ab ∈ (0, 1), and the closer zf_ab is to 1, the higher the driving comfort coefficient; zf_c indicates the minimum comfort during driving, and when comfort is lower than this value the delivery vehicle driver refuses to drive. t_ab ∈ (0, 1), and the closer t_ab is to 1, the lower the degree of traffic jam; t_c represents the minimum acceptable congestion threshold, and when this value is 0 the segment is not allowed to be passed (Table 1).

Table 1. Symbols used in the model

Symbol     Explanation
M          The set of paths that a point of departure contains
d_ab       The distance between nodes a and b
X_ab^t     Indicator of whether the segment between a and b lies on the path
Z_M        The set of network nodes of the path
T          Total travel time of all transport vehicles
T_c        Maximum total time allowed for all logistics vehicles
zf_ab      The comfort factor for driving between nodes a and b
t_ab       The degree of road smoothness (traffic congestion factor)
C          Total cost of the logistics distribution process
k          Average fuel cost per kilometer
g_ab       Percentage increase in fuel consumption under different driving conditions
f_max      Maximum allowable cost of vehicles during distribution
v_ab       The speed from a to b
t_c        Minimum acceptable congestion threshold
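To make objective (1) and constraints (2)-(3) concrete, the sketch below checks one candidate route against them. All network data (distances, speeds, congestion, comfort and fuel factors) and threshold values are assumed toy numbers, not values from the paper.

    # Feasibility and objective check of one candidate route for model (1)-(3).
    route = [0, 2, 1, 3, 0]                      # visiting order, back to depot
    d  = {(0, 2): 5.0, (2, 1): 4.0, (1, 3): 6.0, (3, 0): 7.0}   # distances d_ab
    v  = {k: 40.0 for k in d}                    # speeds v_ab
    t  = {k: 0.8 for k in d}                     # congestion factor t_ab in (0,1)
    zf = {k: 0.9 for k in d}                     # comfort factor zf_ab in (0,1)
    g  = {k: 0.1 for k in d}                     # extra fuel percentage g_ab
    k_fuel, t_c, zf_c, T_c, f_max = 0.6, 0.3, 0.5, 2.0, 20.0

    legs = list(zip(route, route[1:]))
    D = sum(d[leg] for leg in legs)                          # objective (1)
    T = sum(d[leg] / (v[leg] * t[leg]) for leg in legs)      # time, Eq. (3)
    C = k_fuel * sum((1 + g[leg]) * d[leg] for leg in legs)  # driving cost
    feasible = (T <= T_c and C < f_max
                and all(t[leg] > t_c and zf[leg] > zf_c for leg in legs))
    print(D, T, C, feasible)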

4 Hybrid Particle Swarm Optimization Task Scheduling Method


4.1 Particle Swarm Optimization Algorithm

The standard PSO algorithm is mainly applied to the optimization of continuous solution
space problems. Its mathematical description is as follows:
It is assumed that the problem is solved in an n-dimensional space and that the particle swarm contains M particles. As the iterations proceed, the position of each particle changes correspondingly. Suppose the position that particle i reaches after t iterations is X_i(t) = {x_i1(t), x_i2(t), ..., x_in(t)}, i = 1, 2, ..., M, and the corresponding velocity of the particle is v_i(t) = {v_i1(t), v_i2(t), ..., v_in(t)}. Then the j-th dimension (j = 1, 2, ..., n) of the velocity is updated by the following formula:

v_ij(t) = w·v_ij(t − 1) + c_1 r_1 (p_ij − x_ij(t − 1)) + c_2 r_2 (g_j − x_ij(t − 1))

v_ij(t) = v_max if v_ij(t) > v_max ;    v_ij(t) = −v_max if v_ij(t) < −v_max

The position update of particle i at time t can be calculated by the following formula:

x_ij(t) = x_ij(t − 1) + v_ij(t)

In the formulas, w is the inertia weight, c_1 and c_2 are the acceleration factors, r_1 and r_2 are random numbers in [0, 1], p_ij is the best position found so far by particle i in the j-th dimension, and g_j is the global optimal value in the j-th dimension.
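The update formulas above translate directly into code. The following sketch performs one iteration for a single particle; the parameter values w = 0.7, c1 = c2 = 1.5 and vmax = 4.0 are common illustrative choices, not values prescribed by this paper.

    import random

    def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, vmax=4.0):
        # One velocity/position update of the standard PSO formulas above.
        # x, v, pbest: lists of length n for one particle; gbest: global best.
        for j in range(len(x)):
            r1, r2 = random.random(), random.random()
            v[j] = (w * v[j] + c1 * r1 * (pbest[j] - x[j])
                    + c2 * r2 * (gbest[j] - x[j]))
            v[j] = max(-vmax, min(vmax, v[j]))   # clamp to [-vmax, vmax]
            x[j] += v[j]
        return x, v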

4.2 Hybrid Particle Swarm Optimization Algorithm

The flow of the hybrid particle swarm algorithm is shown in the flow chart (Fig. 2). First, the particle swarm is initialized and the size of the initial population is set. Then the fitness is calculated to obtain the fitness curve, and the particles are updated according to their fitness values. The individual optimal particles and the group optimal particle are identified; each individual is crossed with its individual optimal particle and with the group optimal particle. Each crossover produces a new particle, and mutating this particle produces another new particle (Fig. 2).

Fig. 2. Flowchart of hybrid particle swarm optimization

4.3 Algorithm Implementation


1. Individual coding
Consider each individual in the logistics process as a particle, and give each particle its own exclusive number. Suppose five cities need to be visited; then [5 1 4 2 3] means the traversal starts from city 5, passes through cities 1, 4, 2 and 3 in order, and finally returns to the starting point, representing the completion of one transport process.
2. Fitness value
The particle fitness value is expressed as the length of the traversal path, and the calculation formula is

fitness(a) = Σ_{a,b=1}^{n} path_{a,b}

where n is the number of cities and path_{a,b} is the distance between cities a and b.
3. Cross operation
Cross the individual extremum and the global extremum with the individual at a certain position to form a new individual. The operation is as follows:

[5 1 4 2 3] cross [5 4 1 3 2] → [5 2 1 2 3]

Because the crossover operation can introduce duplicates, if a repeated element exists in the new individual generated after the crossover operation, the repeated
particle needs to be adjusted.


[5 2 1 2 3] adjust → [5 4 1 2 3]

4. Variation operation

Variation is a self-adjustment within the individual, characterized by variability and randomness; that is, the location of variation is uncertain and can be the starting point, the end point, or an intermediate position. Assuming the mutation positions are selected in the middle of the tour, the starting point is kept unchanged and the selected elements are exchanged (a code sketch of the crossover and mutation operators follows below):

[5 4 1 2 3] variation → [5 4 3 2 1]
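The following is a minimal sketch of the fitness evaluation, the crossover-with-adjustment, and the mutation operators described in steps 2-4, for the permutation encoding of step 1. Where the text leaves details open (which segment is copied, in what order duplicates are repaired), the choices below are assumptions of this sketch.

    import random

    def fitness(ind, dist):
        # Tour length of step 2: sum of consecutive leg distances, including
        # the return from the last city to the starting city.
        tour = ind + [ind[0]]
        return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

    def crossover_with_repair(ind, best):
        # Copy a random segment of `best` into a copy of `ind`, then repair
        # duplicates so the result is again a valid tour (the adjust step).
        child = ind[:]
        i, j = sorted(random.sample(range(1, len(ind)), 2))  # start city fixed
        child[i:j] = best[i:j]
        missing = [c for c in ind if c not in child]         # cities lost
        seen = set()
        for k, c in enumerate(child):
            if c in seen:
                child[k] = missing.pop()                     # replace duplicate
            else:
                seen.add(c)
        return child

    def mutate(ind):
        # Exchange two positions after the fixed starting city.
        i, j = random.sample(range(1, len(ind)), 2)
        child = ind[:]
        child[i], child[j] = child[j], child[i]
        return child

For example, crossover_with_repair([5, 1, 4, 2, 3], [5, 4, 1, 3, 2]) always returns a valid permutation of the five cities, mirroring the adjustment of [5 2 1 2 3] to [5 4 1 2 3] shown above.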

5 Simulation Experiment
5.1 Experimental Environment and Parameter Setting

The hardware environment of the experiment is an Intel(R) Core(TM) i5-6200U CPU @ 2.30 GHz with 12.00 GB of memory, running Windows 10, with Matlab R2016b as the simulation platform. The simulation uses C = 31 tasks, a maximum of 500 iterations, Ca = 200 and p = 1; the parameters are set as follows (Table 2).

Table 2. Parameter setting

Parameter                      Value
p                              1
Ca                             200
C                              31
Maximum number of iterations   500
Population size                100

5.2 Analysis of Experimental Results

(1) Fitness changes of the particle swarm optimization algorithm (PSO) and the hybrid PSO.
The algorithm is run for a maximum of 500 iterations. The fitness change curves of the PSO and hybrid PSO algorithms are shown in Fig. 3.

(2) The shortest distance generation process



Fig. 3. Algorithm fitness curve

The position of each city is plotted according to its coordinates, as shown in Fig. 4. The shortest distance obtained by the hybrid particle swarm optimization algorithm is 15601.9195. There may be many shortest paths, but the shortest distance is unique. Figure 6 shows how the distance changes: both the shortest distance and the average distance decrease over a certain interval and then plateau in the later period (Fig. 5).

Fig. 4. Schematic diagram of city location Fig. 5. Optimize the shortest path diagram

Fig. 6. Comparison of the shortest distance and average distance in each generation

6 Conclusion
Taking the shortest path as the optimization goal, this paper establishes a cloud computing task scheduling model with four constraint conditions: time constraints, road clearance constraints, road smoothness characteristics, and driving cost constraints. Based on this model and drawing on the relevant features of swarm intelligence algorithms, the hybrid particle swarm algorithm is applied as the cloud computing task scheduling method: each city is treated as a point on the coordinate axes and the optimal path is selected. Through the simulation experiment, the approximate distribution map of the cities is drawn and the optimal path is obtained by the hybrid particle swarm optimization algorithm, which greatly reduces the transportation distance and cost and improves efficiency.

Acknowledgment. This work is partly supported by the project supported by the National Social
Science Foundation (16BJY125), Heilongjiang philosophy and social sciences research planning
project (19JYB026), Key topics in 2020 of the 13th five year plan of Educational Science in Hei-
longjiang Province (GJB1320276), Project supported by under-graduate teaching leading talent
training program of Harbin University of Commerce (201907), Key project of teaching reform and
teaching research of Harbin University of Commerce in 2020 (HSDJY202005(Z)), Innovation and
entrepreneurship project for college students of Harbin University of Commerce (202010240059),
School level scientific research project of Heilongjiang Oriental University (HDFKY200202), Key
entrusted projects of higher education teaching reform in 2020 (SJGZ20200138).

References
1. Wang, R.: A novel hybrid particle swarm optimization using adaptive strategy. Inf. Sci. 579, 231–250 (2021)
2. Alkayal, E., Jennings, R., Abulkhair, F.: Efficient task scheduling multi-objective particle swarm optimization in cloud computing. In: Local Computer Networks Workshops. IEEE (2017)
680 J. Li et al.

3. Awad, I., El-Hefnawy, A., Abdel, K.M.: Enhanced particle swarm optimization for task
scheduling in cloud computing environments[J]. Procedia Comput Sci 65(1), 920–929 (2015)
4. Ramezani F, Jie L, Hussain F (2013) Task scheduling optimization in cloud comput-
ing applying multi-objective particle swarm optimization[C]. International Conference on
Service-Oriented Computing. Springer, Berlin, Heidelberg
5. Masdari, M., Salehi, F., Jalali, M.: A survey of PSO-based scheduling algorithms in cloud
computing[J]. J Netw Syst Manag 25(1), 122–158 (2017)
6. Kaur, N.: Comparative analysis of job scheduling algorithms in parallel and distributed
computing environments[J]. Int J Adv Comput Res 8(3), 948–956 (2019)
7. Yousif, S., Saka, M.P.: Enhanced beetle antenna search: a swarm intelligence algorithm. Asian
J Civ Eng 22(6), 1185–1219 (2021). https://doi.org/10.1007/s42107-021-00374-z
8. Xu, Z., Ribaudo, N., Li, X.: A genome-scale modeling approach to investigate the antibiotics-
triggered perturbation in the metabolism of pseudomonas aeruginosa[J]. IEEE Life Sci Lett
99, 1 (2017)
9. Mashwani, K., Hamdi, A., Jan, M.: Large-scale global optimization based on hybrid swarm
intelligence algorithm[J]. J Intell Fuzzy Syst 39(1), 1257–1275 (2020)
10. Mortazavi A, Togan V, Moloodpoor M (2019) Solution of structural and mathematical opti-
mization problems using a new hybrid swarm intelligence optimization algorithm[J]. Adv
Eng Softw 127(Jan):106–123
11. Mathew, T., Paul, A., Rojan, A.: Implementation of swarm intelligence algorithms for path
planning[J]. J Phys Conf Ser 1831(1), 012008 (2021)
12. Moon J, Yu C, Gil M (2017) A slave ants based ant colony optimization algorithm for task
scheduling in cloud computing environments[J]. Hum -centric Comput Inf Sci 7(1):28
13. Reddy, G., Phanikumar, S.: Multi objective task scheduling using modified ant colony
optimization in cloud computing[J]. Int J Intell Eng Syst 11(3), 242–250 (2018)
14. Geng, S., Wu, D., Wang, P.: Many-objective cloud task scheduling[J]. IEEE Access 99, 1
(2020)
Research on the Evaluation of Forestry Industry
Technological Innovation Ability Based
on Entropy TOPSIS Method

Shangkun Lu(B)

School of Economics, Harbin University of Commerce, Harbin 150028, China

Abstract. Forestry is an important part of China's national economy. In order to improve the technological innovation capability of the forestry industry, a four-dimensional index system of forestry industry technological innovation capability has been established on the basis of relevant research, covering the resource utilization capability, market innovation capability, technological innovation support capability, and management innovation capability of the forestry industry. Through the entropy method and the TOPSIS method, the forestry industry of 29 provinces in China was evaluated statically and dynamically. The results show that Guangdong, Sichuan, and Zhejiang have relatively high static capacity for forestry industry technological innovation, while the contribution of the growth rate of total forestry output value, timber sales in state-owned forest areas, the number of college graduates, and the number of school faculty and staff to the forestry industry's technological innovation capacity is relatively weak and cannot be a major factor in increasing the overall growth rate of the forestry industry.

Keywords: Forestry industry · Technological innovation ability · Entropy weight · TOPSIS

1 Introduction
The forestry industry is a basic industry for developing the national economy and a key
industry that coordinates ecological, economic, and social benefits. The “five-in-one”
modernization construction pattern clarifies the dominant position of ecological civi-
lization construction and highlights the fundamental support of forestry to social and
economic development. Technological innovation in the forestry industry is an important
guarantee for optimizing the structure of the forestry industry, adjusting the layout of the
forestry industry, improving the efficiency of forest resource utilization, and improving
the quality of forestry industry development. It is also a key factor in realizing the inten-
sive, large-scale, orderly, and sustainable development of forestry. Based on the research results of forestry industry technological innovation at home and abroad, this study conducts an empirical analysis of the forestry industry technological innovation capabilities of 29 provinces in China through literature statistics, objectively evaluates the true level of China's forestry industry technological innovation capability, and provides pragmatic support for the sustainable development of China's forestry industry.
Changbing Chen proposed that, to enhance China's industrial technological innovation capability, investment in technological innovation should be increased and the efficiency of technological innovation improved [1]. Baoming Chen proposed building an evaluation index system of innovation capability based on influencing factors [2]. Yong Fu explored the connotation, material basis, approach model, basic management form, and industrial attributes of sustainable forestry development and the importance of technological innovation in it, thereby revealing in depth that technological innovation is the core driving force of sustainable forestry development [3]. Shaopeng Zhang and Hongge Zhu used the spatial Durbin model to prove that the development of the forestry industry has significant spatial spillover effects [4]. Zhang Zeng and Wen Fan defined the concept of high-tech industry and constructed an evaluation index system for industrial technological innovation capability [5]. Yun Chen and Chunfang Tan constructed a theoretical system of evaluation indicators from the two aspects of enterprises' technological innovation capability and innovation efficiency [6]. Yuwei Du and Zhifang Wan used the GEMS model and an index system to measure the technological innovation capability of the forestry industry [7]. Zhifang Wan and Xiaolin Ma used the entropy method to measure the forestry industry's technological innovation capability index and proposed that technological innovation is the key to transformation [8]. Baiqing Ye believes that internal R&D expenditure, new product development expenditure, the number of scientific research personnel, the full-time equivalent of R&D personnel, and expenditure on digestion and absorption are important factors affecting technological innovation capability [9]. Gang Wang and Wei Chen used the entropy-TOPSIS model to empirically measure the competitiveness of the national forestry industry based on inter-provincial cross-sectional data [10].
To sum up, the development of the forestry industry is closely related to China's ecological environment construction, social and economic development, and the industry's social responsibility for the benign recycling of resources. It is also a necessary prerequisite for the protection and sustainable utilization of forest resources. The evaluation of the technological innovation ability of the forestry industry is a comprehensive process involving multiple objectives and multiple indicators, but at present the index weights are mostly determined by subjective assignment methods, which undermines the objectivity of the evaluation. In this paper, the technological innovation ability of the forestry industry is taken as the research point. The entropy weight TOPSIS method is used to empirically measure the innovation data of the forestry industry in 29 provinces of China, objectively analyze the main influencing factors of the technological innovation ability of China's forestry industry, and put forward countermeasures and suggestions conducive to the sustainable development of the forestry industry.

2 Data Sources and Research Methods


2.1 Data Sources
The data used in this paper are all from China Statistical Yearbook, China Industrial Eco-
nomic Statistical Yearbook, China Forestry Statistical Yearbook, China Labor Statistical
Yearbook, China Financial Yearbook, and China Forestry Development Report issued
by the State Forestry Administration. In addition, labor productivity, market share, and other related indicators are derived from the national statistical yearbooks of China, while the number of graduates of regular and junior colleges, local financial revenue, and actual foreign investment in the forestry industry in each region are derived from the statistical yearbooks of each province.

2.2 Evaluation Index System


Industrial technological innovation is a system engineering effort that mainly includes management innovation, technological innovation, and market innovation. Technological innovation activities of the forestry industry have certain particularities and also include innovation in forestry resource utilization. Each factor supports, coordinates, and promotes the others to sustain the whole innovation system. This study fully considers the characteristics of the forestry industry, combines the research results on the technological innovation ability of the forestry industry at home and abroad, and constructs the evaluation index system of the technological innovation ability of the forestry industry in China based on the principles of being scientific, operational, targeted, and applicable.

2.3 Evaluation Methods


The TOPSIS method is a multi-objective decision-making method. Its principle is to identify the optimal and worst schemes among the candidate schemes, calculate the distance between each evaluation object and these two schemes, and obtain each evaluation object's relative closeness to the optimal scheme, on which the evaluation objects are ranked. The method has the advantages of simple calculation, small sample size requirements, and reasonable results.

(1) Dimensionless and co-trend processing of the data


Positive indicators:

$$y_{ij} = \frac{x_{ij} - \min(x_{1j}, x_{2j}, \ldots, x_{nj})}{\max(x_{1j}, x_{2j}, \ldots, x_{nj}) - \min(x_{1j}, x_{2j}, \ldots, x_{nj})} \times 100 \quad (1)$$

Negative indicators:

$$y_{ij} = \frac{\max(x_{1j}, x_{2j}, \ldots, x_{nj}) - x_{ij}}{\max(x_{1j}, x_{2j}, \ldots, x_{nj}) - \min(x_{1j}, x_{2j}, \ldots, x_{nj})} \times 100 \quad (2)$$

Further, the canonical matrix is obtained:

$$Z = (z_{ij})_{m \times n}, \quad z_{ij} = \frac{y_{ij}}{\sqrt{\sum_{i=1}^{m} y_{ij}^2}} \quad (1 \le i \le m,\ 1 \le j \le n) \quad (3)$$

(2) Determine the weighted normalized matrix:

$$X = (x_{ij})_{m \times n} = (w_j \cdot z_{ij})_{m \times n}$$

(3) Calculate the ideal solution $A^*$ and the negative ideal solution $A^-$:

$$A^* = (x_1^*, x_2^*, \ldots, x_n^*), \quad A^- = (x_1^-, x_2^-, \ldots, x_n^-)$$

For efficiency criteria, that is, benefit-type indicators $C_j$:

$$x_j^* = \max_{1 \le i \le m} \{ w_j \cdot z_{ij} \}, \quad x_j^- = \min_{1 \le i \le m} \{ w_j \cdot z_{ij} \}$$

For cost-type criteria, that is, inefficiency indicators $C_j$:

$$x_j^* = \min_{1 \le i \le m} \{ w_j \cdot z_{ij} \}, \quad x_j^- = \max_{1 \le i \le m} \{ w_j \cdot z_{ij} \}$$

(4) Calculate the Euclidean distances to the ideal solution $A^*$ and the negative ideal solution $A^-$:

$$d_i^* = \left[ \sum_{j=1}^{n} (x_j^* - w_j \cdot z_{ij})^2 \right]^{1/2}, \quad d_i^- = \left[ \sum_{j=1}^{n} (x_j^- - w_j \cdot z_{ij})^2 \right]^{1/2} \quad (i = 1, 2, \ldots, m)$$

The smaller $d_i^*$ and the larger $d_i^-$, the better the scheme.

(5) Calculate the relative closeness of each scheme to the optimal solution:

$$c_i^* = \frac{d_i^-}{d_i^- + d_i^*} \quad (1 \le i \le m)$$

The greater $c_i^*$, the closer scheme $A_i$ is to the ideal solution.

(6) Rank the schemes according to the size of $c_i^*$.
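The six steps above can be combined with entropy weighting in a short script. The following Python sketch is an illustration under assumptions (toy data, one common variant of the entropy-weight formula, and indicators that vary across schemes); it is not the paper's actual computation:

import numpy as np

def entropy_topsis(X, negative_cols=()):
    """Entropy-weight TOPSIS for a decision matrix X (m schemes x n indicators);
    returns the relative closeness c_i of each scheme to the ideal solution."""
    m, n = X.shape
    # (1) dimensionless, co-trend processing (negative indicators reversed)
    Y = np.empty_like(X, dtype=float)
    for j in range(n):
        lo, hi = X[:, j].min(), X[:, j].max()      # assumes hi > lo for every column
        if j in negative_cols:
            Y[:, j] = (hi - X[:, j]) / (hi - lo) * 100
        else:
            Y[:, j] = (X[:, j] - lo) / (hi - lo) * 100
    Z = Y / np.sqrt((Y ** 2).sum(axis=0))          # canonical matrix z_ij
    # entropy weights: e_j = -(1/ln m) * sum_i p_ij ln p_ij, w_j ~ 1 - e_j
    P = np.where(Y > 0, Y, 1e-12) / Y.sum(axis=0)
    e = -(P * np.log(P)).sum(axis=0) / np.log(m)
    w = (1 - e) / (1 - e).sum()
    V = w * Z                                      # (2) weighted normalized matrix
    v_best, v_worst = V.max(axis=0), V.min(axis=0) # (3) all benefit-type after co-trending
    d_best = np.sqrt(((V - v_best) ** 2).sum(axis=1))   # (4) Euclidean distances
    d_worst = np.sqrt(((V - v_worst) ** 2).sum(axis=1))
    return d_worst / (d_best + d_worst)            # (5) relative closeness

# toy usage: 4 schemes, 3 indicators, the last one cost-type
scores = entropy_topsis(np.array([[8., 5., 2.], [6., 7., 4.],
                                  [9., 4., 3.], [7., 6., 1.]]), negative_cols={2})
print(np.argsort(-scores))                         # (6) rank schemes by closeness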

3 Empirical Research

In order to objectively and accurately evaluate the technological innovation ability of the forestry industry, this study uses static evaluation to clarify the real level of technological innovation ability of the forestry industry in 29 provinces and regions of China and the regional differences among provinces. Through dynamic evaluation, the time-series characteristics of the technological innovation ability of China's forestry industry are recognized, and the development trend of the innovation ability of China's forestry industry is accurately grasped.

3.1 Static Evaluation

The entropy method is used because its objectivity is conducive to calculation and evaluation. This article therefore uses cross-sectional data and calculates the arithmetic average of each indicator over the four years from 2015 to 2018 for the 29 provinces, to eliminate the accidental factors of single-year data. The normalized results of the original data are calculated, and the entropy matrix and weight matrix of each index are obtained, as shown in Table 1 and Table 2.
From this, the average weight of each first-level index of the provinces' forestry industry innovation ability from 2015 to 2018 can be obtained. Among the first-level indices, the management innovation ability of the forestry industry has the largest weight and the smallest overall entropy, and within it the actual utilization of foreign capital in forestry has the largest weight. In the market innovation ability of the forestry industry, timber sales volume and forest product sales revenue in state-owned forest areas have a greater impact; in the utilization capacity of forestry scientific and technological innovation resources, the total output value of forest products has the largest impact.

Table 1. Index entropy.

Index D1 D2 D3 D4 D5 D6 D7 D8 D9 D10
Entropy 0.94 0.86 0.90 0.93 0.87 0.91 0.89 0.76 0.93 0.91
Index D11 D12 D13 D14 D15 D16 D17 D18 D19
Entropy 0.91 0.88 0.73 0.86 0.72 0.83 0.43 0.93 0.94

Table 2. Index weight.

Index D1 D2 D3 D4 D5 D6 D7 D8 D9 D10
Weight 0.02 0.06 0.04 0.03 0.05 0.04 0.05 0.07 0.02 0.02
Index D11 D12 D13 D14 D15 D16 D17 D18 D19
Weight 0.03 0.04 0.08 0.05 0.10 0.06 0.19 0.02 0.02

For the above four dimensions of forestry industry innovation capability, the distances to the positive and negative ideal solutions, the closeness to the optimal solution, and the distribution ratio for decision-making are calculated, and the provinces are ranked according to closeness. The final calculation result is shown in Fig. 1.
Fig. 1. Forestry industry innovation: results of the static evaluation

It can be seen from Fig. 1 that the static evaluation result of the forestry industry's innovation ability in Guangdong Province is the highest. Its resource utilization ability, market innovation ability, technological innovation support ability, and management innovation ability all rank first, indicating that the static state of the forestry industry's innovation ability in Guangdong Province is very good.

industry’s innovation ability in Guangdong Province is very good. Sichuan Province


ranked second, with an increase of 9.96% over the previous year, and the growth rate
was 1.96 percentage points higher than the provincial average. Zhejiang Province ranked
third in the static evaluation results of forestry industry innovation capabilities. The
total output value of the province’s forestry industry has reached It has grown from
452.3 billion yuan in 2015 to 664.6 billion yuan in 2019. Its development momentum is
relatively large, and its technological innovation capability is strong (Fig. 2).

Fig. 2. Forestry industry comprehensive innovation degree change

Results: First, for the forestry industry's technological innovation resource utilization capacity index, the optimal solution closeness of most provinces is below 0.3, accounting for 70% of all samples; the closeness of Guangdong, Sichuan, Zhejiang, and Hunan is above 0.5. It can be found that the output value of the tertiary industry has a high impact on the resource utilization capacity of the forestry industry. Second, for the forestry industry market innovation capability index, most provinces have an optimal solution closeness below 0.3, and only Guangxi has a closeness above 0.5; the most heavily weighted indicators of market innovation capability are the wood sales volume of state-owned forest areas and the total amount of major economic forest products. Third, the indicators found to affect the forestry industry's technological innovation support ability are the sales revenue of forestry products and the amount of forestry investment. Fourth, among the indicators affecting the management innovation ability of the forestry industry, the actual use of foreign investment in the forestry industry has a relatively important impact.

3.2 Dynamic Evaluation

Panel data from 2006 to 2018 were used for the dynamic evaluation. The standardized results of the original data are calculated to obtain the entropy matrix and weight matrix of each index and the average weights of the forestry industry innovation capability indices of each province.
Among the secondary indicators of forestry industry innovation capability, the market innovation capability of the forestry industry has the largest weight and the smallest overall entropy. Within the market innovation capability, the contribution rate of the forestry industry and government financial investment are relatively large influencing factors; among the indicators of management innovation capability, the number of state-owned units in the forestry system is an obvious factor; for the technological innovation support ability, the most influential factor is the amount of forestry investment; for the utilization capacity of technological innovation resources, the most influential factor is the gross forestry output value of the tertiary industry.
By calculating, for the four dimensions of forestry industry innovation ability, the distances to the positive and negative ideal solutions, the closeness to the optimal solution, and the distribution ratio for decision-making, and ranking according to closeness, the final calculation results are obtained. First, from 2006 to 2018, the forestry industry's innovative resource utilization capacity and its closeness to the optimal solution increased; the closeness declined in 2012 and 2014 and rose to more than 0.5 in 2010. Among the indicators of innovative resource utilization capacity, the main factors leading to the decline are the growth rate of the total output value of the forestry industry and the total forestry value of the tertiary industry. Second, in 2013, apart from timber sales in state-owned forest areas, the number of forestry enterprises, and educational funding, the proportion of forestry industry structure, government financial investment in the forestry industry, the total amount of major economic forest products, and the contribution rate of the forestry industry were all significantly higher than in previous years. Third, the closeness to the optimal solution fluctuates greatly for the forestry industry's technological innovation support ability; there were declines in 2012 and 2014, and a greater improvement was achieved in 2013. The original data show that the technological innovation support capacity of forestry products is closely related to product sales revenue and comes from the market. Fourth, in terms of forestry industry management innovation ability, the closeness to the optimal solution fluctuates greatly, with a substantial increase in 2013 and a slight decline in 2012.

4 Conclusion
Through the entropy method and the TOPSIS method, the forestry industry of 29 provinces in China is statically and dynamically evaluated, and the advantages and problems of forestry technology innovation in each province are identified. It is necessary to improve the forestry industry technology innovation evaluation system and assessment indicators; build a forestry industry technology innovation financing platform and an industry-university-research innovation system platform; optimize the resource allocation of the forestry industry; activate forestry industry innovation activities; and promote the organic integration of the forestry industry and the economic system. Investment in the technological innovation resources of the forestry industry should be strengthened, leading science and technology introduced and absorbed, and the source power of innovation stimulated, providing a strong guarantee for technological and continuous innovation in the forestry industry.

References
1. Chen, C.B.: Measurement and analysis of the degree of industrial technology innovation in China. Economic Aspects 6, 76–87 (2019)
2. Chen, B.M.: Research on China's industrial technology innovation capability evaluation index system. Sci. Technol. Ind. 6(11), 22–25 (2006)
3. Fu, Y.: Technological innovation is the core power of sustainable development of forestry economy. Dev. Res. 143(S1), 133–135 (2009)
4. Zhang, S.P., Zhu, H.G.: The spatial impact mechanism of China's forestry industry development: a spatial econometric analysis based on panel data of 31 provinces. World For. Res. 32(2), 61–66 (2019)
5. Zeng, Z.N.: Research on the evaluation index system of technological innovation capability of high-tech industry. Journal of Xi'an Shiyou University (Social Science Edition) 23(1), 6–10 (2014)
6. Chen, Y., Tan, C.F.: Research on the evaluation index system of technological innovation capability of small and medium-sized technological SMEs. Science and Technology Progress and Policy 29(2), 110–112 (2012)
7. Du, Y.W., Wan, Z.F.: Research on the path choice of forestry industry transformation in state-owned forest regions of Heilongjiang Province. Issues of Forestry Economics 39(03), 25–33 (2019)
8. Wan, Z.F., Ma, X.L.: Dynamic evaluation of technical innovation ability of wood industry based on entropy method. Stat. Decis. 541(01), 74–78 (2020)
9. Ye, B.Q., Wei, W.: Research on the influencing factors of technological innovation capability of high-tech industries in the three northeast provinces based on PLS. Journal of Harbin University of Commerce (Social Science Edition) 3, 122–128 (2015)
10. Wang, G., Chen, W.: Measurement of forestry industry competitiveness based on entropy-TOPSIS. Stat. Decis. 534(18), 57–60 (2019)
Research on the Impact of Temporary Workers’
Psychological Contract Fulfillment on Task
Performance in the Sharing Economy

Genlin Zhang1(B) , Linlin Tian1 , and Jie Xie2


1 Xi’an University of Science and Technology, Xi’an 710000, China
2 Xi’an University of Architecture and Technology, Xi’an 710000, China

Abstract. In the sharing economy, temporary workers have a sense of isolation, which leads to weak cognition of the organization and unpredictable job performance. Based on social exchange theory, taking temporary workers
in the sharing economy as the research object, this paper examines the influ-
ence of psychological contract fulfillment perceived by employees on task per-
formance. It explores the mediating effect of organizational identification and the
moderating effect of length of service. The structural equation model is used to
analyze 572 shared temporary workers' questionnaires. The results indicate that employees' perceived psychological contract fulfillment positively influences task
performance, and transactional psychological contract fulfillment directly influ-
ences task performance. Organizational identification plays a mediating part in
the path of psychological contract fulfillment’s influence on task performance,
and relational contract fulfillment has a mediating effect on task performance in
the mediating mechanism. The fulfillment of transactional psychological contracts
perceived by employees with shorter service has a greater influence on organiza-
tional identity. The fulfillment of relational psychological contracts perceived by
employees with longer service greatly influences task performance.

Keywords: Sharing economy · Psychological contract fulfillment · Organizational identification · Task performance

1 Introduction
The emergence of the sharing economy enables companies to hire labor from online
platforms. According to the “Annual Report on China’s Sharing Economy Development
(2021)” [1], sharing economy platform enterprises have about 6.23 million staff, and
platform employees have become a huge new labor group. In the practice of sharing
employment management, there is no substantial labor relationship between employ-
ees and the company. The flexibility of working time and space subverts the traditional work model under the employment relationship. It is difficult to directly manage and
motivate employees, and they have low loyalty and enthusiasm, leading to inefficiency
and absenteeism. Currently, it is extremely important to form an implicit and unwritten


psychological contract to regulate the relationship between the two parties. The psycho-
logical contract refers to the belief that staff members hold about the exchange agreement
between themselves and the organization. It is a theoretical framework for studying
employee attitudes and behaviors [2]. The psychological contract fulfillment will pro-
mote the employee’s trust in the organization and then affect the staff attitudes and acts,
including organizational identification, satisfaction, and task performance [3]. In social
exchange theory, employees show certain actions according to certain benefit exchange
principles. When staff feel that the psychological contract is fulfilled well, they tend to deliver a higher level of performance in return [4]. In the sharing economy, temporary
workers face unstable employment conditions. Encouraging staff to work and improve
their sense of psychological contract fulfillment is the core of stabilizing employment
relationships and improving performance. It is also one of the most urgent challenges
faced by Chinese companies.
Organizational identity refers to the fact that individuals regard themselves as part of
the organization, which depends on the significance of their relationship to the organiza-
tion [5]. When the psychological contract is destroyed, it will adversely affect the staff’s
sense of belonging [6]. The psychological contract fulfillment affects the organizational
identity and then the service-oriented organizational citizenship behavior [7]. Temporary
workers’ needs for affiliation and social connections positively influence organizational
recognition and commitment, which in turn affect their task performance. With the exten-
sion of working years, staffs have different expectations of the organization. Previous
research has discovered that the relationship between psychological contract fulfillment
and performance changes with increasing working years [8]. In other words, work-
ing years regulate the relationship between psychological contract fulfillment and work
results.
The existing research on shared employment mainly focuses on the characteristics of
the new human resource management model [9]. Few scholars discuss the driving mech-
anism of temporary labor performance from the viewpoint of employee psychological
contracts. Employees on the platform are independent of the organization, and their work
style is different from that of the traditional employees, assumed by the organizational
behavior theory. For this reason, this article studies the relationship between the
performance of the psychological contract and the task performance of shared tempo-
rary workers from the angle of social exchange theory, divides the psychological contract
into two dimensions, and introduces the mediation variables of organizational identity
and the adjustment of working years. A logical model is established, constructing a
more effective theoretical framework for future research on psychological contract and
task performance and helping corporate managers in the sharing economy make better
strategies conducive to improving employees’ performance. In the face of the new labor
market conditions, this study aims to provide the government with suggestions on how
to guarantee sufficient talent and promote the sustainable development of enterprises.

2 Theoretical Background and Hypotheses


In the context of the sharing economy, this section theoretically explores the influence of temporary workers' sense of psychological contract fulfillment on organizational identity and task performance, divides temporary workers into different working-year groups, analyzes the moderating effect of working years, and formulates hypotheses concerning the above variables.

2.1 Psychological Contract Fulfillment and Task Performance


The psychological contract concerns the terms of the exchange agreement between staff and organizations [10]. Different from the legal contract of service, the psychological contract is subjective. It is implicit and depends on the employee's understanding of the
ical contract fulfillment means the level to which an organization performs obligations
according to employees’ perception [12]. Rousseau (1995) proposes that the psycholog-
ical contract contains two dimensions, namely transactional and relational [13]. Trans-
actional psychological contract places emphasis on the exchange of remuneration and
benefits within a limited time, including providing more competitive and fairer salary
and positions, and the degree of correlation between salary and performance; relational
psychological contract mainly provides personal support and respect to employees, such
as training support and career development opportunities [14].
When employees think that the organization provides more than what is promised,
such as a substantial salary increase, improved welfare, and more career development
opportunities, they will strengthen social exchange relations by promoting contribution
to the organization, which is reflected in the increase of personal performance [15].
Performance can measure the degree of achievement of personal and organizational
goals. Some scholars define task performance as contributing to the organization or
maintaining the core of the organization’s technology through direct production activities
such as providing materials and services within the provisions of the work specification
[16]. As an important part of overall job performance, task performance reflects the
employees’ tasks they need to complete [17].
There is a significant positive correlation between psychological contract fulfillment and staff in-role performance [18]. In a specific situation, employees will be willing to improve their work behavior when they realize that the employer has fulfilled its obligations regarding wages, working hours, the working environment, etc. [19]. Both transactional and relational psychological contracts positively influence employees' task performance through the adjustment of performance pay [20]. Therefore, this study believes that if shared employees' psychological contracts are satisfied, the staff's devotion to the organization will increase.
H1a: The fulfillment of transactional psychological contracts perceived by temporary
workers in the sharing economy positively correlates with task performance.
H1b: The fulfillment of relational psychological contracts perceived by temporary
workers in the sharing economy positively correlates with task performance.

2.2 Psychological Contract Fulfillment and Organizational Identification


From a cognitive perspective, Ashforth and Mael (1989) believe that organizational
identity reflects the consistency of employee and organizational cognition and is a sense
of belonging and dependence on the company [5]. The perception of organizational
identity refers to employee’s self-perception, which stems from the self-construction of

employee identity. The perception of the individual and the organization is synchronized
and consistent. When employees perceive their existence in the enterprise, they will have
a perception of group affiliation to the organization.
Psychological contract fulfillment is a vital driving force for staff organizational
identification [21]. Employees and the organization rely on emotional commitment and mutual respect to maintain the relationship. The more support and opportunities the
organization provides, the stronger the employee’s sense of belonging is. In transactional
contracts, the most important thing for shared employees is the short-term economic
return to satisfy their purpose of joining the organization and improving their sense of
organizational identity [22]. Therefore, this research believes that the higher the level of psychological contract fulfillment temporary workers perceive in the sharing economy, the greater their sense of identity with the organization will be.
H2a: The fulfillment of transactional psychological contracts perceived by temporary
workers in the sharing economy has a significant positive correlation with organizational
identity.
H2b: The fulfillment of the relational psychological contract perceived by temporary
workers in the sharing economy has a significant positive correlation with organizational
identity.

2.3 The Mediating Role of Organizational Identity

The identification of staffs to the organization depends on their perception that they
are legal members of the organization. A high-level psychological contract can narrow
the distance between the two parties and make the cooperation between the two parties
closer. While participating in organizational affairs, employees will continue to exer-
cise and rebuild their values. With the gradual unification of the values of both sides,
the individual’s sense of identity to the organization will gradually deepen [23]. Staffs
who have a forceful sense of identity with the organization are more likely to show a
supportive attitude towards the organization, have a stronger sense of belonging to the
organization, and tend to take actions that are helpful to the organization [24]. Therefore,
organizational identity is positively correlated with job performance [25]. Organizational
identity plays an important mediating role between psychological contract and staff’s
role performance [26]. The employee’s sense of the psychological contract fulfillment
can generate trust, loyalty, and a sense of identity, motivating employees’ organizational
citizenship behavior [27].
H3: Organizational identification is positively correlated with the task performance
of temporary workers in the sharing economy.
H4a: In the sharing economy, the identification of temporary labor organizations
plays an important mediating role in the relationship between perceived transactional
psychological contract fulfillment and task performance.
H4b: Shared temporary labor organization identity plays a significant mediating role
in the relationship between perceived relational psychological contract fulfillment and
task performance.

2.4 The Moderating Effects of Length of Service

Length of service means the number of years staff has worked for a specific organization
[28]. Norris and Niebuhr (1984) believe that length of service regulates the relationship
between job satisfaction and job performance [29]. Wright and Bonett (2002) use meta-
analysis to reveal the moderating effect of working years on organizational commitment
and job performance [30]. Compared with short-term employees, long-term staff have a more stable attitude towards the enterprise and are less influenced by the fulfillment of the psychological contract. Acceptance and recognition by organizations are the most
concerning issues for new employees. The improvement of salary and performance is the
most important early signal of new employees’ perception and acceptance. The above
research shows that transactional and relational motivations can increase the organiza-
tional identity of new employees. In shared employment, staff have a weaker sense
of identity with the organization. With the increase of working years, employees will
adapt to the values of the organization. The relationship between employees’ manner,
perception and action will change with their length of service. As a result, this research
believes that psychological contract fulfillment on the organizational identity of shared
temporary workers with short working years is more significant than employees with
long working years.
H5a: Transactional psychological contract fulfillment has a stronger influence on
the organizational identity of temporary workers in the sharing economy with shorter
service than on employees with longer service.
H5b: Relational psychological contract fulfillment has a stronger influence on the
organizational identity of shared temporary workers with shorter service than employees
with longer service.
Bal et al. (2013) point out that length of service can moderate the relationship
between psychological contract and work outcome [8]. The behavior of employees with
short service is mainly driven by economic exchange, while the action of employees with
long service is mainly driven by emotional factors [30]. New employees’ investment in
work is mainly based on the compensation they receive from the organization. In contrast,
older employees have a weaker perception of the connection between effort and their
compensation [31]. When employees devote themselves to work and are willing to stay in
the organization over time, these employees will be more aware of their responsibilities
than those who do not want to stay, and are more likely to contribute to the organization’s
future. The performance of employees with long working years is related to the relational
psychological contract fulfillment they perceive [32]. Therefore, this research considers
that the relationship between psychological contract fulfillment and task performance is
moderated by working years.
H6a: Transactional psychological contract fulfillment has a stronger impact on the
task performance of temporary workers in the sharing economy with shorter service than
on of employees with longer service.
H6b: Relational psychological contract fulfillment has a stronger impact on the task
performance of temporary workers in the sharing economy with shorter service than on
that of employees with longer service.

3 Research Design
Based on the theoretical review, the moderated mediation model of the psychological contract fulfillment perceived by temporary workers in the sharing economy and their task performance is shown in Fig. 1. The research adopts a questionnaire survey as the major approach. The survey objects and samples are selected in combination with the actual situation in China, an appropriate questionnaire scale is chosen, and, after modification and improvement, the formal questionnaire is formed.

Fig. 1. Research model.

3.1 Data Sources


The object of this research is temporary workers hired through the sharing platform.
Therefore, temporary workers from a human resources company in Shaanxi Province
were selected for investigation. The samples involve retail, transportation, commercial
service, accommodation, catering, and other industries. A total of 650 questionnaires
were distributed in this survey. After invalid questionnaires were excluded, 572 valid questionnaires were recovered, giving an effective response rate of 88%. The composition of the valid samples is shown in Table 1.

3.2 Variables Measured


This research mainly involves four variables: transactional psychological contract fulfill-
ment (TCF), relational psychological contract fulfillment (RCF), organizational identity
(OI), and task performance (TP), all of which employ widely used and mature scales.
Discussed with management experts before issuing the questionnaire, the formal ques-
tionnaire was finally determined with the help of relevant professionals to eliminate ambi-
guities in understanding. Considering the particularity of shared employment, this survey
emphasizes that the organization in the questionnaire is the employer that the employees
serve. The questionnaire adopts Likert’s five-point scoring method, with figures from 1
to 5 respectively representing “completely disagree” to “completely agree”.

Table 1. Effective sample composition.

Project Classification Number of people Proportion (%)
Gender Male 330 57.7
Female 242 42.3
Age ≤25 years old 331 57.9
26–35 years old 188 32.9
36–45 years old 13 2.3
>45 years old 40 6.9
Education High school and 41 7.2
below
University and 531 92.8
above
Length of ≤1 year 266 46.5
service 1–2 years 221 38.6
2–3 years 45 7.9
>3 years 40 7.0

4 Analysis and Results


With sampling and data screening completed, reliability and validity analyses are performed to test reliability and representativeness, and confirmatory factor analysis is used to check for common method bias, supporting the subsequent hypothesis tests and the construction of the structural equation model.

4.1 Reliability and Validity Test

The reliability test uses Cronbach’s α coefficient, and the results show that the reliability
coefficients of the four scales are 0.881, 0.851, 0.887, 0.816, and the overall questionnaire
reliability is 0.889. All these figures are greater than the high-reliability standard of 0.7,
indicating that the internal consistency of the questionnaire is good.
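For reference, Cronbach's α can be computed directly from item scores; a minimal Python sketch assuming a respondents-by-items array (an illustration, not the authors' SPSS/AMOS workflow):

import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of items in the scale
    item_var = items.var(axis=0, ddof=1).sum()   # sum of per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scale score
    return k / (k - 1) * (1 - item_var / total_var)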
Since this study uses mature scales from domestic and foreign research, content validity is good. Construct validity is tested through convergent validity and discriminant validity [33]. According to Table 2, all factor loadings in the measurement model exceed 0.5, the AVE (average variance extracted) values exceed 0.5, and the composite reliabilities are greater than 0.7, indicating good convergent validity.

Table 2. Convergent validity analysis.

Factor Factor loading AVE Composite reliability
TCF 0.795–0.831 0.654 0.883
RCF 0.710–0.857 0.593 0.853
OI 0.699–0.808 0.568 0.887
TP 0.642–0.854 0.620 0.828

This paper compares the average variance extracted with the squared correlation coefficients to judge discriminant validity. Table 3 shows that each AVE square root is higher than the correlations between the target variable and the other variables, so the variables have good discriminant validity.

Table 3. Discriminant validity analysis.

Variable Average SD TCF RCF OI TP


TCF 3.524 0.761 0.654
RCF 3.822 0.737 0.001 0.593
OI 3.864 0.690 0.522*** 0.303*** 0.568
TP 3.710 0.716 0.395*** 0.633*** 0.547*** 0.620
AVE square root 0.809 0.770 0.754 0.787
Note: *** means p < 0.001 (two-tailed test)

4.2 Common Method Deviation


Since this study uses the self-report method to collect questionnaire data, common method bias may exist in the survey. This study uses Harman's single-factor test. The result indicates that the variance explained by the first factor before rotation is 37.066%, which is less than the 40% criterion. Therefore, the common method bias problem in this study is not serious.
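Harman's single-factor test asks how much variance one unrotated factor explains; a common approximation uses the first unrotated principal component, as in this Python sketch (the data array is a placeholder, not the survey data):

import numpy as np

def harman_first_factor_ratio(data):
    """Share of total variance explained by the first unrotated component of the
    standardized items; values below about 0.40 suggest no serious common method bias."""
    X = np.asarray(data, dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)       # standardize items
    eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))  # correlation-matrix eigenvalues
    return eigvals.max() / eigvals.sum()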

4.3 Hypothesis Test


Path Test of Direct Effects. Structural equation model analysis is carried out in AMOS 25.0 to verify the hypotheses proposed in this research. The results show that the model fits well (χ2/df = 1.885, GFI = 0.916, IFI = 0.962, TLI = 0.953, CFI = 0.961, RMR = 0.039, RMSEA = 0.056; the fit index values are consistent with the measurement model). Figure 2 shows the path coefficients between variables in the model.

The path regression coefficients of transactional and relational psychological contract fulfillment on employee task performance are 0.562 (p < 0.001) and 0.273 (p < 0.001), respectively, so hypotheses H1a and H1b are verified. The regression coefficients of organizational identity on transactional psychological contract fulfillment and relational contract fulfillment are 0.302 (p < 0.001) and 0.522 (p < 0.001), respectively; both dimensions of psychological contract fulfillment have a significant positive correlation with organizational identity, so H2a and H2b are verified. The regression coefficient between organizational identification and task performance is 0.234 (p < 0.01), and hypothesis H3 is verified.

Fig. 2. Hypothesis testing model path.

Path Test of Mediation Effect. In this study, the bootstrap method is used to analyze the mediation effect, with the number of bootstrap samples set to 2000. As shown in Table 4,
the test findings indicate that the confidence interval corresponding to each path of the
test does not contain 0, p < 0.01. The mediating effects are statistically significant. Com-
pared with the direct effect (β = 0.562, p < 0.01), transactional psychological contract
fulfillment has a relatively weaker influence on task performance through organizational
identification, but the indirect effect is significant (β = 0.071, p < 0.01), which verifies hypothesis H4a. Relational psychological contract fulfillment has a significantly indi-
rect influence on task performance through organizational identification (β = 0.122, p
< 0.01), which supports hypothesis H4b.

Table 4. Mediating effect test.

Effect Path Effect value SE 95% CI lower 95% CI upper P
Total effect TCF → TP 0.633 0.050 0.532 0.727 0.001
Direct effect TCF → TP 0.562 0.059 0.445 0.681 0.001
Indirect effect TCF → OI → TP 0.071 0.027 0.023 0.128 0.005
Total effect RCF → TP 0.395 0.059 0.274 0.510 0.001
Direct effect RCF → TP 0.273 0.073 0.138 0.421 0.001
Indirect effect RCF → OI → TP 0.122 0.045 0.04 0.215 0.006
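The bootstrap logic behind Table 4 can be sketched as follows: resample respondents with replacement, re-estimate the indirect effect a·b each time, and read off a percentile confidence interval. The OLS-based version below is an illustration under assumptions (simple regressions instead of the AMOS structural model; variable names are placeholders):

import numpy as np

def boot_indirect(x, m, y, n_boot=2000, seed=0):
    """Percentile bootstrap CI for the indirect effect a*b, where a is the slope of
    mediator m on x and b is the slope of y on m controlling for x."""
    rng = np.random.default_rng(seed)
    n, est = len(x), []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                    # resample with replacement
        xb, mb, yb = x[idx], m[idx], y[idx]
        a = np.polyfit(xb, mb, 1)[0]                   # m = a*x + const
        design = np.column_stack([np.ones(n), xb, mb])
        b = np.linalg.lstsq(design, yb, rcond=None)[0][2]   # y = ... + b*m
        est.append(a * b)
    lo, hi = np.percentile(est, [2.5, 97.5])
    return float(np.mean(est)), (lo, hi)               # CI excluding 0 => significant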

Moderating Effect Test. In order to further explore the moderating effect of length of
service, this study uses Model 59 in SPSS PROCESS 3.3 to test the moderating effect
[34]. This research divides the length of service sample into a high service years group (2–
3 years, >3 years) and a low service years group (≤1 year, 1–2 years). The research results
show that length of service plays a negative moderating role between transactional psychological contract fulfillment and organizational identification (B = -0.169, p < 0.01). The slopes in Fig. 3a show that transactional psychological contract fulfillment has a greater impact on the organizational identification of temporary workers with low working years (simple slope = 0.611, p < 0.001) than on that of temporary workers with high working years (simple slope = 0.272, p < 0.001); therefore, hypothesis 5a is supported.
However, the moderating effect of length of service in relational psychological contract
fulfillment and organizational identification is non-significant (B = -0.024, p = 0.680),
as shown in Fig. 3b, and hypothesis 5b is not valid. Figure 3c shows that the moderating
effect of working years between transactional psychological contract fulfillment and
task performance is non-significant (B = 0.019, p = 0.738), and hypothesis 6a has not
been verified. Length of service has a positive moderating effect on the fulfillment of
relational contract and task performance (B = 0.139, P < 0.01). Relational psychological
contract fulfillment perceived by temporary workers with high working years (simple
slope = 0.592, p < 0.001) has more influence on task performance than that perceived
by temporary workers with low working years (simple slope = 0.312, p < 0.001) does.
As a result, hypothesis 6b is supported.

Fig. 3. The moderation effect of length of service.
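The simple-slope comparison reported in Fig. 3 follows the usual interaction-regression logic: regress the outcome on the mean-centered predictor, a tenure-group dummy, and their product, then evaluate the predictor's slope within each group. A hedged Python sketch with illustrative variable names (not the SPSS PROCESS Model 59 estimation itself):

import numpy as np

def simple_slopes(x, w, y):
    """Slope of x within each tenure group, from y ~ x + w + x*w with
    w a 0/1 dummy (0 = shorter service, 1 = longer service)."""
    xc = x - x.mean()                                  # mean-center the predictor
    design = np.column_stack([np.ones(len(x)), xc, w, xc * w])
    beta = np.linalg.lstsq(design, y, rcond=None)[0]
    return beta[1], beta[1] + beta[3]                  # simple slopes for w=0 and w=1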

5 Discussion
The employment relationship in the sharing economy is flexible. There is no clear labor
relationship between employees and employers. Compared with other organizational

behaviors, individuals tend to focus more on their task performance. Most temporary
workers are young people who have no economic savings for the time being, and they
are in urgent need of stable income to meet their own living needs. Therefore, in the short
term, employees’ work motivation largely comes from obtaining labor remuneration and
they are more sensitive to economic exchange. This study finds that both transactional and relational contract fulfillment impact task performance, while transactional contract fulfillment has a greater direct influence on task performance. This
conclusion demonstrates how different dimensions of psychological contracts have an
impact on task performance. The more employees are satisfied with the organization’s
income and profit distribution, the better their task performance will be.
The research results indicate that the fulfillment of relational psychological contracts
strongly influences task performance through organizational identification. China is a relationship-oriented society. Out of the need for a sense of belonging, employees seek organizational
identity through perceived relational psychological contract fulfillment, thereby improv-
ing task performance. In shared employment, it’s a partnership between the employer
and the employee. The employability of employees is the basis for cooperation between
the two parties, which weakens the attention to attitudes and behaviors, and it is diffi-
cult to establish organizational identity [35]. Therefore, establishing temporary workers’
organizational identity tends to be a relational contract of social exchange.
Most employees with shorter service have just graduated and have no stable income.
The fulfillment of transactional psychological contracts, such as competitive income,
is regarded by these employees as a signal to be accepted by the organization, mak-
ing it easier for them to perceive the close connection with the employer and integrate
the organizational identity and personal identity. Therefore, the fulfillment of a transac-
tional psychological contract has more influence on the organizational identity of short-
term employees than that of long-term employees. At the same time, the fulfillment of
the relational psychological contract has a stronger influence on the task performance
of long-term staff. Relational psychological contracts include intangible rewards and
exchanges and require a long period of time to establish and develop. The performance
and contribution of older employees are driven by the development of their relation-
ship with the organization. Temporary workers with long service time can better feel
the stable relationship with organizations. Their performance is not only driven by the
economy, but also by emotion.

6 Conclusion
The rapid development of the sharing economy and the emergence of new employment
methods are different from traditional forms of employment. New types of employment
bring higher flexibility, temporary workers’ psychological contracts have higher insta-
bility, and the mutual benefit between enterprises and employees is affected. Therefore,
maintaining and balancing the psychological contract of temporary workers directly
affects their work mentality, work ability, and performance level, and is also related to the
social identity and long-term development of shared talents. This research systematically
discusses the impact mechanism of temporary workers’ perceived psychological contract
fulfillment on task performance. The results show that under the background of shar-
ing economy, employees’ perceived transactional psychological contract fulfillment and

relational contract fulfillment have a significant positive influence on task performance.


Employees’ perceived transactional contract fulfillment has a more direct influence on
task performance than relational psychological contract fulfillment. Organizational iden-
tity plays a mediating role between temporary workers’ perceived psychological contract
fulfillment and task performance. Employees’ perceived relational contract fulfillment
greatly influences task performance through organizational identity than transactional
contract fulfillment does. The influence of transactional contract fulfillment perceived
by temporary workers with shorter service on organizational identity is stronger than
that of temporary workers with longer service. The influence of relational psycholog-
ical contract fulfillment perceived by temporary workers with longer service on task
performance is stronger than that of temporary workers with shorter service.

Acknowledgment. This research was supported by the grants of Shaanxi collaborative innovation
support project No. 2019KRM149 and Shaanxi Provincial Department of Education project No.
20JK0230.

Appendix
The Questionnaire used in this study (Fig. 4).

Fig. 4. The research scale.



References
1. National Development and Reform Commission. https://www.ndrc.gov.cn/xxgk/jd/wsdwhfz/
202102/t20210222_1267536.html?code=&state=123 (2021)
2. Robinson, S.L.: Trust and breach of the psychological contract. Adm. Sci. Q. 41(4), 574–599
(1996)
3. Collins, M.D.: The effect of psychological contract fulfillment on manager turnover intentions
and its role as a mediator in a casual, limited-service restaurant environment. Int. J. Hosp.
Manag. 29(4), 736–742 (2010)
4. Wang, Y.S., Chen, H.M.: Research on sharing economic platform based on bilateral market
theory. Modernization of Management 2, 48–50 (2017)
5. Ashforth, B.E., Mael, F.: Social identity theory and the organization. Acad. Manag. Rev.
14(1), 20–39 (1989)
6. Epitropaki, O.: A multi-level investigation of psychological contract breach and organizational
identification through the lens of perceived organizational membership: testing a moderated–
mediated model. J. Organ. Behav. 34(1), 65–86 (2013)
7. Tufan, P., Wendt, H.: Organizational identification as a mediator for the effects of psycholog-
ical contract breaches on organizational citizenship behavior: insights from the perspective
of ethnic minority employees. Eur. Manag. J. 38(1), 179–190 (2020)
8. Bal, P.M., De Cooman, R., Mol, S.T.: Dynamics of psychological contracts with work engage-
ment and turnover intention: the influence of organizational tenure. Eur. J. Work. Organ. Psy.
22(1), 107–122 (2013)
9. Cheng, X.R., Li, P.B., Liang, H.: A study of human resource management patterns in sharing
economy——taking airbnb as an example. Human Resources Development of China 6, 20–25
(2016)
10. Rousseau, D.M., Tuoriwala, S.: Assessing psychological contracts: issues, alternatives and
measures. J. Organ. Behav. 19(51), 679–695 (1998)
11. Soares, M.E., Mosquera, P.: Fostering work engagement: the role of the psychological
contract. J. Bus. Res. 101, 469–476 (2019)
12. Van Hootegem, A., De Witte, H.: Qualitative job insecurity and informal learning: a longitu-
dinal test of occupational self-efficacy and psychological contract breach as mediators. Int. J.
Environ. Res. Public Health 16(10), 1847 (2019)
13. Rousseau, D.M.: Psychological contracts in organizations: understanding written and
unwritten agreements. Sage, Thousand Oaks, CA, USA (1995)
14. Dabos, G., Rousseau, D.M.: Mutuality and reciprocity: psychological contracts in research
teams. J. Appl. Psychol. 89(1), 52–72 (2004)
15. Wayne, S., Shore, L.M., Liden, R.C.: Perceived organizational support and leader-member
exchange: a social exchange perspective. Acad. Manag. J. 40(1), 82–111 (1997)
16. Borman, W.C., Motowidlo, S.J.: Expanding the criterion domain to include elements
of contextual performance. In: Schmitt, N., Borman, W.C. (eds.) Personnel selection in
organizations, pp. 71–98. Wiley, New York (1993)
17. Chiaburu, D.S., Oh, I., Wang, J., Stoverink, A.C.: A bigger piece of the pie: the relative
importance of affiliative and change-oriented citizenship and task performance in predicting
overall job performance. Hum. Resour. Manag. Rev. 27(1), 97–107 (2017)
18. Turnley, W.H., Bolino, M.C., Lester, S.W., Bloodgood, J.M.: The impact of psychological
contract fulfillment on the performance of in-role and organizational citizenship behaviors.
J. Manag. 29(2), 187–206 (2003)
19. Wu, C., Chen, T.: Psychological contract fulfillment in the hotel workplace: empowering
leadership, knowledge exchange, and service performance. Int. J. Hosp. Manag. 48, 27–38
(2015)

20. Long, L.R., Yi, M., Zhang, Y.: The impact of transactional and relational psychological
contract on employees task and contextual performance: the moderating roles of pay for
performance and perceived supervisor support. Forecasting 34(1), 8–14 (2015)
21. Rodwell, J., Ellershaw, J., Flower, R.: Fulfill psychological contract promises to manage
in-demand employees. Pers. Rev. 44(5), 689–701 (2015)
22. Yu, X.D., Liu, R., Chen, H.: Human resource policies in ‘internet +’——a case study based
on Di Di. Human Resources Development of China 6, 6–11 (2016)
23. Rentao, M., Bing, W.: Does not psychological contract breach contribute to organizational
identification? Journal of Capital University of Economics and Business 18(04), 50–57 (2016)
24. Epitropaki, O.: A multi-level investigation of psychological contract breach and organizational
identification through the lens of perceived organizational membership: testing a moderated-
mediated model: psychological contract breach and organizational identification. J. Organ.
Behav. 34(1), 65–86 (2012). https://doi.org/10.1002/job.1793
25. Hong, J., Bin, L.: Relationship between organizational identification and performance of
college teachers. Res. Econ. Manag. 36(12), 75–81 (2015)
26. Liu, W., He, C., Jiang, Y., Ji, R., Zhai, X.: Effect of Gig workers’ psychological contract
fulfillment on their task performance in a sharing economy—a perspective from the mediation
of organizational identification and the moderation of length of service. Int. J. Environ. Res.
Public Health 17(7), 2208 (2020)
27. Christ, O., van Dick, R., Wagner, U., Stellmacher, J.: When teachers go the extra mile: foci of
organizational identification as determinants of different forms of organizational citizenship
behavior among schoolteachers. Br. J. Educ. Psychol. 73(3), 329–341 (2003)
28. Ng, T.W.H., Feldman, D.C.: Does longer job tenure help or hinder job performance? J. Vocat.
Behav. 83(3), 305–314 (2013)
29. Norris, D.R., Niebuhr, R.E.: Organizational tenure as a moderator of the job satisfaction-job
performance relationship. J. Vocat. Behav. 24(2), 169–178 (1984)
30. Wright, T.A., Bonett, D.G.: the moderating effects of employee tenure on the relation between
organizational commitment and job performance: a meta-analysis. J. Appl. Psychol. 87(6),
1183–1190 (2002)
31. Conway, N., Coyle-Shapiro, J.A.: The reciprocal relationship between psychological contract
fulfilment and employee performance and the moderating role of perceived organizational
support and tenure. J. Occup. Organ. Psychol. 85(2), 277–299 (2012)
32. Rousseau, D.M.: New hire perceptions of their own and their employer’s obligations: a study
of psychological contracts. J. Organ. Behav. 11(5), 389–400 (1990)
33. Zhao, F.Q., Huang, H.Y., Chen, Y., Zhang, Q.H.: Effect of work-family balance human
resource practice on job performance: mediation of work-family relationship and moder-
ation of psychological capital. Human Resources Development of China 35(11), 124–140
(2017)
34. Hayes, A.F.: Introduction to mediation, moderation, and conditional process analysis: a
regression-based approach. The Guilford Press, New York (2013)
35. Ling, L., Gu, Y.H.: The relationship between employability and employment stability in share
economy: contradiction and balance. Human Resources Development of China 23, 10–15
(2015)
Research on the Influence of the Investment
Facilitation Level of the Host Country
on China’s OFDI——From the Perspective
of Investment Motivation

Chaoqun Niu(B) , Shuying Lei, and Chengwen Kang

Harbin University of Commerce, Harbin 150028, China

Abstract. Investment facilitation is an important guarantee for the rapid development of investment activities. In this paper, a relatively complete investment facilitation evaluation system is constructed. The investment facilitation level of 101 countries and regions globally is measured by the principal component analysis method. The feasible generalized least squares (FGLS) method is then used to empirically test the impact of the host country's investment facilitation level on China's OFDI. The results show significant differences in the level of investment facilitation among regions, with great room for improvement, and that China tends to invest in countries with a higher investment facilitation comprehensive index. Under different investment motives, market-seeking, efficiency-seeking, and strategic asset-seeking foreign direct investment tend to favor host countries with a higher comprehensive index of investment facilitation, whereas resource-seeking foreign direct investment tends to favor host countries with a poorer comprehensive index.

Keywords: Investment facilitation · Outward foreign direct investment · Investment motivation

1 Introduction and Literature Review

Investment facilitation refers to a series of actions or measures taken by the government throughout the investment stage to maximize the effectiveness and efficiency of enterprise management and to create a harmonious, transparent, stable, and efficient investment environment. Investment facilitation plays an important role in reducing transaction costs and improving investment efficiency, and international organizations such as APEC and the WTO regard it as important content. With the improvement of the global investment environment and China's continuously strengthening attention to investment facilitation, China's foreign direct investment has achieved remarkable results. According to the 2020 World Investment Report, in 2019 China's outward foreign direct investment flow was US$136.91 billion, and its outward foreign direct investment stock was US$2,198.88 billion, ranking second and third in the world, respectively. However, what is the level of investment facilitation in countries around the world? Can the improvement of investment facilitation levels


in various countries promote increases in China's investment in them? Do different investment modes have different preferences for the host country's investment facilitation level? Given these questions, this paper constructs a relatively complete evaluation system of investment facilitation and, by measuring the investment facilitation level of host countries, further explores the impact of that level on China's OFDI, which has important theoretical and practical significance for the high-quality development of China's foreign investment.
So far, research on investment facilitation and foreign direct investment has mainly focused on the following aspects. First, the construction of investment facilitation index systems has mainly drawn on the research framework of Wilson et al. (2003) [1], adjusting the evaluation system according to different research purposes. Zhang Yabin (2016) [2] used the mean principal component analysis method, selecting infrastructure, commercial investment, information technology, financial services, and institutional supply to measure the degree of investment facilitation. Qiao Minjian (2019) [3], Xietian Ziguang and Fan Xiufeng (2019) [4], and Liu Yonghui and Zhao Xiaohui (2021) [5] also used principal component analysis to measure the investment facilitation level of target countries in terms of government behavior, infrastructure, information technology, and so on.
Second, on the impact of investment facilitation on direct investment, representative views mainly include the following. Zhang Yabin (2016) [2], Liu Yonghui and Zhao Xiaohui (2021) [5], and others believe that improving the host country's investment facilitation level can significantly promote the increase of China's foreign investment stock. Qiao Minjian (2019) [3] introduced investment facilitation and its square term into the model for an empirical test, finding an "inverted U-shaped" relationship between the host country's investment facilitation level and China's OFDI. Xietian Ziguang and Fan Xiufeng (2019) [4], using a panel threshold model, found a structural mutation point (a single threshold) in the impact of the host country's investment facilitation level on China's OFDI.
To sum up, although the research on investment facilitation is relatively rich, few scholars have empirically tested the impact of countries' investment facilitation levels on China's OFDI from the perspective of investment motivation. Given the available data, this paper selects 101 countries around the world as the research object, including as many countries (regions) as possible in the sample, constructs the investment facilitation index system and its related models, and introduces interaction terms between investment facilitation and different investment motives to explore whether there is a selective preference for the investment facilitation level of the host country.

2 Construction, Measurement and Result Analysis of Investment Facilitation Evaluation System

2.1 Construction of Evaluation System

Based on the research of Wilson et al. (2003) [1], and combining the characteristics of international direct investment with the relevant provisions of the Investment Facilitation Action Plan, this article takes five aspects, namely government behavior, infrastructure, financial services, market environment, and the labor market, as the first-level indicators.

Table 1. Investment facilitation evaluation system

First-Level Indicators | Second-Level Indicators | Attributes | Ranges
Government action | Judicial independence (G1) | + | 1–7
 | Efficiency of legal regulations (G2) | + | 1–7
 | Government burden management (G3) | + | 1–7
 | Efficiency of legal framework in settling disputes (G4) | + | 1–7
 | Intellectual property protection (G5) | + | 1–7
Infrastructure | Quality of road infrastructure (I1) | + | 1–7
 | Quality of railway infrastructure (I2) | + | 1–7
 | Quality of air transport infrastructure (I3) | + | 1–7
 | Quality of maritime infrastructure (I4) | + | 1–7
 | Mobile-cellular telephone subscriptions (I5) | + | —
 | Mobile-broadband subscriptions (I6) | + | —
Financial services | Risk capital availability (F1) | + | 1–7
 | Bank stability (F2) | + | 1–7
Market environment | Organized crime (M1) | + | 1–7
 | Reliability of police services (M2) | + | 1–7
 | Extent of market dominance (M3) | + | 1–7
 | Trade tariffs (M4) | − | —
 | Buyer maturity (M5) | + | 1–7
 | Industrial cluster (M6) | + | 1–7
Labor market | Staff training level (L1) | + | 1–7
 | Redundancy costs (L2) | − | —
 | Hiring and firing practices (L3) | + | 1–7
 | Cooperation in labor-employer relations (L4) | + | 1–7
 | Flexibility of wage (L5) | + | 1–7
 | Reliance on professional management (L6) | + | 1–7
 | Pay and productivity (L7) | + | 1–7

Note: "—" means the value range is uncertain; among the values 1–7, 7 means the best.

At the same time, in order to measure the host country's investment facilitation level as systematically and comprehensively as possible, these first-level indicators are further subdivided into 26 second-level indicators (see Table 1 for details). The relevant data come from the Global Competitiveness Report (GCR) from 2010 to 2019.

2.2 Measurement and Result Analysis of Investment Facilitation Level

This paper uses SPSS 25.0 software to carry out the principal component analysis. First, as can be seen from Table 1, the index system contains both positive and negative indicators, and a few indicators have uncertain value ranges rather than scores between 1 and 7. Therefore, to make the data comparable, this paper takes the reciprocal of all negative indicators so that they act in the same direction as the positive indicators, and then uses the linear transformation method1 to standardize all second-level indicators so that their values lie between 0 and 1. Second, KMO and Bartlett's sphericity tests are carried out; the results show a strong correlation between the selected second-level indicators. Last, according to the principle that the eigenvalue is larger than 1, the principal components are extracted from the 26 second-level indicators. Due to layout restrictions, only the investment facilitation levels of some important countries and regions are listed, as shown in Table 2 and Table 3.
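To make the computation concrete, the following Python sketch mirrors the steps just described on a hypothetical (countries × 26 indicators) matrix. The file name, the positions of the reverse-scored columns, and the variance-weighted aggregation of the retained component scores are all illustrative assumptions: the paper does not state how the component scores are combined into the comprehensive index.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data file: one row per country-year, 26 indicator columns
raw = np.loadtxt("indicators.csv", delimiter=",")
negative = [16, 20]                        # illustrative positions of "-" indicators
raw[:, negative] = 1.0 / raw[:, negative]  # reciprocal, so all items point the same way
X = raw / raw.max(axis=0)                  # linear transformation Y_i = X_i / X_max

pca = PCA()
scores = pca.fit_transform(X)
keep = pca.explained_variance_ > 1         # retain components with eigenvalue > 1
weights = pca.explained_variance_ratio_[keep]
index = scores[:, keep] @ (weights / weights.sum())  # variance-weighted composite
print(index)                               # one comprehensive score per row
```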

Table 2. Summary of comprehensive investment facilitation index of major countries

Country 2010 Rank 2015 Rank 2019 Rank Average Rank


Singapore 0.90 1 0.91 1 0.91 1 0.91 1
Switzerland 0.86 2 0.88 2 0.90 2 0.88 2
Finland 0.84 4 0.84 4 0.86 4 0.84 3
Holland 0.81 7 0.83 6 0.87 3 0.84 4
Luxembourg 0.82 6 0.83 7 0.83 8 0.83 5
Zimbabwe 0.47 89 0.46 99 0.49 99 0.48 97
Nepal 0.41 100 0.48 97 0.52 93 0.47 98
Burundi 0.41 101 0.43 100 0.58 75 0.47 99
Mozambique 0.46 93 0.48 98 0.45 100 0.46 100
Mauritania 0.44 98 0.41 101 0.44 101 0.43 101

According to Table 2 and Table 3, Singapore, in Asia, has ranked first in investment facilitation for many consecutive years.
1 The formula is expressed as $Y_i = X_i / X_{max}$, where $Y_i$ represents the new value after standardization, $X_i$ represents the index data obtained after reciprocal treatment, and $X_{max}$ is the maximum value attainable for the index.

Table 3. Trends of the comprehensive investment facilitation index classified by region

Region 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019
Oceania 0.76 0.76 0.76 0.75 0.76 0.77 0.77 0.78 0.76 0.77
Africa 0.52 0.53 0.54 0.55 0.56 0.55 0.55 0.55 0.55 0.57
North America 0.63 0.63 0.65 0.67 0.68 0.67 0.67 0.69 0.67 0.68
South America 0.53 0.54 0.55 0.56 0.56 0.55 0.55 0.56 0.54 0.57
Europe 0.68 0.68 0.69 0.67 0.68 0.69 0.70 0.70 0.69 0.71
Asia 0.60 0.62 0.63 0.64 0.64 0.64 0.64 0.65 0.65 0.67

Second, the countries with a high degree of investment facilitation are Switzerland, Finland, the Netherlands, and Luxembourg. On the contrary, the last five countries are Nepal, Zimbabwe, Burundi, Mozambique, and Mauritania. It can be seen that the investment facilitation level is closely related to a country's economic development level. From the perspective of regional trends, the level of investment facilitation is generally on the rise across regions from 2010 to 2019. Among them, the improvement in Asia is the most significant, from 0.60 in 2010 to 0.67 in 2019, followed by North America, Africa, South America, Europe, and Oceania. It can be seen that there are significant differences in the level of investment facilitation among regions, and there is great room for improvement.

3 The Impact of Investment Facilitation on China's Outward Foreign Direct Investment
3.1 Variable Selection
Explained variable: China's outward foreign direct investment (OFDI). This paper selects China's OFDI stock data for analysis. The data come from the Statistical Bulletin of China's Outward Foreign Direct Investment.
Core explanatory variable: the investment facilitation comprehensive index (IFI), calculated as in the previous section.
Explanatory variables measuring investment motivation: the market scale of the host country (GDP), natural resource endowments (Resources), labor productivity level (Labor-Productivity), and strategic asset level (Technology). This article uses the four types of motivation proposed by Dunning (1993) [6]. The host country's GDP is used to measure Market-Seeking Motivation; the percentage of the host country's fuel, ore, and metal exports in total commodity exports is used to measure Resource-Seeking Motivation; the labor productivity level of the host country2 is applied to measure Efficiency-Seeking Motivation; and the percentage of high-tech exports in manufactured exports is used to measure Strategic Asset-Seeking Motivation. The above data are all from the World Bank database.
2 The formula is expressed as LP = GDP/L, where GDP is measured in constant 2010 dollars and L is the number of employees.

Control variables: trade openness (LnTra) and bilateral distance (LnDis). A country's trade openness is measured by the percentage of its imports and exports of goods and services in its GDP, with data from the World Bank database; the bilateral geographic distance is measured by the distance between Beijing and the capital of each host country, with data from the CEPII database. In addition, logarithms are taken for all variables to reduce the impact of drastic changes and heteroscedasticity among variables.

3.2 Model Construction

Based on the studies of Zhou Chao, Liu Xia, and Guo Zhuan (2017) [7] and Wang Zhengxin and Zhou Qian (2019) [8], this paper explores whether China's different modes of OFDI have a selective preference for the investment facilitation level of host countries. The specific model is:

$$LnOFDI_{it} = \beta_0 + \beta_1 LnIFI_{it} + \beta_2 Mot_{it} \times LnIFI_{it} + \gamma Control_{it} + \delta_t + \varepsilon_{it} \quad (1)$$

In the above model, $\beta_0$ represents the constant term; $\beta_1$, $\beta_2$, and $\gamma$ are the parameters to be estimated; $\delta_t$ is the time fixed effect; $\varepsilon_{it}$ is the residual term; $i$ represents the host country and $t$ the year; $LnOFDI_{it}$ is the stock of China's outward foreign direct investment; $LnIFI_{it}$ is the comprehensive index of the host country's investment facilitation; $Control_{it}$ denotes the control variables; $Mot_{it}$ represents the specific investment motive; and $Mot_{it} \times LnIFI_{it}$ is the interaction term between the investment motive and the host country's investment facilitation comprehensive index.

4 Empirical Test and Result Analysis


4.1 The Impact of Investment Facilitation on China's OFDI
This paper explores the impact of the host country's investment facilitation level on China's OFDI by introducing the comprehensive index and its interaction with investment motivation. Because geographical distance does not change with time, it cannot be identified under fixed effects, so we first follow the method of Buckley et al. (2007) [9] to perform mixed-effects and random-effects regressions on the data, and then select the random-effects model according to the LM test results. Next, heteroscedasticity and autocorrelation tests are carried out; the results show that both heteroscedasticity and autocorrelation are present in the data. Therefore, to eliminate their influence, this paper uses the feasible generalized least squares (FGLS) method to modify the model. See Table 4 for specific test results.
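As a rough illustration of this estimation step, the sketch below runs a common two-step FGLS under groupwise (per-country) heteroscedasticity on a hypothetical panel file. The paper does not spell out its exact FGLS weighting, so the file name, column names, and weighting scheme here are assumptions.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical country-year panel holding the logged variables of model (1)
df = pd.read_csv("ofdi_panel.csv")
X = sm.add_constant(df[["LnIFI", "LnGDP", "LnRes", "LnLP", "LnTec", "LnTra", "LnDis"]])
y = df["LnOFDI"]

ols = sm.OLS(y, X).fit()                       # step 1: pooled OLS residuals
sigma2 = ols.resid.pow(2).groupby(df["country"]).transform("mean")
fgls = sm.WLS(y, X, weights=1.0 / sigma2).fit()  # step 2: reweight by 1/variance
print(fgls.summary())
```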
From the perspective of the core explanatory variable, the coefficient of LnIFI is significantly positive at the 1% level, indicating that improving the host country's investment facilitation level can effectively promote the increase of China's transnational capital stock [2, 4, 5].
From the perspective of the variables measuring investment motivation, the coefficients of the host country's market scale (LnGDP) and natural resource endowment (LnRes) are both significantly positive at the 1% or 5% level, indicating that China's OFDI exhibits obvious market-seeking and resource-seeking motivations. The market-scale coefficients are greater than the natural-resource-endowment coefficients, which to a certain extent shows that, between the two, China's OFDI is more inclined toward the host country's market scale. The labor productivity (LnLP) coefficients are all significantly negative at the 1% level, indicating that Chinese companies prefer to invest in countries with low productivity and abundant labor. A possible explanation is that China's OFDI flows mainly to low-tech, low-value-added industries and pays more attention to the demographic dividend than to the labor productivity of host countries. The strategic asset level (LnTec) coefficient is significantly negative only at the 10% level, indicating that the impact of the host country's strategic asset level on China's OFDI is still uncertain. This may be because the strategic asset-seeking motivation of China's foreign direct investment is not obvious, or because China's strategic asset-seeking investment in developed countries is smaller than its market-seeking and resource-seeking investment in other countries.
From the perspective of the interaction terms, the coefficient of LnIFI*LnGDP in column (2) is significantly positive, indicating that as the host country's investment facilitation level improves, its market scale becomes more attractive to China's OFDI; in other words, China's market-seeking OFDI is more inclined toward host countries with higher levels of investment facilitation. The coefficient of LnIFI*LnRes in column (3) is significantly negative, indicating that the worse the host country's investment facilitation level, the stronger the attractiveness of its natural resources to China's OFDI; that is, China's resource-seeking investment is more inclined toward host countries with lower investment facilitation levels. The coefficient of LnIFI*LnLP in column (4) is significantly positive, showing that China's efficiency-seeking foreign direct investment favors countries with a higher investment facilitation comprehensive index when choosing a location. The coefficient of LnIFI*LnTec in column (5) is significantly positive, indicating that China's strategic asset-seeking investment also tends toward countries with higher investment facilitation levels.
As for the control variables, the trade openness (LnTra) coefficients are significantly positive at the 1% level, indicating that the higher the host country's trade openness, the more conducive it is to the inflow of China's transnational capital. The bilateral distance (LnDis) coefficients are significantly negative at the 1% level, indicating that the farther the bilateral distance, the less conducive it is to capital inflow. This suggests that China's foreign direct investment is more inclined toward Asia, especially East and Southeast Asia.

4.2 Robustness Test


There may be bidirectional causality between the core explanatory variable (the host country's investment facilitation level) and the explained variable (China's OFDI). On the one hand, improvements in host countries' investment facilitation levels attract more Chinese transnational capital; on the other hand, massive capital inflows from China and other countries may also prompt host countries to take more measures to improve their investment facilitation levels.

Table 4. Empirical results of the impact of host country investment facilitation on China’s OFDI

Variables (1) (2) (3) (4) (5)


LnIFI 1.021*** 1.180*** 1.189*** 1.142*** 1.129***
(0.236) (0.227) (0.239) (0.234) (0.239)
LnGDP 0.916*** 0.943*** 0.956*** 0.918*** 0.922***
(0.0317) (0.0315) (0.0322) (0.0348) (0.0309)
LnRes 0.0628*** 0.0716*** 0.0398** 0.0535*** 0.0587***
(0.0171) (0.0157) (0.0197) (0.0162) (0.0165)
LnLP −0.723*** −0.746*** −0.790*** −0.672*** −0.719***
(0.0394) (0.0358) (0.0416) (0.0475) (0.0341)
LnTec −0.0226* −0.0210* −0.0208 −0.0145 −0.0220
(0.0131) (0.0126) (0.0142) (0.0131) (0.0135)
LnTra 0.364*** 0.436*** 0.413*** 0.298*** 0.385***
(0.0673) (0.0610) (0.0670) (0.0759) (0.0581)
LnDis −0.263*** −0.238*** −0.210*** −0.289*** −0.226***
(0.0458) (0.0449) (0.0468) (0.0663) (0.0431)
LnIFI*LnGDP 0.529***
(0.0782)
LnIFI*LnRes −0.231**
(0.102)
LnIFI*LnLP 0.921***
(0.120)
LnIFI*LnTec 0.0917*
(0.0551)
_cons 0.0341 −0.744 −0.752 0.388 −0.358
(0.734) (0.713) (0.736) (0.960) (0.688)
Year Control Control Control Control Control
N 1010 1010 1010 1010 1010
Note: The data in parentheses are robust panel-corrected standard errors. ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively

Therefore, drawing on previous research, this paper takes investment facilitation lagged three periods as an instrumental variable and then uses the generalized method of moments (GMM) to test the robustness of the benchmark model. The specific results are shown in Table 5. Compared with the benchmark regression results in Table 4, the coefficients, directions, and significance of the main explanatory variables in the GMM model do not change substantially, indicating that the conclusions are robust, so we do not repeat them.
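A hedged sketch of this robustness check follows, assuming the same hypothetical panel file as above: LnIFI is treated as endogenous, instrumented by its three-period lag, and the equation is estimated by GMM via the linearmodels package. The lag construction and column names are illustrative assumptions.

```python
import pandas as pd
from linearmodels.iv import IVGMM

df = pd.read_csv("ofdi_panel.csv").sort_values(["country", "year"])
df["LnIFI_lag3"] = df.groupby("country")["LnIFI"].shift(3)   # instrument: 3-period lag
df = df.dropna(subset=["LnIFI_lag3"]).assign(const=1.0)

exog = df[["const", "LnGDP", "LnRes", "LnLP", "LnTec", "LnTra", "LnDis"]]
# IVGMM(dependent, exogenous, endogenous, instruments)
res = IVGMM(df["LnOFDI"], exog, df[["LnIFI"]], df[["LnIFI_lag3"]]).fit()
print(res.summary)
```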

Table 5. The model robustness test results

Variable (1) (2) (3) (4) (5)


LnIFI 4.369*** (0.686) 4.021*** (0.717) 5.242*** (0.708) 3.202*** (0.720) 4.521*** (0.672)
LnGDP 0.890*** (0.068) 0.883*** (0.068) 0.882*** (0.066) 0.889*** (0.067) 0.870*** (0.068)
LnRes 0.109** (0.045) 0.117** (0.046) 0.070(0.046) 0.143*** (0.046) 0.120*** (0.046)
LnLP −0.868*** (0.102) −0.836*** (0.102) −0.978*** (0.105) −0.769*** (0.098) −0.866*** (0.102)
LnTec −0.001(0.046) 0.007(0.045) −0.021(0.043) 0.019(0.046) 0.017(0.047)
LnTra 0.328* (0.177) 0.353** (0.176) 0.276(0.179) 0.295* (0.170) 0.265(0.175)
LnDis −0.163(0.158) −0.151(0.152) −0.200(0.152) −0.186(0.154) −0.179(0.163)
LnIFI*LnGDP 0.527** (0.210)
LnIFI*LnRes −1.498*** (0.373)
LnIFI*LnLP 1.560*** (0.361)
LnIFI*LnTec 0.720*** (0.158)
_cons 2.256(2.154) 1.810(2.090) 3.470(2.128) 1.652(2.102) 2.768(2.185)
Year Control Control Control Control Control
N 707 707 707 707 707

Note: The values in parentheses are z-values; ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively

5 Conclusions
This paper constructs a relatively complete evaluation system of investment facilitation and, with the available data, measures the investment facilitation of 101 countries around the world from 2010 to 2019, finding significant differences in the level of investment facilitation among regions and great room for improvement. On this basis, the differing impacts of host countries' investment facilitation comprehensive index on China's OFDI under different investment motives are examined. The conclusions are as follows. First, China's OFDI is more inclined toward countries with a higher investment facilitation comprehensive index. Second, the host country's investment facilitation comprehensive index affects China's OFDI differently under different investment motives: market-seeking, efficiency-seeking, and strategic asset-seeking OFDI target host countries with a higher investment facilitation comprehensive index, while resource-seeking OFDI tends toward countries with a poorer one.

Based on the above conclusions, this paper offers the following suggestions. First, the government should provide beneficial policies for enterprises, strive to negotiate bilateral and multilateral investment agreements, and protect Chinese enterprises' legitimate rights and interests in host countries. Second, Chinese enterprises should clarify their positioning, conduct comprehensive and detailed research on the host country's investment environment, and choose locations by combining their investment motivation with the characteristics of the host country to avoid blind investment. Last, the "Belt and Road" initiative and the Regional Comprehensive Economic Partnership (RCEP) should be fully utilized to actively explore overseas markets and continuously improve international competitiveness.

References
1. Wilson, J.S., Mann, C.L., Otsuki, T.: Trade facilitation and economic development: a new
approach to quantifying the impact. World Bank Econ Rev 17(3), 367–389 (2003)
2. Zhang, Y.: The investment facilitation of “One Belt One Road” and choices of China’s foreign
direct investment-empirical analysis based on cross-panel data and investment gravity model.
J Int Trade 9, 165–176 (2016)
3. Qiao, M.: Will the increase in investment facilitation promote China’s outward foreign direct
investment? —based on panel data analysis of countries along the “Belt and Road.” Inq into
Econ Issues 1, 139–148 (2019)
4. Xietian, Z., Fan, X.: Has investment facilitation promoted China's outward foreign direct
investment? — threshold testing based on host country heterogeneity. Int Bus 6, 59–75 (2019)
5. Liu, Y., Zhao, X.: Investment facilitation in Central and Eastern European countries and its
impact on China's foreign direct investment. The Journal of Quantitative & Technical
Economics 1, 83–97 (2021)
6. Dunning, J.H.: Trade, location of economic activity and the multinational enterprise: a search
for an eclectic approach. Theory Transnatl Corp 1, 183–218 (1993)
7. Zhou, C., Liu, X., Guo, Z.: Business environment and China’s ODI—from the perspective of
investment motivation. J Int Trade 10, 143–152 (2017)
8. Wang, Z., Zhou, Q.: How business environment affects Chinese enterprises' OFDI to countries
along “The Belt and Road.” Collected Essays on Finance and Economics 9, 42–52 (2019)
9. Buckley, P.J., Liu, X.: The determinants of Chinese outward foreign direct investment. J Int
Bus Stud 38(4), 499–518 (2007)
Information Technology
and Applications
Adaptive Observer-Based Control for a Class
of Nonlinear Stochastic Systems with Parameter
Uncertainty

Xiufeng Miao1(B) and Yaoqun Xu2


1 Northeast Asia Service Outsourcing Research Center, Harbin University of Commerce,
Harbin 150028, China
2 Computer and Information Engineering College,

Harbin University of Commerce, Harbin 150028, China

Abstract. This paper discusses the adaptive control problem for stochastic nonlinear systems with parameter uncertainty. Real systems such as distributed network systems, weather systems, and industrial control systems are constantly changing: their internal factors are correlated, and the system exchanges information, material, or energy with the outside world. These exchanges are, more or less, random, so such systems must be described by stochastic differential equations. Considering stochastic nonlinear systems with unknown constant parameters, we introduce separation theory into the design of the adaptive parameter estimators. Using a Lyapunov functional and stochastic analysis techniques, we study the design method of robust adaptive observers for stochastic systems, so that the gain matrices can be easily obtained in sequence within the same algorithm. A numerical example is presented to verify the feasibility of the conclusions obtained.

Keywords: Parameter uncertainty · Adaptive estimators · Asymptotic stability

1 Introduction

For stochastic systems driven by Brownian motion, stability analysis and controller design have attracted much attention. When a system contains unknown parameters, adaptive control is a very effective method for estimating them: it can estimate the system state and unknown parameters effectively, so it is widely used in fault detection, fault isolation, and so on [1–3]. However, due to the influence of random factors, parameter estimation for a stochastic system with unknown parameters is hard to carry out. From the analysis in [4], it can be seen that the complete separation of feedback controller gain and state estimation gain is hard to achieve in stochastic systems because of parameter fluctuations. Adaptive control and adaptive estimator design for stochastic nonlinear systems with unknown parameters therefore become very challenging topics when the random factors in the nonlinear system are inevitable.


In [5], stochastic nonlinear systems without parameter uncertainty were discussed; this paper makes further research on that basis. This paper is based on the filtration $\{F_t\}_{t \ge 0}$ satisfying the usual conditions and the complete probability space $(\Omega, F, P)$. Denote the transposed matrix of $A$ by $A^T$; moreover,
$$\begin{bmatrix} A_1 & A_2 \\ * & A_3 \end{bmatrix} = \begin{bmatrix} A_1 & A_2 \\ A_2^T & A_3 \end{bmatrix}.$$
Denote the family of nonnegative functions $V(x(t), t)$ on $R^n \times R^+$ by $C^{2,1}$. Denote the identity matrix by $I$. Define $R^a$ as the $a$-dimensional Euclidean space and $R^{a \times b}$ as the set of $a \times b$ real matrices.

2 Preliminaries

We investigate the following stochastic system with parameter uncertainty:
$$\begin{cases} dx(t) = [(A + \Delta A(t))x(t) + g(x(t), u(t), l)]dt + Kx(t)\,d\omega(t) \\ dh(t) = Yx(t)\,dt \end{cases} \quad (1)$$
where $g(x(t), u(t), l)$ has the form
$$g(x(t), u(t), l) = g_1(x(t), u(t)) + Eg_2(x(t), u(t))l.$$
Here $h(t) \in R^r$ represents the output vector, $x(t) \in R^n$ represents the state vector, and $l$ represents the unknown constant parameter. $\Delta A(t)$ satisfies
$$\Delta A(t) = BF(t)C,$$
where $F(t)$ represents an unknown matrix function with the restriction
$$F^T(t)F(t) \le I,$$
and $A$, $E$, $K$, $B$, $C$ represent known real constant matrices. $\omega(t)$ represents standard Brownian motion, with
$$E\{d\omega^2(t)\} = dt, \qquad E\{d\omega(t)\} = 0.$$
Next, we give two assumptions.

Assumption 1. The nonlinear part $g(x(t), u(t), l)$ satisfies the restriction
$$\|g(x(t), u(t), l) - g(\hat{x}(t), u(t), l)\| \le a\|x(t) - \hat{x}(t)\|, \quad (2)$$
where $a$ represents the Lipschitz constant; moreover, $g(0, u(t), l) \equiv 0$.

Assumption 2. A positive definite matrix $R$ can be found with the condition
$$E^T R Y^{\perp} = 0,$$
where $Y^{\perp}$ represents the orthogonal projection operator onto $\mathrm{null}(Y)$.


In [5], a state tracking technique is presented for stochastic systems. Following [5], we give the following nonlinear estimator for the stochastic system (1):
$$\begin{aligned} d\hat{x}(t) &= [(A + \Delta A(t))\hat{x}(t) + g(\hat{x}(t), u(t), \hat{l})]dt + J[dh(t) - Y\hat{x}(t)dt], \\ d\hat{l}(t) &= Dg_2^T(\hat{x}(t), u(t))W[dh(t) - Y\hat{x}(t)dt], \quad D > 0. \end{aligned} \quad (3)$$
Next, we need to determine the matrices $J$ and $W$ so that the proposed convergence property is satisfied. Denote the error vectors by
$$e(t) = x(t) - \hat{x}(t), \qquad e_l(t) = l - \hat{l}(t).$$
Thus, the state observer error system can be rewritten as
$$\begin{aligned} de(t) &= [((A + \Delta A(t)) - JY)e(t) + \tilde{g} + Eg_2(\hat{x}(t), u(t))e_l(t)]dt + Kx(t)\,d\omega(t), \\ de_l(t) &= -Dg_2^T(\hat{x}(t), u(t))W[dh(t) - Y\hat{x}(t)dt], \end{aligned} \quad (4)$$
where $\tilde{g}$ represents $g(x(t), u(t), l) - g(\hat{x}(t), u(t), l)$.
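To give a feel for how the plant and the estimator (3) evolve together, the following Euler–Maruyama sketch simulates a two-dimensional toy instance. All matrices, the gains J and W, and the adaptation rate D are hand-picked placeholders rather than values produced by the theory developed below, and the uncertainty ΔA(t) is dropped for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, steps = 1e-3, 20000

# Toy placeholders (not the paper's example): stable A, scalar parameter l
A = np.array([[-3.0, 1.0], [0.5, -4.0]])
E = np.array([[1.0], [0.5]])
K = 0.1 * np.eye(2)
Y = np.array([[1.0, 0.0]])
J = np.array([[2.0], [1.0]])   # hand-picked observer gain
W = np.array([[1.0]])          # matrix from Assumption 2 (illustrative)
D = 5.0                        # adaptation rate, D > 0
l_true = 2.0                   # unknown constant parameter to be estimated

g2 = lambda x: np.array([[np.cos(x[1, 0])]])   # nonlinear regressor g2(x, u)

x, x_hat, l_hat = np.array([[1.0], [0.5]]), np.zeros((2, 1)), 0.0
for _ in range(steps):
    dw = rng.standard_normal() * np.sqrt(dt)   # scalar Brownian increment
    dy = (Y @ x) * dt                          # measured output increment dh(t)
    innov = dy - (Y @ x_hat) * dt              # output innovation
    x_new = x + (A @ x + E @ g2(x) * l_true) * dt + (K @ x) * dw
    x_hat = x_hat + (A @ x_hat + E @ g2(x_hat) * l_hat) * dt + J @ innov
    l_hat = l_hat + float(D * g2(x_hat).T @ W @ innov)   # adaptive law from (3)
    x = x_new

print("state error:", float(np.linalg.norm(x - x_hat)), "l_hat:", l_hat)
```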

3 Adaptive State Estimators Design


We give the design process of the adaptive state estimators in this section.

Theorem 1. For the stochastic system (1) under Assumptions 1 and 2, if there exist positive definite matrices $P$, $Q$ and positive scalars $\delta_1$, $\delta_2$, $\varepsilon$ such that the LMI
$$\begin{bmatrix} \Pi_{11} & 0 & P & 0 & PB & 0 \\ * & \Pi_{22} & 0 & Q & 0 & QB \\ * & * & -\delta_1 I & 0 & 0 & 0 \\ * & * & * & -\delta_2 I & 0 & 0 \\ * & * & * & * & -\varepsilon I & 0 \\ * & * & * & * & * & -\varepsilon I \end{bmatrix} < 0, \quad (5)$$
where
$$\Pi_{11} = PA + A^T P + K^T(P + Q)K + \delta_1 a^2 I + \varepsilon C^T C,$$
$$\Pi_{22} = QA + A^T Q - XY - Y^T X^T + \delta_2 a^2 I + \varepsilon C^T C,$$
holds, then the matrices $J$ and $W$ can be found to make the stochastic nonlinear system (4) asymptotically stable; meanwhile, the observer gain matrix satisfies
$$J = Q^{-1}X.$$

Proof. The stochastic Lyapunov function is selected as
$$V(\xi(t), t) = \xi^T(t)P\xi(t), \quad (6)$$
where
$$\xi^T(t) = [x^T(t) \ \ e^T(t) \ \ e_l^T(t)], \qquad P = \mathrm{diag}\{P, Q, D^{-1}\}.$$
Based on the Itô differential formula and Assumption 2, a matrix $W$ exists such that
$$E^T Q = WY.$$
Thus, we have
$$\begin{aligned} \mathcal{L}V(\xi(t), t) = {} & 2x^T(t)P[(A + \Delta A(t))x(t) + g(x(t), u(t), l)] \\ & + 2e^T(t)Q[((A + \Delta A(t)) - JY)e(t) + \tilde{g}] + x^T(t)K^T(P + Q)Kx(t). \end{aligned} \quad (7)$$
Next, utilizing the Lipschitz condition, the following estimate can be derived:
$$\begin{aligned} \mathcal{L}V(\xi(t), t) \le {} & x^T(t)\big[P(A + \Delta A(t)) + (A + \Delta A(t))^T P + \tfrac{1}{\delta_1}PP + \delta_1 a^2 I + K^T(P + Q)K\big]x(t) \\ & + e^T(t)\big[Q((A + \Delta A(t)) - JY) + ((A + \Delta A(t)) - JY)^T Q + \tfrac{1}{\delta_2}QQ + \delta_2 a^2 I\big]e(t). \end{aligned} \quad (8)$$
Substituting $X = QJ$,
$$\mathcal{L}V(\xi(t), t) \le \begin{bmatrix} x(t) \\ e(t) \end{bmatrix}^T \begin{bmatrix} \tilde{\Pi}_{11} & 0 \\ * & \tilde{\Pi}_{22} \end{bmatrix} \begin{bmatrix} x(t) \\ e(t) \end{bmatrix}, \quad (9)$$
with
$$\tilde{\Pi}_{11} = P(A + \Delta A(t)) + (A + \Delta A(t))^T P + \tfrac{1}{\delta_1}PP + \delta_1 a^2 I + K^T(P + Q)K,$$
$$\tilde{\Pi}_{22} = Q(A + \Delta A(t)) + (A + \Delta A(t))^T Q - XY - Y^T X^T + \tfrac{1}{\delta_2}QQ + \delta_2 a^2 I.$$
Applying the Schur complement lemma, LMI (5) implies that the inequality
$$E\{\mathcal{L}V(x(t), t)\} < 0$$
holds. Hence, by LMI (5), the error systems are asymptotically stable (see [6]).
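For reference, the Schur complement lemma invoked in the last step is the standard equivalence for a symmetric block matrix:
$$\begin{bmatrix} S_{11} & S_{12} \\ S_{12}^{T} & S_{22} \end{bmatrix} < 0 \iff S_{22} < 0 \ \text{and}\ S_{11} - S_{12} S_{22}^{-1} S_{12}^{T} < 0,$$
which is how the quadratic terms $\tfrac{1}{\delta_1}PP$ and $\tfrac{1}{\delta_2}QQ$ in (8) are turned into the linear blocks of LMI (5).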

4 Adaptive Observer-Based Controller


Next, we discuss the following system:
$$\begin{cases} dx(t) = [(A + \Delta A(t))x(t) + Vu(t) + g(x(t), u(t), l)]dt + Kx(t)\,d\omega(t) \\ dy(t) = Yx(t)\,dt \end{cases} \quad (10)$$
where $u(t) \in R^p$ represents the control input vector and $V$ represents a real constant matrix. We give the following nonlinear estimator:
$$\begin{aligned} d\hat{x}(t) &= [(A + \Delta A(t))\hat{x}(t) + Vu(t) + g(\hat{x}(t), u(t), \hat{l})]dt + J_u[dy(t) - Y\hat{x}(t)dt], \\ d\hat{l} &= Dg_2^T(\hat{x}(t), u(t))W_u[dy(t) - Y\hat{x}(t)dt], \end{aligned} \quad (11)$$
and we set
$$u(t) = Z\hat{x}(t).$$
As such, the following result is available.

Theorem 2. Under Assumptions 1 and 2, if there exist matrices $T = S^{-1} > 0$, $R > 0$, $F$, $B$ and scalars $\sigma > 0$, $\delta_1 > 0$, $\delta_2 > 0$, $\varepsilon > 0$ satisfying the LMIs
$$\begin{bmatrix} AT + TA^T + VF + F^T V^T + \tfrac{1}{\delta_1} I + \sigma I + \varepsilon BB^T & T & TK^T & TC^T \\ * & -\tfrac{1}{a^2\delta_1} I & 0 & 0 \\ * & * & -T & 0 \\ * & * & * & -\varepsilon I \end{bmatrix} < 0, \quad (12)$$
$$\begin{bmatrix} -\sigma SS + K^T RK & -SVZ & 0 & 0 \\ * & RA + A^T R - BY - Y^T B^T + \delta_2 a^2 I + \varepsilon C^T C & R & RB \\ * & * & -\delta_2 I & 0 \\ * & * & * & -\varepsilon I \end{bmatrix} < 0, \quad (13)$$
then the closed-loop dynamic system is asymptotically stable in mean square; in addition, the gains can be expressed as
$$Z = FT^{-1} = FS, \qquad J_u = R^{-1}B.$$
Proof. Set
$$P = \mathrm{diag}\{S, R, D^{-1}\}, \qquad V(\xi(t), t) = \xi^T(t)P\xi(t),$$
and set $u(t) = Z\hat{x}(t)$. We can obtain
$$dV(\xi(t), t) = \mathcal{L}V(\xi(t), t)dt + 2(x^T(t)S + e^T(t)R)Kx(t)\,d\omega(t). \quad (14)$$
The differential operator above is
$$\begin{aligned} \mathcal{L}V(\xi(t), t) = {} & 2x^T(t)S[((A + \Delta A(t)) + VZ)x(t) - VZe(t) + g(x(t), u(t), l)] \\ & + 2e^T(t)R[((A + \Delta A(t)) - J_u Y)e(t) + \tilde{g} + Eg_2(\hat{x}(t), u(t))e_l(t)] \\ & - 2e_l^T(t)g_2^T(\hat{x}(t), u(t))W(y(t) - Y\hat{x}(t)) + x^T(t)K^T(S + R)Kx(t). \end{aligned} \quad (15)$$
Using an analysis similar to that of Theorem 1, the following estimate can be obtained:
$$\mathcal{L}V(\xi(t), t) \le \begin{bmatrix} x(t) \\ e(t) \end{bmatrix}^T \begin{bmatrix} \Xi_{11} & -SVZ \\ * & \Xi_{22} \end{bmatrix} \begin{bmatrix} x(t) \\ e(t) \end{bmatrix}, \quad (16)$$
where
$$\Xi_{11} = S((A + \Delta A(t)) + VZ) + ((A + \Delta A(t)) + VZ)^T S + \tfrac{1}{\delta_1}SS + \delta_1 a^2 I + K^T(S + R)K,$$
$$\Xi_{22} = R((A + \Delta A(t)) - J_u Y) + ((A + \Delta A(t)) - J_u Y)^T R + \tfrac{1}{\delta_2}RR + \delta_2 a^2 I.$$
Setting $F = ZT$, one gets from the Schur complement lemma that matrix inequality (12) and $\begin{bmatrix} \Xi_{11} & -SVZ \\ * & \Xi_{22} \end{bmatrix} < 0$ are equivalent. By setting $B = RJ_u$, Theorem 2 can then be proven by a process similar to the proof of Theorem 1.

5 Numerical Examples
Here we present one numerical example to verify the effectiveness of the obtained conclusions.

Example 1. Consider a nonlinear stochastic system (10) with
$$A = \begin{bmatrix} -3 & 1 & 0 \\ 0.3 & -4.5 & 1 \\ -0.1 & 0.3 & -3.8 \end{bmatrix}, \quad E = \begin{bmatrix} 1.2 \\ 0.6 \\ 0.5 \end{bmatrix}, \quad K = \begin{bmatrix} 0.1 & -0.1 & 0.2 \\ 0.3 & 0.3 & -0.4 \\ 0.1 & 0.1 & -0.3 \end{bmatrix},$$
$$Y = \begin{bmatrix} 0.8 & 0.3 & 0 \\ 0.5 & 1 & 0.6 \end{bmatrix}, \quad B = \begin{bmatrix} 0.02 & 0.03 & 0.1 \end{bmatrix}^T, \quad C = \begin{bmatrix} 0.01 & 0.04 & 0.03 \end{bmatrix},$$
$$g_1(x(t), u(t)) = u(t), \quad g_2(x(t), u(t)) = -1.5\cos x_2(t), \quad l = 5, \quad u(t) = \sin 10t.$$
Assumption 1 holds with $a = 1.5$. According to Theorem 1, we can obtain
$$P = \begin{bmatrix} 32.2479 & 2.1605 & -0.5442 \\ 2.1605 & 30.4087 & -0.5534 \\ -0.5442 & -0.5534 & 31.9002 \end{bmatrix}, \quad Q = \begin{bmatrix} 12.3420 & -4.7633 & -6.0562 \\ -4.7633 & 36.0585 & 4.0868 \\ -6.0562 & 4.0868 & 30.7761 \end{bmatrix},$$
$$X = \begin{bmatrix} 4.0377 & 19.9637 \\ 1.4438 & 5.5993 \\ 0.3558 & 5.2890 \end{bmatrix}, \quad \delta_1 = 33.0412, \quad \delta_2 = 26.3316, \quad \varepsilon = 34.5158.$$
Then we can get the observer gain matrix
$$J = Q^{-1}X = \begin{bmatrix} 0.3981 & 2.0124 \\ 0.0837 & 0.3622 \\ 0.0788 & 0.5198 \end{bmatrix},$$
and we set the matrix $W$ satisfying Assumption 2 as
$$W = \begin{bmatrix} 1.1381 & 17.6211 \end{bmatrix}.$$
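The gain computation can be reproduced numerically with a semidefinite programming solver. The CVXPY sketch below solves only a simplified sub-block of LMI (5) (the Q-block, with δ2 fixed and the Schur-complemented quadratic and uncertainty terms dropped) and recovers J = Q⁻¹X; it illustrates the LMI machinery under those stated simplifications, not the full condition of Theorem 1.

```python
import cvxpy as cp
import numpy as np

# Matrices from Example 1
A = np.array([[-3.0, 1.0, 0.0],
              [0.3, -4.5, 1.0],
              [-0.1, 0.3, -3.8]])
Y = np.array([[0.8, 0.3, 0.0],
              [0.5, 1.0, 0.6]])
a, delta2 = 1.5, 26.0          # Lipschitz constant; delta2 fixed here for simplicity

n, r = A.shape[0], Y.shape[0]
Q = cp.Variable((n, n), symmetric=True)
X = cp.Variable((n, r))

# Simplified block: Q A + A^T Q - X Y - Y^T X^T + delta2 * a^2 * I < 0
L = Q @ A - X @ Y
M = L + L.T + delta2 * a**2 * np.eye(n)
prob = cp.Problem(cp.Minimize(0),
                  [Q >> 1e-3 * np.eye(n), M << -1e-3 * np.eye(n)])
prob.solve(solver=cp.SCS)

J = np.linalg.solve(Q.value, X.value)  # observer gain J = Q^{-1} X
print(prob.status, "\n", J)
```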

6 Conclusion
This paper considers a method for estimating the state vector of nonlinear stochastic systems with parameter uncertainty. By constructing a suitable Lyapunov–Krasovskii functional, the corresponding asymptotic stability conclusion for the estimation error is given. Asymptotic stability of the closed-loop system is guaranteed by applying a state-feedback controller.

References
1. Zhang, K., Jiang, B., Cocquempot, V.: Adaptive observer-based fast fault estimation. Int. J.
Control Autom. Syst. 6, 320–326 (2008)
2. Yan, X.G., Edwards, C.: Fault estimation for single output nonlinear systems using an adaptive
sliding mode estimator. IET Control Theory Appl. 2, 841–850 (2008)
3. Farza, M., M’Saad, M., Maatoug, T., Kamoun, M.: Adaptive observers for nonlinearly
parameterized class of nonlinear systems. Automatica 45, 2292–2299 (2009)
4. Deng, H., Krstić, M.: Output-feedback stabilization of stochastic nonlinear systems driven by
noise of unknown covariance. Syst. Control Lett. 39(3), 173–182 (2000)
5. Miao, X.F., Li, L.L., Yan, X.M.: Adaptive observer-based control for a class of nonlinear
stochastic systems. Int. J. Comput. Math. 92(11), 2251–2260 (2015)
6. Mao, X.R.: Stochastic Differential Equations and Their Applications. Horwood Publishing,
Chichester (1997)
Open Domain Question Answering Based
on Retriever-Reader Architecture

Dequan Zheng1,2(B) , Jing Yang1 , and Baishuo Yong1


1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, Heilongjiang, China
dqzheng@hrbcu.edu.cn
2 Institue of System Engineering, Harbin University of Commerce, Harbin 150028, China

Abstract. Open-domain question answering (OpenQA) is a popular research hotspot in the field of natural language processing (NLP). In the past few years, with in-depth research on deep learning (DL) and pre-trained language models (PLM), research on OpenQA has made rapid progress. OpenQA aims to find relevant documents in a large-scale text corpus and then extract or generate correct answers. In this paper, we introduce the development and research status of OpenQA based on the "Retriever-Reader" architecture. Firstly, we briefly expound the relevant theories and the traditional architecture. Then the functions and categories of the retriever and reader, together with related existing systems, are analyzed: the retriever can be regarded as an information retrieval (IR) system, and the reader can be viewed as a machine reading comprehension (MRC) system. The fourth section gives a relatively brief introduction to the latest end-to-end training methods and their related types. Finally, we summarize the content of the paper and look forward to future development trends.

Keywords: Open-domain question answering · Information retrieval · Machine reading comprehension

1 Introduction
The question answering (QA) system aims to produce precise answers to a given user query directly and concisely. Instead of having users search for keywords and get a series of web links, today's search engines have also added QA functionality: for most simple queries, users can get the answer quickly without clicking through the returned links, as shown in Fig. 1. OpenQA [1] searches for the final answer in a large text corpus for a given query. The traditional OpenQA system comprises three modules, namely question analysis, document retrieval, and answer extraction. Question analysis is subdivided into question classification and question standardization; this module aims to match the user's query to a predefined question category using natural language processing tools. According to the processed question, the document retrieval module relies on an IR engine to retrieve a small set of relevant documents from a large-scale text corpus. Answer extraction aims to further extract the final, accurate answer from the documents returned in the previous step.


Fig. 1. An example of a Search engine with QA. Fig. 2. An overview of “Retriever-Reader.”

The performance of traditional OpenQA depends too heavily on the question analysis phase, specifically on question classification. Due to the complexity and polysemy of language, it is almost impossible to cover all of users' semantic expressions, whether with a classifier or with handwritten rules. Therefore, the traditional three-stage architecture is very limited.
As DL and transfer learning (TL) have made significant progress, OpenQA has also undoubtedly been driven by this trend. In particular, DrQA [2], proposed by Chen et al., was the first to simplify the traditional three-stage OpenQA pipeline into the two-stage "Retriever-Reader" architecture, as shown in Fig. 2. In the following sections, we elaborate on the retriever and the reader.

2 Retriever
The retriever is similar to an IR system: its function is to return a collection of related documents that includes the correct answer. In general, current OpenQA retrievers fall into three categories: sparse retrievers, dense retrievers, and iterative retrievers.

2.1 Sparse Retriever

The sparse retriever is the most primitive and classic approach, exemplified by TF-IDF and BM25. DrQA [2] uses Wikipedia as its only knowledge source. Its document retriever uses binary hashing and TF-IDF to find documents related to a given query through inverted-index search and then returns the top-k documents to the document reader. The document reader is trained on the SQuAD [3] dataset using a multi-layer recurrent neural network (RNN). DrQA also uses distant supervision to fine-tune the model and multi-task joint learning to obtain better results. Subsequently, much research was carried out around "Retriever-Reader."

Wang et al. [4] proposed a new Reinforced Ranker-Reader (R3) in response to the low quality of DrQA's retrieval. The role of its ranker is to sort the retrieved paragraphs according to the likelihood of containing the correct answer; this relevance calculation between the question and documents is performed by Match-LSTM. Its ranker and reader are jointly trained through reinforcement learning (RL), where the reward is the probability that the answer extracted by the reader from the returned most relevant top-k documents is correct. In the same year, they showed in [5] that aggregating answers across different paragraphs helps find answers better. Unlike the ranker in [4], the paragraph ranker proposed by Lee et al. [6] focuses on improving answer recall by reducing noise. Kratzwald et al. [7] proposed RankQA, which adds an answer re-ranking module. First, this module combines features from the retrieval and reading phases. Second, in order to avoid ignoring other paragraphs by focusing too much on the first relevant paragraph, and following the conclusion of [5], RankQA aggregates repeated candidate answers together to increase the amount of information, which benefits subsequent ranking. Finally, the module re-ranks the candidate answers and outputs the top one. Kratzwald et al. [8] argued that a fixed number of returned documents cannot achieve the optimal solution because of noisy information; they proposed adaptive document retrieval, which dynamically and adaptively adjusts the number of returned documents according to the size of the information source and the query. Based on BERT [9], Yang et al. [10] proposed BERTserini, a new end-to-end-trained OpenQA model. Its architecture uses Anserini [11] as the document retriever, a query-expansion-based model that uses BM25 as the ranking function. Their experiments found that, among the three retrieval granularities compared, paragraph-level retrieval works best. Multi-passage BERT [12] proposed training the reader on multiple passages jointly, in response to the incomparability of scores across paragraphs caused by training on related paragraphs independently. It also uses a sliding-window mechanism and a passage-ranking mechanism to obtain optimal performance; its retriever is ElasticSearch based on the BM25 algorithm.

The studies above are all based on DrQA, and they all use sparse retrievers. However, sparse retrieval relies only on lexical matching similarity, so it struggles with term mismatch. For this kind of problem, the dense retriever is remarkably effective.
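As a minimal illustration of sparse retrieval in the spirit of DrQA's document retriever (which additionally uses bigram hashing), the following sketch ranks a toy corpus by TF-IDF cosine similarity; the corpus and query are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus standing in for a Wikipedia-scale document store
docs = [
    "Warsaw is the capital and largest city of Poland.",
    "The mitochondrion is the powerhouse of the cell.",
    "TF-IDF weighs terms by frequency and rarity across documents.",
]
query = "what is the capital of Poland"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)   # sparse term-weight matrix
query_vector = vectorizer.transform([query])

# Rank documents by cosine similarity and return the top-k
scores = cosine_similarity(query_vector, doc_vectors).ravel()
for i in scores.argsort()[::-1][:2]:
    print(f"{scores[i]:.3f}  {docs[i]}")
```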

2.2 Dense Retriever

Dense retrievers generally use dual encoders to encode queries and documents, and most early ones are representation-based. For example, OrQA [13], DPR [14], and REALM [15] all use BERT encoders to encode the query and documents into dense vector representations. The retrieval score of OrQA [13] is obtained by calculating the inner product of the dense vectors of the query and documents, and the retriever is pretrained with the Inverse Cloze Task (ICT); the retrieval accuracy obtained is 19% higher than that of BM25. Karpukhin et al. [14] proposed DPR, arguing that ICT pretraining is too computationally heavy and expensive; DPR trains the model using only question-answer pairs, and its performance has surpassed traditional sparse-index algorithms.

The above models compute document representations in advance and index them offline, so these representation-based dense retrievers are efficient; still, their independent representations lose the needed interaction. Retrieve-and-Read [16] studied the impact of learning answer spans from retrieval. Its reader is based on the BiDAF [17] model, which utilizes bi-attention to enable effective interaction between the query and documents. Nie et al. [18] argued that the purpose of previous research was to find information coverage useful to the reading module rather than to find accurate information, which deviates from the essence of large-scale reading comprehension. They studied the impact of multi-granularity retrieval on the reading module and realized rich information interaction between query and paragraphs. However, this method requires a lot of computing power, and training is expensive.

Of course, efficiency and accuracy are not incompatible. DC-BERT [19] mainly studied the high-throughput problem caused by applying PLMs in OpenQA. It has two independent BERT models: an online BERT encodes the question only once, while an offline BERT pre-encodes all documents and caches their encodings; once the cache is built, DC-BERT can immediately read out any document encoding. The decoupled encodings of the query and documents are then fed to a Transformer layer with query-document interaction, which effectively generates contextual encodings of the query-document pair. ColBERT [20] is a high-speed retriever based on BERT that proposes a novel ranking model: independent encoding is used to obtain document representations offline, while a cheap but powerful late-interaction step achieves both effectiveness and speed. SPARTA [21] abandons the sequence-level inner product and uses fine-grained token-level interaction to make retrieval faster and more accurate.
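The following toy sketch illustrates the representation-based dual-encoder idea: queries and documents are encoded independently, document vectors are precomputed offline, and relevance is an inner product. The random-projection `encode` function is only a stand-in for a BERT-style encoder (e.g., a CLS vector).

```python
import numpy as np

rng = np.random.default_rng(0)
embedding_table = {}  # toy "encoder": a fixed random vector per token

def encode(text: str, dim: int = 128) -> np.ndarray:
    vecs = []
    for tok in text.lower().split():
        tok = tok.strip(".,?")
        if tok not in embedding_table:
            embedding_table[tok] = rng.standard_normal(dim)
        vecs.append(embedding_table[tok])
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)   # unit-normalized dense representation

docs = ["Warsaw is the capital of Poland.",
        "Dense retrievers match queries to passages by vector similarity."]
doc_matrix = np.stack([encode(d) for d in docs])   # precomputed offline

query_vec = encode("capital city of Poland")       # encoded online
scores = doc_matrix @ query_vec                    # inner-product retrieval
print(docs[int(scores.argmax())])
```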

2.3 Iterative Retriever

Since a single round of retrieval may miss relevant documents, much research in the past few years has been based on iterative retrievers, which retrieve related documents over multiple steps. HotpotQA [22] is a multi-hop inference dataset released in 2018; its release has promoted multi-hop OpenQA to a certain extent and provided a robust multi-hop dataset for subsequent research.

GOLDEN Retriever [23] recasts query reformulation as an MRC task, since both take a question and some context documents as input and aim to generate natural-language strings. Das et al. [24] proposed the Multi-step Reasoner model, which uses unified encoders to encode queries and paragraphs and keeps the paragraph representation vectors fixed. In each iteration, the query vector is updated so that different question representation vectors recall different paragraphs. The model is trained with reinforcement learning: it takes the reader's output as a reward to compute the policy gradient that updates the query vector, and it can continuously optimize retrieval quality. MUPPET [25] uses a MIPS searcher and reconstructs the query vector from the paragraphs retrieved in the previous iteration; its iteration terminates when the set maximum number of related documents is reached. Path Retriever [26] sequentially retrieves associated documents using an RNN; inference paths are formed based on the retrieval records, and these paths are then ranked using an MRC model. The strong interactivity of its retriever and reader enables more accurate reasoning and answers to complex questions. The goal of reconstructing and generating the next query based on previous search results is to obtain more relevant documents in the next round, as illustrated in the sketch below. The generated query generally has two forms: explicit query and implicit query. The query generated in [23] is an explicit query, i.e., a natural-language representation; [24–26] generate implicit queries, i.e., dense vector representations. Each has its own merits: explicit queries are easy to understand and control manually but are often limited by the vocabulary, while implicit queries are not restricted by the vocabulary but are poorly interpretable.
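Schematically, iterative retrieval wraps a retrieve step and a query-reformulation step in a loop, as in the generic sketch below; this is a simplification rather than any single paper's exact procedure, and `retrieve` and `reformulate` are assumed callables supplied by the system.

```python
from typing import Callable, List

def iterative_retrieve(question: str,
                       retrieve: Callable[[str], List[str]],
                       reformulate: Callable[[str, List[str]], str],
                       max_hops: int = 3) -> List[str]:
    """Generic multi-hop retrieval loop: retrieve, reformulate, repeat."""
    query, collected = question, []
    for _ in range(max_hops):
        collected.extend(retrieve(query))
        # reformulate() may return a natural-language string (explicit query);
        # vector-based systems instead update a dense query representation
        query = reformulate(question, collected)
    return collected
```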

3 Reader
The reader is the second core module of OpenQA, and it is usually regarded as an MRC task. Its function is to obtain the final answer from the documents returned by the retriever. What makes it more challenging is that ordinary reading comprehension is usually given a single paragraph, whereas OpenQA needs to find the final accurate answer across multiple articles or paragraphs, which requires more complex reasoning. Existing readers fall into two categories: extractive readers and generative readers.

3.1 Extractive Reader

The extractive reader has been relatively popular in previous studies and is also more mature. It assumes that the correct answer must exist in the relevant documents and predicts the start and end positions of the answer span within a paragraph.

For example, DS-QA [27] sets up a paragraph selector to select the most relevant paragraphs from the retrieved documents at a finer granularity and then calculates the similarity between these paragraphs and the query; its reader calculates the start and end positions only within the most relevant paragraph.

Some studies are based on graph readers, aiming to learn to extract answer spans from retrieval graphs [26, 28]. For example, the input of the Graph Reader [28] is the retrieval graph produced by a graph retriever; it mainly uses graph convolutional networks (GCN) to learn paragraph representations and then extracts the answer span from the most likely part of the retrieval graph. In Path Retriever [26], the reader re-ranks reasoning paths and extracts answers simultaneously, applying multi-task learning to extract the correct answer from the path with the highest probability of containing it. These studies have achieved good results, which is inseparable from the success of pretrained language models for natural language understanding such as BERT, RoBERTa [29], and SpanBERT [30].
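For a concrete feel of span extraction, the sketch below uses the Hugging Face question-answering pipeline, whose default checkpoint is a SQuAD-fine-tuned span-prediction model; the question and context are toy inputs, and this is an illustration of the extractive paradigm rather than any specific system above.

```python
from transformers import pipeline

# The pipeline scores start and end positions over the context tokens
reader = pipeline("question-answering")
result = reader(
    question="What does the retriever return?",
    context="The retriever returns a collection of documents likely to contain "
            "the answer, and the reader extracts the answer span from them.",
)
print(result["answer"], result["score"])
```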

3.2 Generative Reader

However, not all answers can be extracted directly. Some questions require the model to reason and generate an answer that does not appear verbatim in the documents, such as arithmetic and yes/no questions. To address this, research in the last two years has focused on the generative reader, which generates new answers that are not necessarily present in the documents. Recently, a growing number of pretrained models for natural language generation (NLG), such as GPT-3, T5, and BART [31–33], have been put forward and are often used as base models for generative readers. RAG [34] is a generative open-domain question answering model that uses DPR as the retriever and BART as the reader. The reader of FiD [35] is based on T5 or BART: the query and paragraphs are used as the input of the encoder, and the decoder generates the final answer. Nevertheless, the generative reader still faces very complex challenges and needs further exploration in the future.

Fig. 3. A taxonomy of “Retriever-Reader” OpenQA.

4 End-to-End Training

In the past few years, much work has begun to study open-domain question answering
with end-to-end training. OrQA, Retrieve-and-Read, REALM, and RAG, all mentioned
above, use end-to-end methods to train the model.
Not long ago, Baidu proposed the RocketQA [36] model, which employs technologies
such as cross-batch negatives, denoised hard negative sampling, and data augmentation.
It dramatically improves the effectiveness of the dual-encoder retrieval model and
takes an important step toward end-to-end OpenQA. In addition, there are retriever-only
models, such as DenSPI [37], which targets real-time OpenQA. It proposes a dense-sparse
phrase index designed to capture syntactic, semantic, and lexical information, addressing
the problem that some document information is filtered out incorrectly during retrieval.
Discarding the reading-comprehension step, it uses only the query to look up the
phrase index for the answer, which is very efficient in end-to-end benchmark tests.
Moreover, a new retriever-free approach has surfaced, based entirely on Seq2Seq
PLMs such as the GPT-3 mentioned above. Such models have been trained
on huge corpora, and a large amount of knowledge is stored in their parameters, so they
can answer questions without any external knowledge. This realizes OpenQA
without a retriever, which is a very different change from the past.

Table 1. Evaluation results of EM on several classic datasets. EM: exact match. The datasets in
the table are widely used as benchmarks in OpenQA. We abbreviate the names of models and
datasets. S: SQuAD [3], CT: CuratedTREC [42], WQ: WebQuestions [43], NQ: Natural Questions
[44], Q-T: Quasar-T [45], H-Q: HotpotQA [22]. We do not elaborate on these datasets in this paper.

Model S [3] CT [42] WQ [43] NQ [44] Q-T [45] H-Q [22]


DrQA [2] 29.8 25.7 20.7
R3 [4] 29.1 28.4 17.1 34.2
Par-Ran [6] 30.2 35.4 19.9
RankQA [7] 35.3 34.7 22.3
Ada-Retr [8] 29.8 29.3 19.6
BT-serini [10] 38.6
Mul-p-BT [12] 53.0 51.3
OrQA [13] 20.2 30.1 36.4 33.3
DPR [14] 29.8 49.4 42.4 41.5
REALM [15] 46.8 40.7 40.4
Retr&Re [16] 35.6
DC-BT [19] 27.4
SPARTA [21] 59.3 37.5
Gd-retr [23] 37.9
M-s-Retr [24] 31.9 39.5
MUP [25] 39.3 31.1
Path-Rtri [26] 56.5 60.5
DS-QA [27] 29.1 18.5 42.2
Gra-Rtri [28] 36.4 34.5
RAG [34] 52.2 45.5 44.5
FID [35] 56.7 51.4
Rok-QA [36] 42.8
DEN-SPI [37] 31.2
Den-Phra [40] 40.7 40.9
BPR [41] 41.6

In Fig. 3, we give a comprehensive overview of the structure of this paper
and classify the existing OpenQA systems so that readers can better follow our
description. The main metrics of OpenQA are MRR, EM, and F1. MRR is the mean
reciprocal rank of the first relevant document returned by a retriever. EM (exact match)
measures the percentage of predictions that match the ground truth exactly. F1 is the
harmonic mean of precision and recall, measuring the average token overlap between
the predicted answers and the ground truth. Finally, in Table 1, we list the EM
performance of the models discussed in this paper. EM is the most commonly
used evaluation metric of OpenQA and directly expresses system performance. The
numbers in the table are the highest scores reported for each model on the corresponding
datasets in the original paper, although some models use external techniques (such as
multi-task learning).
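To make the two answer-level metrics precise, here is a minimal Python sketch of EM and token-level F1. The normalization steps (lowercasing, stripping punctuation and articles) follow common SQuAD-style practice and are illustrative assumptions rather than the exact procedure of any surveyed paper.

```python
# A minimal sketch of EM and token-level F1 as described above, with
# SQuAD-style answer normalization (illustrative, not from the surveyed papers).
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, truth: str) -> bool:
    return normalize(prediction) == normalize(truth)

def f1(prediction: str, truth: str) -> float:
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    common = Counter(pred_tokens) & Counter(true_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the Eiffel Tower", "Eiffel Tower"))        # True after normalization
print(round(f1("Eiffel Tower in Paris", "Eiffel Tower"), 3))  # partial overlap -> 0.667
```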

5 Challenges and the Future

After years of research and development, OpenQA has continuously improved its methods
and related technologies. The ultimate goal of OpenQA is to build a system that
can answer any input question; achieving this great goal is a long and arduous
process, and the NLP community still needs continuous research and progress. Below,
we discuss some of the challenges and future development prospects of existing
OpenQA:

Phrase Index. The phrase index is modular and extensible. The critical difference
between a phrase index and a retriever is that the latter indexes each document by
content, while the former requires an index for each phrase in its context. Although some
work has studied phrase indexing [37–39], constructing phrase representations over the
large corpora of OpenQA remains a formidable challenge. In addition, existing phrase
indexing methods rely heavily on sparse representations to locate related documents,
and their performance is still not as good as dense indexes. Lee et al. [40], published at
ACL 2021, were the first to realize a dense phrase index for OpenQA learned
independently of sparse representations, and achieved better results. The future of the
dense phrase index needs to be explored further.

Retrieval Effectiveness and Efficiency. Generally speaking, sparse retrieval has high
efficiency but low effectiveness, while dense retrieval is the opposite. Ensuring retrieval
effectiveness while improving efficiency is a challenge faced by the retriever.
Some studies have made improvements in this area. For example, Yamada et al. [41]
proposed the BPR retrieval model, which combines learning-to-hash techniques with DPR
and effectively reduces the index size by using binary codes. To obtain better and more
efficient models, there is still room for exploration and improvement in future work.
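As a toy illustration of the binary-code idea behind BPR, the sketch below compresses dense vectors to bits with a sign function and ranks passages by Hamming distance. Real BPR learns the hash function end-to-end together with the retriever; the random vectors and sign-based binarization here are purely illustrative.

```python
# A toy sketch of binary passage retrieval: binarize dense vectors with sign(),
# then rank candidate passages by Hamming distance to the binarized query.
# Real BPR learns the hash end-to-end; random projections here are illustrative.
import numpy as np

rng = np.random.default_rng(1)
dim, n_passages = 128, 1000
passages = rng.normal(size=(n_passages, dim))   # stand-ins for dense passage vectors
query = rng.normal(size=dim)                    # stand-in for a dense query vector

to_bits = lambda v: (v > 0)                     # sign-based binarization (1 bit/dim)
passage_codes = to_bits(passages)

hamming = (passage_codes != to_bits(query)).sum(axis=1)
top_k = np.argsort(hamming)[:5]                 # candidates by Hamming distance
print("top-5 candidate passage ids:", top_k)
```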

End-to-End OpenQA. The end-to-end method eliminates the complex components of
a traditional pipeline, significantly reducing the system's complexity; each module
is learnable, enabling the whole system to be trained end to end. This is the development
trend of OpenQA and may lead to a new generation of technical changes in OpenQA.

References
1. Voorhees, E.M.: The TREC-8 question answering track report. In: TREC, vol. 99, pp. 77–82.
Citeseer (1999)
2. Chen, D.Q., Fisch, A., Weston, J., Bordes, A.: Reading wikipedia to answer open-domain
questions. In: 55th ACL, pp. 1870–1879. ACL, Stroudsburg (2017)
3. Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P.: SQuAD: 100,000+ questions for machine
comprehension of text. In: Proceedings of the 2016 Conference on EMNLP, pp. 2383–2392.
ACL, Texas (2016)
4. Wang, S., et al.: R3: reinforced ranker-reader for open-domain question answering. In: AAAI-
18. Louisiana (2018)
5. Wang, S., et al.: Evidence aggregation for answer re-ranking in open-domain question
answering. In: 6th ICLR. ICLR, Vancouver (2018)
6. Lee, J., Yun, S., Kim, H., Ko, M., Kang, J.: Ranking paragraphs for improving answer recall
in open-domain question answering. In: Proceedings of the 2018 conference on EMNLP,
pp. 565–569. ACL, Brussels (2018)
7. Kratzwald, B., Eigenmann, A., Feuerriegel, S.: RankQA: neural question answering with
answer re-ranking. In: 57th ACL, pp. 6076–6085. ACL, Florence (2019)
8. Kratzwald, B., Feuerriegel, S.: Adaptive document retrieval for deep question answering. In:
Proceedings of the 2018 conference on EMNLP, pp. 576–581. ACL, Brussels (2018)
9. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pretraining of deep bidirectional trans-
formers for language understanding. In: Proceedings of the 2019 conference of the NAACL,
pp. 4171–4186. ACL, Minnesota (2019)
10. Yang, W., et al.: End-to-end open-domain question answering with BERTserini. In: Proceed-
ings of the 2019 conference of the NAACL, pp. 72–77. ACL, Minnesota (2019)
11. Yang, P., Fang, H., Lin, J.J.: Anserini: enabling the use of Lucene for information retrieval
research. In: Proceedings of the 40th international ACM SIGIR conference on research and
development in information retrieval, pp. 1253–1256. ACM, Tokyo (2017)
12. Wang, Z., Ng, P., Ma, X., Nallapati, R., Xiang, B.: Multi-passage BERT: a globally normalized
BERT model for open-domain question answering. In: Proceedings of the 2019 Conference
on EMNLP and 9th IJCNLP, pp. 5878–5882. ACL, Hong Kong (2019)
13. Lee, K., Chang, M., Toutanova, K.: Latent retrieval for weakly supervised open domain
question answering. In: 57th ACL, pp. 6086–6096. ACL, Florence (2019)
14. Karpukhin, V., et al.: Dense passage retrieval for open-domain question answering. In:
Proceedings of the 2020 conference on EMNLP, pp. 6769–6781. ACL, Online (2020)
15. Guu, K., Lee, K., Tung, Z., Pasupat, P., Chang, M.: Retrieval augmented language model
pre-training. In: Daumé III, H., Singh, A. (eds.) Proceedings of the 37th international conference
on machine learning, vol. 119, pp. 3929–3938. PMLR (2020)
16. Nishida, K., Saito, I., Otsuka, A., Asano, H., Tomita, J.: Retrieve-and-read: multi-task learn-
ing of information retrieval and reading comprehension. In: Proceedings of the 27th ACM
international conference on information and knowledge management, pp. 647–656. ACM,
Torino (2018)
17. Seo, M., Kembhavi, A., Farhadi, A., Hajishirzi, H.: Bidirectional attention flow for machine
comprehension. arXiv preprint arXiv:1611.01603 (2016)
18. Nie, Y., Wang, S., Bansal, M.: Revealing the importance of semantic retrieval for machine
reading at scale. In: Proceedings of the 2019 conference on EMNLP and 9th IJCNLP,
pp. 2553–2566. ACL, Hong Kong (2019)

19. Zhang, Y., et al.: DC-BERT: decoupling question and document for efficient contextual
encoding. In: Proceedings of the 43rd international ACM SIGIR conference on research and
development in information retrieval, pp. 1829–1832. Association for Computing Machinery
(2020)
20. Khattab, O., Zaharia, M.: ColBERT: efficient and effective passage search via contextualized
late interaction over BERT. In: Proceedings of the 43rd international acm sigir conference
on research and development in information retrieval, pp. 39–48. Association for Computing
Machinery (2020)
21. Zhao, T., Lu, X., Lee, K.: SPARTA: efficient open-domain question answering via sparse
transformer matching retrieval. In: Proceedings of the 2021 conference of the NAACL,
pp. 565–575. ACL, Online (2021)
22. Yang, Z., et al.: HotpotQA: a dataset for diverse, explainable multi-hop question answering.
In: Proceedings of the 2018 conference on EMNLP, pp. 2369–2380. ACL, Brussels (2018)
23. Qi, P., Lin, X., Mehr, L., Wang, Z., Manning, C.D.: Answering complex open-domain ques-
tions through iterative query generation. In: Proceedings of the 2019 conference on EMNLP
and 9th IJCNLP, pp. 2590–2602. ACL, Hong Kong (2019)
24. Das, R., Dhuliawala, S., Zaheer, M., McCallum, A.: Multi-step retriever-reader interaction
for scalable open-domain question answering. In: 7th ICLR. ICLR, Louisiana (2019)
25. Feldman, Y., El-Yaniv, R.: Multi-Hop paragraph retrieval for open-domain question answer-
ing. In: 57th ACL, pp. 2296–2309. ACL, Florence (2019)
26. Asai, A., Hashimoto, K., Hajishirzi, H., Socher, R., Xiong, C.: Learning to retrieve reasoning
paths over wikipedia graph for question answering. In: 7th ICLR. ICLR, Louisiana (2019)
27. Lin, Y., Ji, H., Liu, Z., Sun, M.: Denoising distantly supervised open-domain question
answering. In: 56th ACL, pp. 1736–1745. ACL, Melbourne (2018)
28. Min, S., Chen, D., Zettlemoyer, L., Hajishirzi, H.: Knowledge guided text retrieval and reading
for open domain question answering. arXiv preprint arXiv:1911.03868 (2019)
29. Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint
arXiv:1907.11692 (2019)
30. Joshi, M., Chen, D., Liu, Y., Weld, D.S., Zettlemoyer, L., Levy, O.: SpanBERT: improving
pretraining by representing and predicting spans. Trans. Assoc. Comput. Linguistics 8, 64–77
(2020)
31. Brown, T.B., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165
(2020)
32. Raffel, C., et al.: Exploring the limits of transfer learning with a unified text-to-text transformer.
arXiv preprint arXiv:1910.10683 (2019)
33. Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language
generation, translation, and comprehension. In: 58th ACL, pp. 7871–7880. ACL, Online
(2020)
34. Lewis, P., et al.: Retrieval-augmented generation for knowledge-intensive nlp tasks. arXiv
preprint arXiv:2005.11401 (2020)
35. Izacard, G., Grave, E.: Leveraging passage retrieval with generative models for open domain
question answering. In: 16th EACL, pp. 874–880. ACL, Online (2021)
36. Qu, Y., et al.: RocketQA: an optimized training approach to dense passage retrieval for open-
domain question answering. In: Proceedings of the 2021 conference of the NAACL, pp. 5835–
5847. ACL, Online (2021)
37. Seo, M., et al.: Real-time open-domain question answering with dense-sparse phrase index.
In: 57th ACL, pp. 4430–4441. ACL, Florence (2019)
38. Seo, M., Kwiatkowski, T., Parikh, A.P., Farhadi, A., Hajishirzi, H.: Phrase-indexed question
answering: a new challenge for scalable document comprehension. In: Proceedings of the
2018 conference on EMNLP, pp. 559–564. ACL, Brussels (2018)

39. Lee, J., Seo, M., Hajishirzi, H., Kang, J.: Contextualized sparse representations for real-time
open-domain question answering. In: 58th ACL, pp. 912–919. ACL, Online (2020)
40. Lee, J., Sung, M., Kang, J., Chen, D.: Learning dense representations of phrases at scale. In:
59th ACL and 11th IJCNLP, pp. 6634–6647. ACL, Online (2021)
41. Yamada, I., Asai, A., Hajishirzi, H.: Efficient passage retrieval with hashing for open-domain
question answering. In: 59th ACL and 11th IJCNLP, pp. 979–986. ACL, Online (2021)
42. Baudiš, P., Šedivý, J.: Modeling of the question answering task in the YodaQA system. In:
International conference of the cross-language evaluation forum for European languages,
pp. 222–228. Springer (2015)
43. Berant, J., Chou, A.K., Frostig, R., Liang, P.: Semantic parsing on freebase from question-
answer pairs. In: Proceedings of the 2013 Conference on EMNLP, pp. 1533–1544. ACL,
Washington (2013)
44. Kwiatkowski, T., et al.: Natural questions: a benchmark for question answering research.
Trans. Assoc. Comput. Linguistics 7, 452–466 (2019)
45. Dhingra, B., Mazaitis, K., Cohen, W.W.: Quasar: datasets for question answering by search
and reading. arXiv preprint arXiv:1707.03904 (2017)
Does Network Infrastructure Improve
the Information Efficiency of Regional Capital
Market?—Quasi Natural Experiment Based
on “Broadband China” Strategy

Guang Yang1(B) , Dengping Li2 , and Yan Wen2


1 Institute of Business Economics, Harbin Business University, Harbin, Heilongjiang, China
2 Faculty of Economics, Harbin Business University, Harbin, Heilongjiang, China

Abstract. Based on theoretical analysis and using the quasi-natural experiment of
the "Broadband China" strategy, this paper constructs a multi-period difference-in-
differences model of network infrastructure construction and the information efficiency of
the regional capital market, and conducts an empirical test using 12,156 observations of
1,013 A-share listed companies from 2008 to 2019. STATA16 was used for model
regression and calculation. The results show that the construction of network
infrastructure has a significant positive impact on stock price synchronicity.
The lower the ownership concentration, the greater the company's growth, and
the greater the trading noise in the stock market, the more significantly the network
promotes stock price synchronicity. Higher stock price synchronicity
means lower market information efficiency. Further analysis shows that the con-
struction of network infrastructure negatively impacts the information efficiency
of the regional capital market by reducing the quality of information disclosure in
the capital market. Finally, we call attention to the information efficiency of the
capital market in the context of the Internet revolution.

Keywords: Broadband China · Network infrastructure · Stock price


synchronization · Information efficiency · Regional capital market

1 Introduction
The expansion of Internet applications increasingly guides human activities, and the
third, Internet-based technological revolution will become a new driving force for
economic development. China has especially benefited from the Internet revolution,
realizing rapid development of the digital economy and digital finance. The "Internet plus"
action has promoted the integrated development of the Internet and all industries, show-
ing good momentum. The 19th CPC National Congress also proposed
building a cyber power, a digital China, and a smart society. In the context
of supply-side structural reform, network infrastructure construction plays a key role
in developing the Internet and is becoming a hot academic concern. Therefore, the
quasi-natural experiment of implementing the "Broadband China" strategy carries
important practical research significance.


Therefore, based on a literature analysis, this paper puts forward theoretical hypotheses
and constructs a multi-period difference-in-differences model that uses inclusion in the
list of "Broadband China" demonstration cities as the treatment indicator, while controlling
for time, industry, and regional effects. Panel data of 12,156 observations of 1,013
A-share listed companies from 2008 to 2019 are used for the empirical research, with
STATA16 for model regression and calculation. We discuss the impact of the "Broadband
China" strategy on stock price synchronicity and the moderating effects of ownership
concentration, company growth, and stock market trading noise, and finally draw
conclusions about the influence of network infrastructure on regional capital market
information efficiency, supplementing the theoretical basis for the information efficiency
of regional capital markets under Internet-integrated development.

2 Theoretical Analysis and Hypothesis

2.1 Network Infrastructure Construction and Regional Capital Market


Information Efficiency

Network infrastructure has a long-term impact on the behavior and development of eco-
nomic participants, accelerates the processing and integration of massive information,
and has a significant positive effect on the development of various industries. At the
same time, however, other scholars hold a different view. Lin et al. [1] believe that network
infrastructure will not directly improve economic efficiency. Yilmaz et al. [2] find that
network infrastructure construction in other regions negatively impacts local areas,
so the total spillover effect produced by network infrastructure construction is negative.
Moreover, as for capital market information, on the one hand, network infrastructure
construction accelerates the transmission of both idiosyncratic information
and market and industry information, which may increase stock price synchronicity.
On the other hand, the quality of capital market information disclosure is affected by
earnings aggressiveness and earnings smoothing [3]. The application of the Internet
increases the number of stock market transactions and the speed of information diffusion,
bringing opportunities and challenges that affect earnings aggressiveness and smoothing
and reduce information quality. Thus, to a certain extent, network infrastructure
construction causes firm-specific quality information to be drowned out, which in turn
raises stock price synchronicity.
Therefore, the following hypothesis H1 is proposed: there is a significant positive
correlation between a company's registered city being a "Broadband China"
demonstration city and its stock price synchronicity; that is, network infrastructure
construction harms the information efficiency of the regional capital market.

2.2 Network Infrastructure Construction, Firm Heterogeneity, and Regional


Capital Market Information Efficiency

First, consider the moderating effect of corporate ownership concentration on the influence of
network infrastructure construction on regional capital market information efficiency.
The proportion held by the largest shareholder is one of the indicators reflecting ownership
concentration. The higher the proportion of the largest shareholder, the lower the quality
of capital market information disclosure in terms of earnings management [4]. This means
that ownership concentration partially substitutes for the negative effect of
network infrastructure construction on the quality of information disclosure in the capital
market. Therefore, ownership concentration will weaken the positive effect of network
infrastructure construction on stock price synchronicity.
Therefore, the following hypothesis H2 is proposed: the larger the proportion held by
the largest shareholder, the weaker the positive correlation between the company's
registered city being a "Broadband China" demonstration city and stock price
synchronicity; that is, the lower the ownership concentration, the stronger the
negative impact of network infrastructure construction on the information efficiency of
the regional capital market.
Second, consider the moderating effect of firm growth on the influence of network infrastruc-
ture construction on regional capital market information efficiency. Since the completion
of network infrastructure, the integration of the Internet with enterprises' goods and services
has, to a large extent, played a key role in the growth of enterprises. The market-to-book
ratio is an important indicator reflecting a company's growth; the higher the market-
to-book ratio, the stronger the enterprise's incentive to violate the law [5]. Therefore,
a company's growth increases the possibility that network infrastructure influences
information quality, and hence will enhance the positive effect
of network infrastructure construction on stock price synchronicity.
Therefore, the following hypothesis H3 is proposed: the higher the market-to-book
ratio, the stronger the positive correlation between the company's registered city being
a "Broadband China" demonstration city and stock price synchronicity; that is, the
greater the company's growth, the stronger the negative impact of network
infrastructure construction on the information efficiency of the regional capital market.

2.3 Network Infrastructure Construction, Stock Market Transaction Noise,


and Regional Capital Market Information Efficiency

Consider the moderating effect of stock market trading noise on the influence of network
infrastructure construction on regional capital market information efficiency. Stock price
synchronicity is influenced by both firm-specific quality information and noise [6], so
whether the relationship between network infrastructure construction and capital market
information efficiency is driven by noise interference must be examined. Only if the
relationship between network infrastructure construction and stock price synchronicity
is significantly stronger when noise is large can we conclude that network infrastructure
influences information efficiency rather than merely reflecting noise.
Therefore, the following hypothesis H4 is proposed: the greater the trading noise
in the stock market, the stronger the positive correlation between the company's
registered city being a "Broadband China" demonstration city and stock
price synchronicity; that is, the stronger the negative impact of network infrastructure
construction on the information efficiency of the regional capital market.

3 Study Design
3.1 Variable Set
Explained Variables. Stock price synchronicity is the first choice for measuring the
information efficiency of the capital market and is one of the most widely used
indicators. Following the measurement of stock price synchronicity by Gul et al. [7],
the market model is as follows:

R_{i,w} = α_0 + α_1 R_{m,w} + α_2 R_{m,w−1} + α_3 R_{ind,w} + α_4 R_{ind,w−1} + ε_{i,w}   (1)

where i denotes company i, w the w-th week, and w−1 the lagged period; R_{i,w} is the
return of the individual stock, R_{m,w} the market return, and R_{ind,w} the average
return of the industry excluding the company. The R² of company i in year t is obtained
by regressing this model year by year and is substituted into the following Eq. (2):

SYN_{i,t} = ln( R²_{i,t} / (1 − R²_{i,t}) )   (2)

Here SYN_{i,t} denotes the stock price synchronicity of company i in year t. The
transformation in Eq. (2) expands the bounded index to the whole real line, and SYN is
set as the explained variable. The higher the stock price synchronicity, the more serious
the phenomenon of stocks "rising and falling together," and the lower
the information efficiency of the capital market.
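As an illustration of Eqs. (1)–(2), the following Python sketch runs the weekly market-model regression on simulated returns, takes the regression R², and applies the log-odds transform to obtain SYN. All data and sizes are invented for illustration.

```python
# A toy sketch of the synchronicity measure of Eqs. (1)-(2): regress weekly
# stock returns on current and lagged market/industry returns, take R^2, and
# apply the log-odds transform. Returns here are simulated.
import numpy as np

rng = np.random.default_rng(0)
weeks = 50
r_mkt = rng.normal(0, 0.02, weeks)          # weekly market returns
r_ind = rng.normal(0, 0.02, weeks)          # industry returns (excluding the firm)
r_i = 0.8 * r_mkt + 0.3 * r_ind + rng.normal(0, 0.01, weeks)  # firm returns

# design matrix: intercept, R_m,w, R_m,w-1, R_ind,w, R_ind,w-1  (Eq. 1)
X = np.column_stack([np.ones(weeks - 1), r_mkt[1:], r_mkt[:-1], r_ind[1:], r_ind[:-1]])
y = r_i[1:]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r2 = 1 - resid.var() / y.var()

syn = np.log(r2 / (1 - r2))                 # Eq. (2)
print(f"R^2 = {r2:.3f}, SYN = {syn:.3f}")
```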

Explanatory Variables. Referring to the measurement of network infrastructure by
Xue Cheng et al. [8], network infrastructure construction takes the "Broadband China"
strategy as the boundary, with whether the city where the company is registered is
included in the list of "Broadband China" demonstration cities as the standard.
Since the lists of "Broadband China" demonstration cities were published in 2014, 2015,
and 2016, the TreatPost variable is set as the explanatory variable. If the company's
registered city does not appear in the list of demonstration cities before year t, or never
appears, TreatPost takes the value 0; if the registered city appears in the list in year t,
TreatPost takes the value 1 in year t and thereafter.
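The staggered structure of TreatPost can be illustrated with a small Python sketch; the city names and listing years below are hypothetical.

```python
# A toy sketch of constructing the staggered TreatPost indicator described
# above; cities, firms, and listing years are invented for illustration.
import pandas as pd

# year each city entered the "Broadband China" demonstration list (hypothetical)
listing_year = {"CityA": 2014, "CityB": 2016}   # CityC never listed

panel = pd.DataFrame({
    "firm": ["F1"] * 4 + ["F2"] * 4 + ["F3"] * 4,
    "city": ["CityA"] * 4 + ["CityB"] * 4 + ["CityC"] * 4,
    "year": [2013, 2014, 2015, 2016] * 3,
})

# 1 in the listing year and every year after; 0 before listing or if never listed
panel["TreatPost"] = [
    int(city in listing_year and year >= listing_year[city])
    for city, year in zip(panel["city"], panel["year"])
]
print(panel.pivot(index="year", columns="firm", values="TreatPost"))
```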

Adjustment Variables. According to Chen Donghua and Yao Zhenye [9], the owner-
ship concentration of a company is measured by the proportion held by the largest
shareholder, set as First, and the growth of the company is measured by the market-to-book
ratio, set as MB. Following the definition of trading-noise intensity by Shen Yongtao and
Gao Yusen [10], stock market trading noise is measured as the negative of the first-order
autocorrelation coefficient of the residuals of market model (1); the greater the value,
the greater the noise. It is set as Noise.
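A small Python sketch of the Noise measure just described, computed on simulated residuals:

```python
# Noise = the negative of the first-order autocorrelation of the market-model
# residuals; the residual series here is simulated for illustration.
import numpy as np

rng = np.random.default_rng(2)
resid = rng.normal(0, 0.01, 52)                 # weekly residuals from model (1)

r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]   # first-order autocorrelation
noise = -r1
print(f"Noise = {noise:.4f}")
```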

Control Variables. Following Chen Donghua and Yao Zhenye [9], the control variables
are selected as follows: company size is the natural logarithm of the enterprise's
total assets, set as LNSIZE; financial leverage is total liabilities divided by total assets,
set as Lev; listing age is the natural logarithm of the enterprise's listing time, set as
LNAGE; profitability is the enterprise's return on equity, set as ROE; market activity
is the natural logarithm of stock market trading volume, set as LNSTV; the audit
quality dummy indicates whether the auditor is one of the Big Four accounting firms
(1 if yes, 0 otherwise), set as BIG4; and the enterprise nature dummy indicates whether
the firm is state-owned (1 if yes, 0 otherwise), set as SOE.

3.2 Data Sources and Sample Selection


Explanatory variable data were obtained from the official website of the Ministry of
Industry and Information Technology, PRC; all other variable data were obtained from
CSMAR. Since the list of "Broadband China" demonstration cities was first published
in 2014, the sample period was set to 2008–2019. A-share listed companies were
selected as the research objects, excluding samples with missing key variables, fewer
than 30 weeks of stock trading per year, in the financial industry, or labeled *ST or ST.
The financial reporting structure and figures of the financial industry are quite special
and differ greatly from other types of enterprises, which is not conducive to drawing
scientific conclusions; excluding it is standard practice in most authoritative articles.
Finally, a sample of 12,156 firm-year observations of 1,013 A-share listed companies
was obtained, as shown in Table 1 below:

Table 1. Sample description

Variables    N       Mean     Sd      Min      Max
SYN          12,156  −0.0533  0.825   −5.163   2.430
TreatPost    12,156  0.301    0.459   0        1
First        11,682  35.88    15.45   0.290    89.99
MB           11,996  2.023    4.177   0.694    393.0
Noise        12,156  0.0807   0.155   −0.638   0.716
LNSIZE       12,152  22.48    1.415   15.42    28.64
Lev          12,152  0.508    0.594   0.00173  55.41
LNAGE        12,156  2.588    0.503   0        3.401
ROE          12,071  0.0744   0.183   −5.268   5.682
LNSTV        12,156  21.39    0.999   17.58    26.13
BIG4         12,151  0.0532   0.225   0        1
SOE          11,531  0.640    0.480   0        1

3.3 Model Design


Referring to the model design of Xue Cheng et al. [8] for analyzing policies
implemented in multiple periods, the variables Treat and Post can be omitted
because individual and time fixed effects are included. At the same time, to
ensure robustness, city and industry fixed effects are added.

SYN_{i,t,c,ind} = β_0 + β_1 TreatPost_{i,t,c,ind} + Σ Controls_{i,t,c,ind} + μ_i + λ_t + η_c + γ_{ind} + ε_{i,t,c,ind}   (3)

where i denotes company i, t year t, c city c, and ind industry ind. SYN is the explained
variable, the information efficiency of the capital market; the larger its value, the lower
the information efficiency. TreatPost is the explanatory variable, network infrastructure
construction; Controls denotes all control variables; μ_i denotes individual fixed effects,
λ_t time fixed effects, η_c city fixed effects, and γ_{ind} industry fixed effects;
ε_{i,t,c,ind} is the random error term.
Model (4) is used to demonstrate the moderating effect of corporate ownership
concentration, model (5) the moderating effect of corporate growth, and model (6) the
moderating effect of stock market trading noise. The specific models are as follows:

SYN_{i,t,c,ind} = β_{10} + β_{11} TreatPost_{i,t,c,ind} + β_{12} First_{i,t,c,ind} + β_{13} First×TreatPost_{i,t,c,ind}
                + Σ Controls_{i,t,c,ind} + μ_i + λ_t + η_c + γ_{ind} + ε_{i,t,c,ind}   (4)

SYN_{i,t,c,ind} = β_{20} + β_{21} TreatPost_{i,t,c,ind} + β_{22} MB_{i,t,c,ind} + β_{23} MB×TreatPost_{i,t,c,ind}
                + Σ Controls_{i,t,c,ind} + μ_i + λ_t + η_c + γ_{ind} + ε_{i,t,c,ind}   (5)

SYN_{i,t,c,ind} = β_{30} + β_{31} TreatPost_{i,t,c,ind} + β_{32} Noise_{i,t,c,ind} + β_{33} Noise×TreatPost_{i,t,c,ind}
                + Σ Controls_{i,t,c,ind} + μ_i + λ_t + η_c + γ_{ind} + ε_{i,t,c,ind}   (6)
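To illustrate how a moderating-effect model such as (4) can be estimated, the sketch below runs a two-way fixed-effects regression with a demeaned interaction term on simulated data. The variable names follow the paper, but the data-generating process, sample sizes, and use of statsmodels are assumptions for illustration only.

```python
# A toy sketch of model (4): a two-way fixed-effects regression with a demeaned
# interaction term, on simulated data (the DGP below is invented).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_firms, n_years = 50, 8
df = pd.DataFrame({
    "firm": np.repeat(np.arange(n_firms), n_years),
    "year": np.tile(np.arange(2012, 2012 + n_years), n_firms),
})
# staggered treatment: each firm's city is "listed" in a random year 2014-2016
adopt = rng.integers(2014, 2017, n_firms)
df["TreatPost"] = (df["year"] >= adopt[df["firm"]]).astype(int)
df["First"] = rng.uniform(5, 80, len(df))
df["First_c"] = df["First"] - df["First"].mean()        # demeaned, as in the paper
df["SYN"] = (0.07 * df["TreatPost"]
             - 0.003 * df["First_c"] * df["TreatPost"]
             + rng.normal(0, 0.5, len(df)))

# firm and year dummies play the role of mu_i and lambda_t
model = smf.ols("SYN ~ TreatPost * First_c + C(firm) + C(year)", data=df).fit()
print(model.params[["TreatPost", "TreatPost:First_c"]])
```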

4 Analysis of Empirical Results


The empirical regression results are reported in Table 2, where columns (1)–(5) show
the influence and moderating effects of network infrastructure construction on the
information efficiency of the regional capital market. Column (1) gives the result when
stock price synchronicity is affected only by whether the company's registered city is
in the list of "Broadband China" demonstration cities. To ensure the stability of the
results, the fixed effects of time, individual, region, and industry are included. Column (2)
adds the moderating and control variables to column (1) and is an empirical test of
model (3). Column (3) considers the moderating effect of ownership concentration and
tests model (4). Column (4) considers the moderating effect of company growth and
tests model (5). Column (5) considers the moderating effect of stock trading noise and
tests model (6). Whether fixed effects were added, the sample size, the goodness of fit,
and the overall significance test results are reported at the bottom of the table.
Network infrastructure construction has a negative impact on the information efficiency
of the regional capital market. In column (1), the coefficient of the explanatory variable
on the explained variable is a positive 0.065; the variable passes the t-test and the
equation passes the F-test at the 1% significance level. This indicates that whether the
company's registered city is in the list of "Broadband China" demonstration cities has a
significantly positive effect on stock price synchronicity at the 1% level; that is, network
infrastructure construction negatively impacts the information efficiency of the regional
capital market. Column (2) shows that even with control variables included, the
relationship still passes the t-test at the 1% significance level, and the coefficient rises
to 0.071. The adjusted R² increases from 0.364 to 0.396, indicating a better fit, and the
F value rises from 7.867 to 38.02, indicating higher overall significance. The conclusion
from column (1) is thus reliable, so hypothesis H1 is supported by the empirical evidence.
The lower the ownership concentration, the stronger the negative impact of network
infrastructure construction on regional capital market information efficiency. In column (3),
the variable FirstTreatPost is the product of the demeaned variable First and the
variable TreatPost. After FirstTreatPost is added, the coefficients of First and TreatPost
no longer represent the variables alone; the purpose of demeaning is to keep First
significant without unnecessarily affecting the conclusions. The equation passes the
F-test at the 1% significance level, and FirstTreatPost passes the t-test at the 1%
significance level with a negative coefficient of −0.003, while the coefficient of TreatPost
is significantly positive, indicating that the moderating variable First weakens the effect
in the main model. The higher the ownership concentration, the weaker the negative
impact of network infrastructure construction on the information efficiency of the
regional capital market. Therefore, hypothesis H2 is empirically supported.
The greater the growth of the company, the stronger the negative impact of network
infrastructure construction on the information efficiency of the regional capital market.
By the same principle, the variable MBTreatPost in column (4) is the product of the
demeaned variable MB and the variable TreatPost. The equation passes the F-test at
the 1% significance level, and MBTreatPost passes the t-test at the 1% significance level
with a positive coefficient of 0.062, the same sign as the coefficient of TreatPost,
indicating that the moderating variable MB enhances the effect in the main model.
The greater the growth of the company, the stronger the negative impact of network
infrastructure construction on the information efficiency of the regional capital market.
Therefore, hypothesis H3 is empirically supported.
The greater the noise in the stock market, the stronger the negative impact of network
infrastructure construction on the information efficiency of the regional capital market.
In column (5), the variable NoiseTreatPost is the product of the variable Noise and the
variable TreatPost. The equation passes the F-test at the 1% significance level, and
NoiseTreatPost passes the t-test at the 5% significance level with a positive coefficient
of 0.202, the same sign as the coefficient of TreatPost, indicating that the moderating
variable Noise enhances the effect in the main model. The greater the trading noise in
the stock market, the stronger the negative impact of network infrastructure construction
on the information efficiency of the regional capital market. Therefore, hypothesis H4
is empirically supported.

Table 2. Empirical regression results

(1) (2) (3) (4) (5)


VARIABLES SYN SYN SYN SYN SYN
TreatPost 0.065*** 0.071*** 0.072*** 0.071*** 0.057**
(2.805) (2.983) (3.020) (2.977) (2.308)
First −0.004*** −0.003*** −0.004*** −0.004***
(−3.528) (−2.756) (−3.781) (−3.506)
FirstTreatPost −0.003***
(−3.078)
MB −0.032*** −0.032*** −0.066*** −0.031***
(−5.143) (−5.249) (−8.215) (−5.100)
MBTreatPost 0.062***
(6.644)
Noise 0.267*** 0.266*** 0.263*** 0.209***
(6.449) (6.443) (6.378) (4.271)
NoiseTreatPost 0.202**
(2.219)
LNSIZE 0.158*** 0.156*** 0.149*** 0.158***
(9.019) (8.936) (8.535) (9.030)
Lev −0.441*** −0.443*** −0.433*** −0.440***
(−6.406) (−6.435) (−6.296) (−6.384)
LNAGE 0.222*** 0.223*** 0.214*** 0.223***
(4.705) (4.739) (4.558) (4.727)
ROE −0.196*** −0.196*** −0.196*** −0.196***
(−5.062) (−5.087) (−5.093) (−5.081)
LNSTV −0.191*** −0.190*** −0.190*** −0.190***
(−14.604) (−14.563) (−14.593) (−14.596)
BIG4 −0.093** −0.078* −0.075* −0.092**
(−2.223) (−1.843) (−1.786) (−2.201)
SOE 0.070 0.077 0.062 0.072
(1.389) (1.513) (1.229) (1.411)
Observations 12,152 11,294 11,294 11,294 11,294
Company FE YES YES YES YES YES
Year FE YES YES YES YES YES
Industry FE YES YES YES YES YES
City FE YES YES YES YES YES
R-squared 0.418 0.452 0.452 0.454 0.452
Adj R-squared 0.364 0.396 0.396 0.398 0.396
F 7.867 38.02 35.67 38.68 35.28
F test 0.00504 0.000 0.000 0.000 0.000

5 Conclusion and Enlightenment

In this paper, a multi-period difference-in-differences model of network infrastructure
construction and regional capital market information efficiency is constructed, and the
moderating effects of ownership concentration, company growth, and stock market noise
are discussed. Panel data of 12,156 observations of 1,013 A-share listed enterprises from
2008 to 2019 are selected for empirical analysis. The hypotheses of the theoretical
analysis are verified: network infrastructure construction has a negative impact on the
information efficiency of the regional capital market, and the lower the ownership
concentration, the greater the growth of the company, and the greater the trading noise
of the stock market, the stronger this negative impact. The results show that network
infrastructure construction has a significant positive impact on stock price synchronicity;
the lower the ownership concentration, the greater the company's growth, and the greater
the trading noise of the stock market, the more significant the effect of the network on
stock price synchronicity, and higher stock price synchronicity means lower market
information efficiency. Further analysis shows that network infrastructure construction
negatively impacts regional capital market information efficiency by reducing the quality
of capital market information disclosure.
According to these conclusions, the practical implications and policy suggestions
are as follows. First, the third, Internet-based technological revolution plays a key role
in economic development, but for an information-based capital market it is still
necessary to stay vigilant about the negative impact of Internet construction on the
information efficiency of the capital market. For companies with lower ownership
concentration and greater growth, the negative impact is more significant, so relevant
policies need to be improved to strike a balance; at the same time, systems can be
established to reduce stock market trading noise and thus weaken its negative impact.
Second, attention should be paid to the supervision and control of the quality of
information disclosure in the capital market in the context of the Internet revolution,
which can be managed from both the internal and external sides. On the one hand,
internal corporate governance should recognize the important role of high-quality
information disclosure in the long-term development of enterprises, avoid excessive
earnings aggressiveness and smoothing, and build internal self-motivation. On the other
hand, external regulators should severely punish low-quality information disclosure,
encouraging enterprises to take active measures out of loss avoidance.

Acknowledgment. This research was supported by the 2020 Ideological and Political Work
Research Project of Harbin University of Commerce (2020SZY002) and the Humanities and
Social Science Research Planning Foundation of the Ministry of Education of China (21YJAZH099).

References
1. Lin, W.T., Shao, B.B.M.: The business value of information technology and inputs substitu-
tion: the productivity paradox revisited. Decis. Support Syst. 42(2), 493–507 (2006)
2. Yilmaz, S., Haynes, K.E., Dinc, M.: Geographic and network neighbors, spillover effects of
telecommunications infrastructure. J. Reg. Sci. 42(2), 339–360 (2002)
3. Bhattacharya, U., Daouk, H., Welker, M.: The world price of earnings opacity. Accounting
Rev. 78(3), 641–678 (2003)
4. Lian, Y.H., Zhang, L.: Ownership concentration, market constraints and earnings management
of commercial banks. J. Financ. Econ. 34(02), 125–138 (2019)
5. Xiao, Q., Shen, H.Y.: Research on analyst concern, growth and corporate violations. Bus.
Res. 10, 116–125 (2017)
6. Zhang, Y.R., Li, X.Y.: Information content measurement in R2 and stock price. J. Manage.
Sci. 13(05), 82–90 (2010)
7. Gul, F.A., Kim, J.-B., Qiu, A.A.: Ownership concentration, foreign shareholding, audit quality,
and stock price synchronicity: evidence from China. J. Finan. Econ. 95(3), 425–442 (2010).
https://doi.org/10.1016/j.jfineco.2009.11.005
8. Xue, C., Meng, Q.X., He, X.J.: Network infrastructure construction and enterprise technology
knowledge diffusion: a quasi-natural experiment from the "Broadband China" strategy. Finan.
Econ. Res. 46(04), 48–62 (2020)
9. Chen, D.G., Yao, Z.Y.: Does government action necessarily increase stock price synchroniza-
tion? – Empirical research based on China’s industrial policy. Econ. Res. J. 53(12), 112–128
(2018)
10. Shen, Y.T., Gao, Y.S.: Is the existence of a-share analysts reasonable? Shanghai Finan. 05,
24–32 (2020)
Optimized Coloring Algorithm Based
on Non-local Neighborhood Search

Haitao Xin(B) and Ezhen Peng

Harbin University of Commerce, Harbin 150028, Heilongjiang, China


102714@hrbcu.edu.cn

Abstract. Image coloring is the process of adding color to grayscale images with
the assistance of a computer. This paper proposes an optimized coloring algorithm
based on non-local neighborhood search to better handle boundary color pen-
etration when colorizing grayscale images. First, the K most texture-similar
non-local neighborhoods are searched based on the image's texture features. Then,
based on the observation that neighboring pixels with similar intensity have similar
colors, a group of similar neighborhood pixels is further searched. The search results
take into account not only pixel texture features but also the light intensity distribu-
tion. Finally, the neighborhood pixels are introduced into the optimization-based
coloring method, and the image is colored. The experimental results show that the
method can effectively reduce boundary color penetration through non-local
propagation of color information. With only a small amount of color scribbling,
the method in this paper provides better visual effects.

Keywords: Colorization · K Nearest Neighbors (KNN) · Optimization-based
colorization · Neighborhood similarity pixel

1 Introduction
Image coloring is a challenging computer vision problem, and in recent years more and
more researchers have turned their attention to it. Image coloring refers to the image
processing technology that adds color to gray images or videos by computer according
to the colors of the user's scribbles. The term was put forward by Wilson Markle in
1970 to describe his computer-aided process of adding color to grayscale movies or
TV programs [1]. In recent years, image coloring has also been widely used in scientific,
military, and medical fields [2, 3].
With the further exploration of image coloring by researchers at home and abroad,
many effective algorithms have been put forward that can, to a certain extent, achieve
accurate coloring of gray images. The original approach was to segment the picture
according to the texture information of the gray image and then color each region
separately [4]. Although this method is simple, the coloring of image edges and complex
parts such as hair shows a certain deviation. Segmentation-based methods can therefore
only handle simple images, and image coloring places higher requirements on the
segmentation algorithm. Among the coloring methods based on color labeling, the
representative one is the optimized coloring algorithm proposed by Levin et al. [5],
which assumes a similarity relationship between the colors and brightnesses of spatially
adjacent pixels. Subsequently, Yatziv et al. [6] proposed a re-coloring method based
on deriving and fusing weighted distance functions, and Fattal et al. [7] applied wavelet
transform technology to image editing. Wang et al. [8] proposed an algorithm based on
neighborhood correlation and optimization technology. Generally, these methods cannot
obtain ideal results at boundaries, especially when users scribble only a few colors. The
algorithms described above are all image coloring methods based on local neighborhood
propagation.
The image coloring method based on global neighborhood propagation can realize
global color propagation: even pixels far away from the user's scribbles can be colored
well. Charpiat et al. [9] proposed a global graph-cut optimization method for automatic
color allocation. Sheng et al. [10] proposed progressive coloring optimized by Gabor
filters in feature space, based on the non-local idea. Musialski et al. [11] proposed a
color replacement strategy that preserves distances between input image pixels and maps
source image samples to the target image. Compared with local neighborhood coloring,
global neighborhood propagation reduces the amount of user color scribbling but lacks
local or direct color selection. Coloring methods based on either local or global
neighborhood propagation thus have certain limitations: when two regions of similar
intensity should be given different colors, color mixing occurs, producing erroneous
coloring results. Therefore, this paper proposes an image coloring method based on
non-local neighborhood search.
Machine learning has developed continuously in recent years and has gradually
penetrated various academic fields. Network architectures keep improving, and models
such as VGG, GoogLeNet, and ResNet have appeared one after another. More and more
scholars have applied machine learning to image coloring and achieved good results in
automatic or semi-automatic coloring. The whole process can be divided into a training
stage and a coloring stage. Model training usually requires many different types of
images, which reduces complex user interaction and the difficulty of selecting reference
images. Cheng et al. [12] use a large number of images of different types as the training
set and a CNN as the model, so image features and color information must be extracted
for training. Subsequently, Iizuka et al. [13] proposed an end-to-end convolutional neural
network model that obtains color images directly from gray images, eliminating the
pre-processing and post-processing steps. However, such network models need a large
number of pictures for training, whereas the algorithm proposed in this paper can fully
exploit its advantages when very few images are available.
In this paper, based on the optimized coloring algorithm combined with non-local
neighborhood search, we can better solve the problem of discontinuous or overflowing
color regions. At present, most existing coloring technologies only consider the light
intensity distribution. Although these methods color smooth image regions well, they
are usually ineffective in strongly textured regions unless the user scribbles those regions
in sufficient detail. Therefore, this paper introduces a non-local neighborhood search
method based on texture similarity and spatial coordinate distance. Experimental results
show that the coloring method proposed in this paper is very effective.

2 Related Algorithm Theory


2.1 KNN Search of Texture-Similar Neighborhoods
FLANN [14] is a fast nearest neighbor search library for high-dimensional spaces,
implementing a set of search algorithms including the kd-tree. It can automatically select
the most suitable algorithm to complete the K-nearest-neighbor search task according to
the characteristics of the data set itself. By treating the choice of algorithm as a parameter
of the nearest neighbor search program, the problem is reduced to determining the
optimal parameters, i.e., an optimization problem in parameter space. The algorithm
uses a build-time weight w_b and a memory weight w_m to control the relative importance
of the search overhead and computes the overall cost:

cost = (s + w_b·b) / (s + w_b·b)_opt + w_m·m   (1)

where s denotes the search time over the vectors in the sample data set, b denotes the
time to build the tree, and m = m_t/m_d denotes the ratio of the memory used by the
tree (m_t) to the memory used for storing the data (m_d). The build-time weight w_b
controls the importance of the tree build time relative to the search time.
Philip et al. [15] proved that non-local theory can effectively reduce user input
while maintaining accuracy. However, as the searched pixel neighborhood grows, the
time to search the K most texture-similar non-local neighborhoods increases sig-
nificantly. The KNN method is widely used because it is simple and easy to implement,
and with FLANN it is easy to search the K most texture-similar non-local neighborhoods,
which is proven effective in [16]. Taking non-local texture similarity as the constraint,
the constructed feature vector is expressed as:

F = {Y(N_p), γ × i, γ × j}   (2)
in which Y(N_p) is a small luminance patch around pixel p in the luminance channel;
N_p denotes a fixed-size square neighborhood centered on pixel p; (i, j) are the
coordinates of pixel p; and the parameter γ adjusts the search range of similar
neighborhoods. As γ becomes larger, the search area of similar neighborhoods becomes
smaller; when γ is too large, the search for the K most texture-similar non-local
neighborhoods no longer works. Huang et al. [17] concluded through experiments that
representative values of γ range from 0.5 to 5 and representative values of K in
K-nearest-neighbor search range from 5 to 15. Therefore, in this paper the parameters
are fixed at γ = 1 and K = 15 for searching the K most texture-similar non-local
neighborhood pixels.
Because a gray image has no color information, only the gray information can be fully
used to establish the matching relationship between images. This algorithm can be
applied to images with rich textures; considering the texture information of the image,
it can better resolve color penetration at image boundaries.
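A minimal Python sketch of the feature vector of Eq. (2) and the K-nearest-neighbor search described above, using a kd-tree (one of the algorithms FLANN can select) on a toy grayscale image; the patch size and parameter values are illustrative.

```python
# Build the feature vectors {Y(N_p), gamma*i, gamma*j} of Eq. (2) for every
# pixel of a toy luminance image and run a kd-tree K-nearest-neighbor search.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
img = rng.random((64, 64))                  # toy luminance channel Y
patch, gamma, K = 3, 1.0, 15                # 3x3 neighborhood N_p, gamma = 1, K = 15

half = patch // 2
feats, coords = [], []
for i in range(half, img.shape[0] - half):
    for j in range(half, img.shape[1] - half):
        Np = img[i - half:i + half + 1, j - half:j + half + 1].ravel()
        feats.append(np.concatenate([Np, [gamma * i, gamma * j]]))  # Eq. (2)
        coords.append((i, j))
feats = np.asarray(feats)

tree = cKDTree(feats)
# K+1 because the nearest neighbor of a pixel's own feature vector is itself
dists, idx = tree.query(feats[0], k=K + 1)
print("K most texture-similar pixels to", coords[0], ":", [coords[n] for n in idx[1:]])
```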

2.2 Neighborhood Similarity Pixel Search


The color of any pixel in an image is connected with the colors of its neighboring pixels.
Therefore, the locations of neighboring pixels have an important influence on the
estimation of the chroma value. Observation shows that adjacent pixels with similar
intensity have similar colors. This algorithm uses the neighborhood similarity pixel
search method proposed by Wang et al. [18], in which neighborhood pixels are searched
with pixel position and pixel intensity as constraints.
As shown in Fig. 1, the purple pixel represents the pixel to be estimated. If the
distribution of its neighboring pixel positions is uniform, as in the green part of
Fig. 1(a), then, since the chromaticity of the purple pixel is a linearly weighted function
of the chromaticities of the green pixels, the final estimated chromaticity will be biased.
If the neighborhood similarity pixel method is used instead, as shown in Fig. 1(b), the
green part follows an irregular distribution, and the chromaticities of all green pixels are
closer to that of the purple pixel. Therefore, based on the neighborhood similarity
distribution, the coloring of the purple pixel is closest to the original image.

Fig. 1. Distribution of the neighboring pixels of a purple pixel: (a) uniform distribution; (b) distribution based on neighborhood similarity

Pixels to be colored are extremely sensitive to neighborhood positions. Therefore,
before calculating the weighting coefficients of its neighborhood, the positions of the
neighborhood pixels of the pixel to be colored should be searched. Within the window
containing the pixel r to be colored, the similarity between each pixel i and pixel r is
calculated using Eq. (3):

d_i = exp( −|x_r − x_i|² / σ_s² ) · exp( −|I_r − I_i|² / σ_r² )   (3)

The parameters σ_s² and σ_r² adjust the spatial similarity and the intensity similarity,
respectively. It follows from Eq. (3) that when two pixels are close in spatial position
and brightness, the similarity d_i is larger; when two pixels are far apart and differ
greatly in brightness, d_i is smaller. Sorting d_i from high to low, the first m pixels are
taken as the neighborhood pixels of pixel r that this paper seeks.
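As a small illustration of Eq. (3), the sketch below scores every candidate pixel in a window around r by spatial and intensity closeness and keeps the top m; the window contents, σ values, and m are invented for illustration.

```python
# Score every candidate pixel in a window around r with Eq. (3) and keep the
# m most similar ones; all data and parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(1)
window = rng.random((10, 10))               # toy luminance window around pixel r
r = np.array([5, 5])                        # position of the pixel to be colored
I_r = window[tuple(r)]
sigma_s2, sigma_r2, m = 8.0, 0.05, 8

scores = {}
for i in range(window.shape[0]):
    for j in range(window.shape[1]):
        if (i, j) == tuple(r):
            continue
        d_spatial = np.sum((r - np.array([i, j])) ** 2)   # |x_r - x_i|^2
        d_int = (I_r - window[i, j]) ** 2                 # |I_r - I_i|^2
        scores[(i, j)] = np.exp(-d_spatial / sigma_s2) * np.exp(-d_int / sigma_r2)

neighbors = sorted(scores, key=scores.get, reverse=True)[:m]
print("m most similar neighborhood pixels of r:", neighbors)
```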
Due to the close relationship between the pixels to be colored and their neighborhood
positions, the algorithm considers the brightness information of the image and combines
it with pixel location information to calculate neighborhood similarity and realize
non-local neighborhood search. With only a small amount of manual scribbling, it
achieves a good coloring effect.

3 Algorithm Steps
Our algorithm works in YUV color space. YUV [19] is a color-coding method adopted by
European TV systems, in which Y is the luminance channel, usually called luminance
or intensity, while U and V are the chrominance channels, the two components of
the color.
In this experiment, we first use the FLANN [14] method to search the K most
texture-similar non-local nearest neighbor pixels. Then, based on the KNN results, pixel
location and pixel intensity are taken as constraints of the neighborhood search, and the
group of similar neighborhood pixels is further searched. Next, the local pixel
neighborhood relations w_rs of the pixel to be colored are calculated in the luminance
channel, and these weighting coefficients are transferred to the chroma channels. Finally,
the colors of the pixels are obtained by solving a quadratic optimization problem.
The brief framework of the whole algorithm is shown in the following Fig. 2.

Fig. 2. Framework of coloring method

Step 1: Non-local Neighborhood Pixel Search. For the Y channel of a given
monochrome image, the neighborhood pixels of the pixel to be colored are searched
first, using the search method discussed above to find the required neighborhood
positions. The search window is set to 10 × 10, centered on the pixel r to be colored.
First, the algorithm uses Eq. (2) to calculate the texture similarity of the remaining
pixels in the search window to pixel r; the pixels are sorted in descending order of
texture similarity, and the first 15 are taken as the texture-similar neighborhood. On the
basis of these 15 pixels, the similarity of the neighboring pixels of pixel r is calculated
using Eq. (3); sorting in descending order of similarity, the first 8 pixels are taken as
the neighborhood pixels of pixel r.
Step 2: Calculate the Brightness Weighting Coefficient. The weighting coefficient used in this paper is also common in image segmentation algorithms and is based on the squared difference between two intensities. The algorithm takes a luminance Y(x, y, t) as input and outputs the two color components U(x, y, t) and V(x, y, t). To simplify notation, (x, y, t) is represented by bold letters (e.g., r, s), and Y(r) is the brightness of a specific pixel. Once the neighborhood pixels have been found in Step 1, their luminance weighting coefficients are calculated; the weighting coefficient wrs is given by formula (4).
\[ w_{rs} = e^{-(Y(r)-Y(s))^2 / 2\sigma_r^2} \qquad (4) \]

Where $\sigma_r^2$ is the variance of the intensities of the neighboring pixels found for pixel r, and the coefficients $w_{rs}$ are normalized to sum to 1. It can be seen from formula (4) that the more similar Y(r) is to Y(s), the larger $w_{rs}$ is; the more the two intensities differ, the smaller the weighting coefficient becomes.
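A brief sketch of this weighting step under the same assumptions, with the normalization to a unit sum made explicit (the variance floor is an illustrative safeguard):

```python
import numpy as np

def luminance_weights(Y, r, neighbors):
    """Weighting coefficients w_rs of formula (4), normalized to sum to 1."""
    Yr = Y[r]
    Ys = np.array([Y[s] for s in neighbors])
    sigma_r2 = max(float(Ys.var()), 1e-8)   # variance of the neighbors' intensities
    w = np.exp(-(Yr - Ys) ** 2 / (2.0 * sigma_r2))
    return w / w.sum()
```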

Step 3: Calculate the Chroma Value. As in Levin's method [5], this paper constrains two adjacent pixels r and s: if their intensities are similar, they should have similar colors. It is therefore desirable that U(r) and V(r) be close to the weighted average of the colors of the adjacent pixels s, thereby minimizing the difference between the pixel r and its neighbors. This is expressed by formulas (5) and (6); when the calculated value reaches its minimum, the chroma value of each pixel is obtained.

\[ J(U) = \sum_r \Big( U(r) - \sum_{s \in N(r)} w_{rs}\, U(s) \Big)^2 \qquad (5) \]

\[ J(V) = \sum_r \Big( V(r) - \sum_{s \in N(r)} w_{rs}\, V(s) \Big)^2 \qquad (6) \]
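Minimizing (5) and (6) reduces to a sparse linear system: user-marked pixels keep their given chroma, and every other pixel satisfies U(r) − Σ w_rs U(s) = 0. A minimal sketch using SciPy; the flat pixel indexing and data layout are illustrative assumptions:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def solve_chroma(n, neighbors, weights, marked):
    """Minimize J(U) of formula (5); `marked` maps pixel index -> given chroma."""
    A = lil_matrix((n, n))
    b = np.zeros(n)
    for r in range(n):
        A[r, r] = 1.0
        if r in marked:                        # user-marked pixel: chroma is fixed
            b[r] = marked[r]
        else:                                  # U(r) - sum_s w_rs * U(s) = 0
            for s, w in zip(neighbors[r], weights[r]):
                A[r, s] -= w
    return spsolve(A.tocsr(), b)               # chroma value of every pixel
```

The same solve is repeated for the V channel with formula (6).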

4 Experimental Results and Analysis


The experiments were implemented in C++ under Microsoft Visual Studio 2019. The computer configuration was: Windows 10; processor: Intel(R) Core(TM) i7-9750H CPU @ 2.60 GHz; memory: 16.00 GB; system type: 64-bit operating system. Under this environment and configuration, the algorithm colors the grayscale images.
The data set used in this paper is the SUN data set, which contains 130,519 images of extended scenes in 899 categories. Different kinds of pictures are selected from the SUN data set to display the coloring effect. Figure 3 shows the original color images of the images to be colored, and Fig. 4 compares the coloring effect of this algorithm with the Levin algorithm [5] and the Yatziv algorithm [6]. In Fig. 4(a), the original color image is first converted into a gray image and then manually marked according to the requirements of the optimized coloring method; the user marks a color for each area of the image. Figures 4(b) and (c) show the coloring effect of the Levin and Yatziv algorithms in turn. It can be seen from Fig. 4 that the method of Levin et al. propagates color according to the similarity of color and brightness between spatially adjacent pixels, and can only propagate color within a smooth spatial neighborhood. Therefore, when the painted lines are insufficient, serious color penetration occurs. The Yatziv algorithm, like the Levin algorithm, also suffers from color penetration to varying degrees. The coloring effect of the new method proposed in this paper is shown in Fig. 4(d); with the same user input as the previous two methods, it keeps the coloring authentic and accurate. The method can spread color information globally because the feature space contains both the texture and the brightness features of the global color. When the marked color lines are insufficient, image color penetration is better suppressed.

Fig. 3. Original color image

Tables 1 and 2 list the PSNR and SSIM results of the grayscale images after coloring by the different methods. It can be seen that the output of the coloring algorithm in this paper is almost always superior to that of the Levin and Yatziv algorithms. As the bathroom image in Fig. 4 shows, when the user-marked area is small and the boundary is pronounced, the Levin and Yatziv algorithms suffer more boundary color penetration. The proposed method handles these cases better, and the corresponding PSNR values are also significantly higher than those of the other two methods.
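For reference, PSNR is computed from the mean squared error between the original color image and the colorized result; a small sketch assuming 8-bit images:

```python
import numpy as np

def psnr(original, colorized, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally shaped images."""
    diff = original.astype(np.float64) - colorized.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```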
Fig. 4. Comparison of the coloring effect between the algorithm in this paper and the Levin and Yatziv algorithms. Rows: archive, bathroom, beach, valley; columns: (a) stroke image, (b) Levin, (c) Yatziv, (d) this paper

Table 1. The PSNR results of the new algorithm, the Levin algorithm, and the Yatziv algorithm.

Name     | Levin (dB) | Yatziv (dB) | This paper (dB)
Archive  | 29.54      | 26.12       | 30.62
Bathroom | 29.64      | 28.72       | 33.91
Beach    | 29.93      | 29.81       | 29.95
Valley   | 27.15      | 24.58       | 27.29
Table 2. The SSIM results of the new algorithm, the Levin algorithm, and the Yatziv algorithm.

Name     | Levin | Yatziv | This paper
Archive  | 0.961 | 0.948  | 0.963
Bathroom | 0.984 | 0.980  | 0.986
Beach    | 0.952 | 0.950  | 0.954
Valley   | 0.947 | 0.937  | 0.946

5 Conclusion
Existing optimization-based coloring algorithms suffer from complicated user interaction and easy color penetration at image boundaries. To address these problems, we propose a new search method that combines non-local neighborhood search with an optimized coloring algorithm, effectively uniting the advantages of local and non-local coloring. When the user-marked range is insufficient, color penetration at image boundaries is effectively suppressed. The proposed method can also serve other fields, such as animation, film and television, medicine, advertising design, and decoration design. The existing algorithm can still be improved: because the neighborhood pixels in the non-local neighborhood search are computed on the CPU, execution efficiency is limited. The next step is therefore to implement the calculation on the GPU with CUDA programming and reduce the running time of the algorithm [20].

Acknowledgment. This work was supported by the grants of the Nature Science Foundation of
Heilongjiang Province (LH2021F035).

References
1. Burns, G.: Colorization [EB/OL]. http://www.museum.tv/archives/etv/
2. Lagodzinski, P., Smolka, B.: Application of the extended distance transformation in digital
image colorization. Multimedia Tools Appl. 69(1), 111–137 (2014). https://doi.org/10.1007/
s11042-012-1246-2
3. Lin, S., Ritchie, D., Fisher, M., Hanrahan, P.: Probabilistic color-by-numbers. ACM Trans.
Graphics 32(4), 1–12 (2013)
4. Jia, Y., Hu, S.: Interactive image dyeing algorithm based on graph segmentation. Chin. J. Comput. 29(3), 508–513 (2016)
5. Levin, A., Lischinski, D., Weiss, Y.: Colorization using optimization. ACM Trans. Graphics
(TOG) 23(3), 689–693 (2004)
6. Yatziv, L., Sapiro, G.: Fast image and video colorization using chrominance blending. IEEE Trans. Image Process. 15(5), 1120–1129 (2006)
7. Fattal, R.: Edge-avoiding wavelets and their applications. ACM Trans. Graphics (TOG) 28(3),
1–10 (2009)
8. Wang, M., Chen, Z.: A Color Transfer algorithm based on neighborhood correlation and
optimization techniques. In: Fourth International Symposium on Computational Intelligence
and Design, vol. 2, pp. 31–34 (2011)
9. Charpiat, G., Hofmann, M., Schölkopf, B.: Automatic image colorization via multimodal
predictions. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008. LNCS, vol. 5304,
pp. 126–139. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88690-7_10
10. Sheng, B., Sun, H., Chen, S., et al.: Colorization using the rotation-invariant feature space.
IEEE Comput. Graphics Appl. 31(2), 24–35 (2011)
11. Musialski, P., Cui, M., Ye, J., et al.: A framework for interactive image color editing. Visual
Comput. 29(11), 1173–1186 (2013)
12. Cheng, Z., Yang, Q., Sheng, B.: Deep colorization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 415–423 (2015)
13. Iizuka, S., Simo-Serra, E., Ishikawa, H.: Let there be color: joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. ACM Trans. Graphics (TOG) 35(4), 110 (2016)
14. Muja, M., Lowe, D.G.: Fast approximate nearest neighbors with automatic algorithm
configuration. In: VISAPP, no. 1, pp 331–340 (2009)
15. Lee, P., Wu, Y.: Non-local matting. In: Proceedings of the 2011 IEEE conference on com-
puter vision and pattern recognition, CVPR ’11, pp 2193–2200. IEEE Computer Society,
Washington, DC (2011)
16. Chen, Q., Li, D., Tang, C.-K.: KNN matting. IEEE Trans. Pattern Anal. Mach. Intell. 35(9), 2175–2188 (2013)
17. Huang, H., Li, X., Zhao, H., Nie, G., Hu, Z., Xiao, L.: Manifold-preserving image colorization
with non-local estimation. Multimedia Tools Appl. 74(18), 7555–7568 (2015)
18. Wang, H., Gan, Z., Zhang, Y., Zhu, X.: Novel colorization method based on correlation neighborhood similarity pixels priori. In: Proceedings of 2012 IEEE 11th International Conference on Signal Processing (ICSP 2012), pp. 953–956. IEEE (2012)
19. Jack, K.: Video Demystified, 4th edn. Elsevier Science & Technology (2001)
20. Joshi, M., Nkenyereye, L., Joshi, G., et al.: Auto-colorization of historical images using deep
convolutional neural networks. Mathematics 8(12), 2258 (2020)
Research on Digital Twin for Building Structure
Health Monitoring

Su Jincheng1,2(B)
1 Postdoctoral Research Workstation of Northeast Asia Service Outsourcing Research Center,
Harbin University of Commerce, 150028 Harbin, Heilongjiang Province,
People’s Republic of China
2 School of Energy and Civil Engineering, Harbin University of Commerce, 150028 Harbin,

Heilongjiang Province, People’s Republic of China

Abstract. Existing buildings are vulnerable to damage due to adverse factors such as multi-field coupling, material degradation, and corrosion. With the help of
the big data provided by the health monitoring system, the digital twin uses data
analysis to diagnose the current state of the building, predict the future performance
of the building, and give real-time decisions on the operations to be taken in the
future according to the information provided by the data. The technical system of the digital twin and the theoretical and technical composition of structural health monitoring are described, and the key problems and research paradigms of building structure health management are reviewed on the basis of the digital twin model. With
the breakthrough of technology, the accumulation of data, and the improvement
of computing power, digital twins for building structure health management will
become a trend.

Keywords: Digital twin · Structure health monitoring · Big data

1 Introduction
Buildings are man-made objects, and man-made objects have a finite life. The design service life of ordinary houses is 50 years; within the design reference period, a building should serve its intended purpose and fulfill its predetermined functions with only normal maintenance and no overhaul. Buildings are vulnerable to damage due to adverse factors such
as multi-field coupling, material degradation, and corrosion. In engineering practice,
structural damage means that the building cannot work in the best state or achieve the
design performance under the target service environment. Generally, structural health
monitoring is used to identify the damage of engineering structures and provide early
warning.
Since the end of the 20th century, civil structures’ large-scale, complex and intel-
ligent development has made structural health monitoring more and more important.
Structural health monitoring extends from single load stress monitoring to structural
damage detection, rapid positioning, and life prediction. Structural health monitoring is
a process of structural damage identification and early warning using the data obtained

by sensors. Damage to engineering structures is generally reflected in the degradation of service performance, and structural health monitoring generally relies on the structure's measured response for inverse damage diagnosis and life prediction.
Measurement data alone, or a digital model alone, is not enough to support reliable diagnosis and prediction. Only by using data to compensate for model error and using the model to offset the limitations of the data can more accurate simulations and predictions be made with a dynamically updated model. The digital twin emphasizes the use of all data, the diversity of data provided by sensors, and data-driven knowledge acquisition to increase cognition.
Digital China has become a national strategy. It is proposed to “meet the digital age,
activate the potential of data elements, promote the construction of a network power,
accelerate the construction of digital economy, digital society, and digital government,
and drive the transformation of production mode, lifestyle and governance mode with
digital transformation as a whole”. It is proposed to explore and build a digital twin city.
Guidance on low-carbon county construction calls for "vigorously developing green buildings and energy-saving buildings… promoting the energy-saving and water-saving transformation and functional improvement of old communities".
old urban communities is a major livelihood project and development project. It is
of great significance to meet the people’s needs for a better life, promote the benefit
of people’s livelihood and expand domestic demand, promote the transformation of
urban renewal, development and construction mode, and promote high-quality economic
development. In 2020, the guiding opinions of the general office of the State Council
on comprehensively promoting the transformation of old urban communities made it
clear that all localities should reasonably define the scope of transformation objects in
their own areas in combination with reality. Taking Harbin as an example, 10.03 million
square meters of old residential area will be reconstructed in 2021; determining which
buildings need to be reconstructed needs a decision-making basis. The effectiveness of
the transformation needs to be monitored and evaluated.
With the help of the big data provided by the health monitoring system, the digital
twin uses data analysis to diagnose the current state of the building, predict the future
performance of the building, and give real-time decisions on the operations to be taken
in the future according to the information provided by the data.
Therefore, the digital twin research for building structure health management has
the following practical significance:

(1) Evaluate the operation of the existing state of the building.


(2) Predict the life of the building.
(3) Provide a decision-making basis for building reconstruction.
(4) Monitor and evaluate the effect of building reconstruction.
(5) It provides a reference for forming the institutional framework, policy system, and
working mechanism for the transformation of old urban communities.

2 Digital Twin
The term “digital twin” comes from the US Department of Defense and is used for
aerospace vehicles’ health maintenance and guarantee [1]. In 2011, the US Air Force
Research Laboratory proposed applying digital twin technology to the life management
of aircraft structure, resulting in the concept of aircraft digital twin, to solve aircraft oper-
ation and maintenance in complex service environments in the future. The application
of the concept of the digital twin has also been further extended to urban management.
Xiong’an New Area proposes to “adhere to the synchronous planning and construction
of digital city and real city”. Based on the theory of urban complex adaptive system and
digital twin technology, the “hidden order” of the urban system can be made explicit
by building a complex system of mutual mapping and collaborative interaction between
physical city and digital city [2].
At present, digital twins have many different definitions and understandings in aca-
demic circles. It is widely accepted that the digital twin uses physical models, sensor
updates, operation history, and other data. It is a simulation process integrating multi-
disciplines, multi-physical fields, and multi-scale and multi-probability. It completes the
mapping in the virtual space and reflects the whole life cycle process of the corresponding
entity equipment.
In other words, the digital twin technology system must support elements such as
virtual space, physical space, and two-way information flow and play a role in its whole
life cycle. Figure 1 shows the technical system of digital twins.

Modeling

Virtual Space
Simulation and Optimization

Operation Management
Technology
System of Physical Space Automation
Digital Twins

Digital Analysis

Information integration and


CoordinaƟon feedback of virtual space and
physical space

Fig. 1. Technology system of digital twins

3 Key Issues
Monitoring is a direct means for humans to scientifically understand the world and the basis and premise for grasping objects and their laws of development. The health monitoring of engineering structures deploys large-scale, multi-species, distributed sensor networks together with acquisition, transmission, management, analysis, and early-warning systems. It is used to perceive, identify, diagnose, and evaluate in real time the damage and safety state of the structure and their evolution, to reveal the whole-life behavior mechanism of the real structure under the coupled action of real loads and environment, and to realize the intelligent, biomimetic functions of self-sensing and self-diagnosis [3]. Figure 2 shows the theoretical system of structural health management, and Fig. 3 shows the technical composition of the structural health monitoring system.

Damage Identification Safety Assessment


Model Modification Long Term Evolution Law

Data Collection
Data Transmission
Data Management

Early Warning
Validate Existing Methods
Evaluation
Developing New Methods
Decision Making

Fig. 2. Theoretical system of structural health management

Structural health monitoring includes two major scientific problems: sensing and
data [4]. Since 2005, the application of a structural health monitoring system has pro-
duced more and more monitoring data. These data contain important information such as
structural load and environmental action, behavior mechanism, performance evolution
law, evolution level, etc. A key problem is how to mine and analyze monitoring data to
find structural load and environmental action, structural response, behavior mechanism
and performance evolution law, evaluate and predict structural safety level and change
law, and establish structural health monitoring theory based on monitoring data.

4 Research Paradigm

The digital twin has three meanings for structural health monitoring: deepening under-
standing, strengthening prediction, and in-depth optimization. The main key problems
are multi-physical and multi-scale modeling with high fidelity, prediction and analysis
with high confidence, and high real-time data interaction.
Digital twin emphasizes the use of all data and the diversity of data provided by
sensors and data to drive the acquisition of knowledge to increase cognition. Digital twin
integrates big data, dynamic data-driven application systems, and machine intelligence,
which has changed the research paradigm of structural health management.
Fig. 3. Technical composition of the structural health monitoring system (data acquisition, transmission, and management: intelligent sensor network, multi-data synchronous acquisition, remote data transmission, database system, graphic display; analysis, modeling, and diagnosis: damage identification, model modification, structure reanalysis, health and safety assessment, reliability and life prediction; remote network monitoring and control: basic structure information, monitoring system information, monitoring data, safety warning and safety assessment information)

Big data analysis finds hidden patterns, unknown relationships, and other useful
information that can help decision-making from big data and convert the above infor-
mation into executable knowledge. There are three main modes of big data analysis:
descriptive analysis, predictive analysis, and prescriptive analysis. Predictive analysis is
the core of big data analysis by mining patterns or relationships in big historical data,
analyzing trends, production prediction models and scores to evaluate possible events in
the future and predict what may happen in the future. The descriptive analysis summa-
rizes the original data into an understandable format and reveals what happened, which
is the basis of big data analysis. The prescriptive analysis is based on the prediction
model to analyze the possible different results under different decisions and determine
the strategies to be adopted.
The US National Science Foundation formally put forward the dynamic data-driven application systems concept in 2000, a new research model of cooperating application and sensing systems. For complex systems such as building structures, the
system state evolves with the system’s operation and has dynamic characteristics. The
dynamic test data is used to modify the analysis model in real-time and adaptively,
eliminate the influence of uncertain factors to the greatest extent, give more accurate
results, and control the implementation of the actual system by participating in the
system decision-making [5].
Driven by big data technology, the digital twin is becoming a reality. A digital twin
can track the system’s state in real-time, carry out online state evaluation, improve the
response to faults, speed up decision-making, and realize system optimization. Unlike
the post-diagnosis mode of diagnosis and maintenance after the system has problems, the
digital twin is transformed into life-cycle management, pre-diagnosis of structural state,
and implementation of preventive maintenance, which greatly increases the availability
and service life of the system and reduces the maintenance cost.
The uncertainty in structural health monitoring comes from the uncertainty in the structure's own parameters and modeling and from the measurement noise of the sensors. Studying uncertainty-aware structural damage identification methods therefore allows both the model uncertainty and the measurement noise to be properly accounted for.
At present, structural uncertain damage identification methods can be divided into
two categories: structural damage identification methods based on probability and struc-
tural damage identification methods based on information fusion theory. The former
includes the structural damage identification method based on Bayesian probability the-
ory and the structural damage identification method based on the stochastic finite ele-
ment. The latter includes the structural damage identification method based on Dempster-
Shafer (D-S) evidence theory and the information fusion structural damage identification
method based on Bayesian decision theory.
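As a concrete illustration of the evidence-fusion branch, the following is a minimal sketch of Dempster's rule of combination for two mass functions over a frame of discernment; the damage states and mass values are hypothetical:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) by Dempster's rule."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb                  # K: total conflicting mass
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical evidence from two sensors about the states {damaged, healthy}
m1 = {frozenset({"damaged"}): 0.7, frozenset({"damaged", "healthy"}): 0.3}
m2 = {frozenset({"damaged"}): 0.6, frozenset({"healthy"}): 0.2,
      frozenset({"damaged", "healthy"}): 0.2}
print(dempster_combine(m1, m2))
```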
Unlike finite element modeling, a data-driven (statistical) approach using monitoring
data provides information about the actual operational performance of a specific infras-
tructure [6–9]. The modeling workload related to data-driven technology is relatively
lower than that of finite element modeling. However, even the most complex data-driven
model updated by the artificial neural network, Gaussian process regression, or Bayesian
model requires many actual training data for prediction. Ye et al. described the com-
parison between finite element and data-driven methods based on hybrid or digital twin
methods [10]. These challenges drive research in the field of model updating and system
identification, which attempts to use finite element models and data-driven models (i.e.,
models derived from sensor data) to better predict system behavior [11]. However, current modeling methods seem unable to synthesize the uncertainty of the measurement data with the predictions of a finite element model of specified inherent error so as to generate continuous predictions as new measurement data become available. This capability is essential to the realization of digital twins.

5 Conclusions
In this study, the digital twin of building structure health management is studied. The
digital twin for building structure health management has the following meanings:

(1) Evaluate the operation of the existing state of the building.


(2) Predict the life of the building.
(3) Provide a decision-making basis for building reconstruction.
(4) Monitor and evaluate the effect of building reconstruction.
(5) It provides a reference for the formation of the institutional framework, policy
system, and working mechanism for the transformation of old urban communities

With the increase of service time of buildings, the demand for real-time and efficient
structural health management is becoming stronger and stronger. The digital twin will
play a great role in this field and form a life cycle management paradigm based on the
digital twin.
Acknowledgment. This work was supported by University Nursing Program for Young Scholars
with Creative Talents in Heilongjiang Province (UNPYSCT-201720 7) and Heilongjiang Natural
Science Foundation (E2017055).

References
1. Tuegel, E.J., Ingraffea, A.R., Eason, T.G., et al.: Reengineering aircraft structural life
prediction using a digital twin. Int. J. Aerosp. Eng. 2011, 1687–5966 (2011)
2. Zhou, Y., Liy, C.C.: The logic and innovation of building digital twin city in Xiong’an new
area. Urban Dev. Stud. 25(10), 60–67 (2018)
3. Ou, J.: Research and practice of intelligent sensing technologies in civil structural health
monitoring in the mainland of China. In: Nondestructive Evaluation and Health Monitoring
of Aerospace Materials, Composites, and Civil Infrastructure, vol. 6176, pp. 293–304 (2006)
4. Li, H., Bao, Y., Ou, J.: Structural damage identification based on integration of information
fusion and shannon entropy. Mech. Syst. Signal Process. 22(6), 1427–1440 (2008)
5. Frederica, D., Rotea, M.: Dynamic data-driven applications systems. In: Proceedings of the
2006 ACM/IEEE Conference on Supercomputing, p. 2 (2006)
6. Eky, F., et al.: A Self-Sensing Digital Twin of a Railway Bridge Using the Statistical Finite
Element Method. arXiv: Numerical Analysis (2021)
7. Girolami, M., Febrianto, E., Yin, G., et al.: The statistical finite element method (statFEM)
for coherent synthesis of observation data and model predictions. Comput. Methods Appl.
Mech. Eng. 375, 113533:1-113533:32 (2021)
8. Malekzadeh, M., Atia, G., Catbas, F.N.: Performance-based structural health monitoring
through an innovative hybrid data interpretation framework. J. Civ. Struct. Health Monit.
5, 287–305 (2015)
9. Liu, X., et al.: Application of Structural Health Monitoring for Structural Digital Twin.
Offshore Technology Conference Asia (2020)
10. Ye, C., Butler, L., Bartek, C., et al.: A Digital twin of bridges for structural health monitoring.
In: 12th International Workshop on Structural Health Monitoring (2019)
11. Upmanyu, T., Hussain, S., Bharadwaj, S., Saxena, S.: Digital health monitoring and pervasive
healthcare using cloud-connected smart wearable devices. Int. J. u - and e – Serv. Sci. Technol.
10(1), 289–298 (2017)
Research on Modeling and Solution of Reactive
Power Optimization in Time Period
of Power System

Chen Deyu(B)

School of Computer and Information Engineering,


Harbin University of Commerce, Harbin 150028, China
cdy1974610@sohu.com

Abstract. In the reactive power and voltage optimization of the power grid, discrete control equipment such as capacitors and transformer taps acts once per time period to optimize the integral sum of multi-objective functions such as line loss, total generator reactive power reserve, and the equalization of the reactive power reserve of each generator at each time. This can be transformed into optimizing the expectation and variance of the objective functions over the period, yielding the optimization model of the discrete control variables in the period. As a continuous control variable, the generator terminal voltage can additionally be adjusted at each time to further optimize the period objective functions, which defines the period optimization model of the continuous control variables. The discrete and continuous optimization models are solved by cross iteration until convergence, and an adaptive weight genetic algorithm is proposed for the solution. A simulation example verifies the effectiveness of the model and algorithm.

Keywords: Reactive power optimization · Time period · Expectation ·


Variance · Adaptive weight genetic algorithm

1 Introduction
In the actual production of the power grid, the service life of static compensation devices,
such as capacitors, will be reduced with frequent switching under working conditions.
At the same time, the static compensation devices are discrete and jump-type control,
and large changes in operating state after control may also affect the reliability of system
operation. Therefore, discrete control devices, such as capacitors, are generally adjusted
once in a period. This value is adapted to the working conditions at each time in this
period as far as possible, and its action frequency is reduced as far as possible. In contrast,
the finer control can be tracked and regulated by continuous control devices such as gen-
erator sets. In production practice, most of the actions of discrete and continuous control
devices are arranged according to operation experience. However, with the complexity
and changeability of power grid structure and operation mode, the control strategies
obtained from operation experience may be inapplicable. A strict mathematical model

of reactive power optimization is established in a period with obvious physical signif-


icance, and a reasonable optimization algorithm is adopted to solve it. The obtained
control strategy can guide the production practice more effectively, which has urgent
practical and theoretical significance. The reactive power optimization problem can be
defined as the reactive power optimization in a period. The discrete control equipment
can only act once in a period due to the limitation of the number of actions. In contrast,
the continuous control equipment can act at every moment, to coordinate the two to
realize the reasonable reactive power distribution of the system and further reduce the
energy consumption in a certain period, improve the voltage quality at each time and
even optimize the voltage stability.
Relevant research can be used for reference. References [1, 2] divide the day into designated segments according to the changing trend of the load curve, with the optimization goal of each segment being minimum energy consumption, solved by a genetic algorithm. Reference [3] defines a load-curve segmentation formula that merges working conditions with similar variation laws and adjacent times into one period, calculates the equivalent load of the period, and then carries out reactive power optimization with minimum energy consumption in the period. Reference [4] combines the optimized line-loss curves without the constraint on the number of control-equipment actions, derives the equivalent load of each node in each period after the combination, and performs static optimization on the equivalent load to solve the dynamic optimization problem.
The introduction of artificial intelligence is a research hotspot of reactive power opti-
mization, and more research has been carried out on the reactive power optimization of
new energy access [5–8].
In this paper, based on the solution ideas proposed in the above references, the typical
time periods such as “peak, valley and level” are divided by the recursive merging method
of working condition trends, and the number of divided periods is the action number
constraint of discrete control equipment. The reactive power optimization modeling
and solution of each period are studied emphatically. The action times constraint is
generally obtained by considering the service life and operation reliability of discrete
control equipment such as capacitors. Discrete control devices such as capacitors act
once to consider reactive power optimization at each time, and at the same time, set
the terminal voltage of continuous control devices such as generators at each time. A
time-period reactive power optimization model for coordinating discrete and continuous
control equipment is established, and its solution algorithm is given. The rationality of
the proposed model and solution algorithm is verified based on a simulation example.

2 Reactive Power Optimization Model in the Time Period


\[
\max\ \Big[\, F_1 = -\int_T P_{loss}(x(t), u_c(t), u_d)\,dt,\quad
F_2 = \int_T \Big( \sum_{i \in nG} QG_i^m(x(t), u_c(t), u_d) \Big) dt,\quad
F_3 = -\int_T \Big( \sum_{i,j \in nG} QG_{i,j}^m(x(t), u_c(t), u_d) \Big) dt \,\Big]
\]

\[
\text{s.t.}\quad
\begin{cases}
f(x(t), u_c(t), u_d) = 0\\
\underline{h} \le h(x(t), u_c(t), u_d) \le \overline{h}\\
\underline{u}_c \le u_c(t) \le \overline{u}_c\\
\underline{u}_d \le u_d \le \overline{u}_d
\end{cases}
\qquad (1)
\]
Where T is the time period; t is each time within T (variables written as functions of t take their value at each time); $u_d$ is the discrete control variable, specifically capacitor switching and transformer tap position, which acts only once in time period T, with $\underline{u}_d$, $\overline{u}_d$ its lower and upper limits; $u_c(t)$ is the continuous control variable, such as the generator terminal voltage setting, which acts at every moment of T, with $\underline{u}_c$, $\overline{u}_c$ its lower and upper limits; x(t) is the state variable, such as the node voltage amplitudes and phase angles; f(·) is the system power flow equality constraint; h(·) is the inequality constraint on state and function variables, such as node voltage and generator reactive power output, with $\underline{h}$, $\overline{h}$ the constraint limits. $P_{loss}(t)$ is the line loss at each time t in period T, so objective $F_1$ is the energy consumption in T. $QG_i^m(t)$ is the reactive power margin of generator i at each time, $\sum_{i \in nG} QG_i^m(t)$ is the sum of the reactive power margins of all nG generators at time t, and objective $F_2$ is the integral of this sum over T. $QG_{i,j}^m(t)$ is the difference in reactive power margin between generators i and j at time t, $\sum_{i,j \in nG} QG_{i,j}^m(t)$ is the sum of the margin differences over all generator pairs at time t, and objective $F_3$ is the integral of this sum over T. Maximizing $F_1$ means minimizing the energy consumption in the period; maximizing $F_2$ maximizes the generator reactive power margin in the period, ensuring that the generators can quickly provide more reactive power support to the system in an emergency, which benefits system stability; maximizing $F_3$ balances the reactive power margins of the generators at each time in the period, since a balanced distribution of reactive power margin is more conducive to system stability.

2.1 Reactive Power Optimization Model with Discrete Control Variables in Time
Period

Assume that the continuous control variable $u_c(t)$ is known at each time in model (1). The discrete control variable $u_d$ acts once in time period T to ensure the optimality of the objective functions $F_1$, $F_2$, and $F_3$ over T; the objective functions take the form of the integral sum of the line loss, the generator reactive power margin, and the generator reactive-power-margin equalization function over T.
The integral-sum objective function takes the general form

\[ \int_T f(t)\,dt = f(t_1) \cdot \Delta t_1 + f(t_2) \cdot \Delta t_2 + \cdots + f(t_j) \cdot \Delta t_j \qquad (2) \]

which can further be written in the equivalent form

\[ \frac{1}{T}\int_T f(t)\,dt = f(t_1) \cdot \frac{\Delta t_1}{T} + f(t_2) \cdot \frac{\Delta t_2}{T} + \cdots + f(t_j) \cdot \frac{\Delta t_j}{T} \]

and transformed into the mathematical-expectation form

\[ \int_T f(t)\,dt = E(f(t)) \cdot T , \]

where E(f(t)) is the expectation of the function f(t).


Because T is a constant, maximizing the integral-sum objective function is further equivalent to maximizing the mathematical expectation:

\[ \max \int_T f(t)\,dt = \max\big( E(f(t)) \big) \qquad (3) \]

Using the equivalent expression of formula (3) to replace the objective function in
the form of integral sum in optimization model (1), the physical meaning will be more
obvious, that is, the discrete control variables are optimized once in a time period, which
improves the expected value of the objective function at each time in the time period,
that is, the objective function at each time in the time period is optimized as a whole.
At the same time, if the optimization objective only ensures the optimization of
the objective function of the integral sum in time period T , the objective function
may be improved a lot at some times, but the objective function may be much worse
at some times, resulting in the problem of uneven optimization. Therefore, based on
the mathematical expectation expression of the objective function at each time in the
above time period, the mathematical variance of the objective function at each time is
introduced to balance the optimization at each time in a time period:
\[ D(f(t)) = \frac{\Delta t_1}{T}\big(f(t_1) - E(f(t))\big)^2 + \frac{\Delta t_2}{T}\big(f(t_2) - E(f(t))\big)^2 + \cdots + \frac{\Delta t_j}{T}\big(f(t_j) - E(f(t))\big)^2 \qquad (4) \]
Minimizing D(f(t)) thus represents equalizing the optimization of the objective function across the times of the period.
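A small numerical sketch of these period statistics, assuming sampled objective values f(t_i) with durations Δt_i (the names and example numbers are illustrative):

```python
import numpy as np

def period_expectation_variance(f_vals, dt):
    """E(f) and D(f) over a period per formulas (2)-(4), weighted by durations."""
    f_vals, dt = np.asarray(f_vals, float), np.asarray(dt, float)
    w = dt / dt.sum()                       # the weights Δt_i / T
    E = np.sum(w * f_vals)                  # expectation of f(t) over the period
    D = np.sum(w * (f_vals - E) ** 2)       # variance of f(t) over the period
    return E, D

# Example: a line-loss value for each of the hours 8, 9, 10, 11
print(period_expectation_variance([0.08, 0.07, 0.09, 0.10], [1, 1, 1, 1]))
```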
Assuming that the continuous control variables are given and synthesizing the expectation and variance expressions (3) and (4), the optimization model of the discrete control variables in the time period, obtained by decomposing model (1), can be expressed as:

\[
\max\ \Big[\, F_{E,1} = -E\big(P_{loss}(u_d)\big),\quad
F_{E,2} = E\Big(\sum_{i \in nG} QG_i^m(u_d)\Big),\quad
F_{E,3} = -E\Big(\sum_{i,j \in nG} QG_{i,j}^m(u_d)\Big) \,\Big]
\]

\[
\min\ \Big[\, F_{D,1} = D\big(P_{loss}(u_d)\big),\quad
F_{D,2} = D\Big(\sum_{i \in nG} QG_i^m(u_d)\Big),\quad
F_{D,3} = D\Big(\sum_{i,j \in nG} QG_{i,j}^m(u_d)\Big) \,\Big]
\]

\[
\text{s.t.}\quad
\begin{cases}
f(x(t), u_c^*(t), u_d) = 0\\
\underline{h} \le h(x(t), u_c^*(t), u_d) \le \overline{h}\\
\underline{u}_d \le u_d \le \overline{u}_d
\end{cases}
\qquad (5)
\]

Where $u_c^*(t)$ is the given optimal value of the continuous control variable at each time t; the only control variable in this model is the discrete variable $u_d$, and the other variables have the same meanings as above. Model (5) means that the discrete control variable acts once in time period T so that the expectation E(·) and variance D(·) of the objective functions are optimized.

2.2 Reactive Power Optimization Model with Continuous Control Variables in Time Period

After model (5) has been solved, the discrete control variable $u_d^*$ of the time period is given, and model (1) can be decomposed to establish a continuous control variable optimization model at each time t of the period (omitting t):

\[
\max\ \Big[\, F_1 = -P_{loss}(x, u_c, u_d^*),\quad
F_2 = \sum_{i \in nG} QG_i^m(x, u_c, u_d^*),\quad
F_3 = -\sum_{i,j \in nG} QG_{i,j}^m(x, u_c, u_d^*) \,\Big]
\]

\[
\text{s.t.}\quad
\begin{cases}
f(x, u_c, u_d^*) = 0\\
\underline{h} \le h(x, u_c, u_d^*) \le \overline{h}\\
\underline{u}_c \le u_c \le \overline{u}_c
\end{cases}
\qquad (6)
\]
The only control variables of this model are the continuous variables $u_c$; the other variables have the same meanings as above.

2.3 Coordinated Optimization of Continuous and Discrete Control Variables in Time Period

The reactive power optimization model (1) in the time period is decomposed into the discrete control variable optimization model (5) and the continuous control variable optimization model (6), and models (5) and (6) are solved through cross iteration until convergence.
The specific form of the optimization model (5) of the discrete control variables in the time period is as follows:

\[
\max\ \Big[\, F_{E,1} = -E\big(P_{loss}(c, b)\big),\quad
F_{E,2} = E\Big(\sum_{i \in nG} QG_i^m(c, b)\Big),\quad
F_{E,3} = -E\Big(\sum_{i,j \in nG} QG_{i,j}^m(c, b)\Big) \,\Big]
\]

\[
\min\ \Big[\, F_{D,1} = D\big(P_{loss}(c, b)\big),\quad
F_{D,2} = D\Big(\sum_{i \in nG} QG_i^m(c, b)\Big),\quad
F_{D,3} = D\Big(\sum_{i,j \in nG} QG_{i,j}^m(c, b)\Big) \,\Big]
\]

\[
\text{s.t.}\quad
\begin{cases}
f\big(\theta, V_l, u_c^k(t), c, b\big) = 0\\
\underline{V}_l \le V_l \le \overline{V}_l\\
\underline{QG} \le QG \le \overline{QG}\\
\underline{c} \le c \le \overline{c}\\
\underline{b} \le b \le \overline{b}
\end{cases}
\qquad (7)
\]

Where $u_c^k(t)$ is the continuous control variable at each time t of the k-th iteration, i.e., the generator terminal voltage setting, which is a given value in the k-th iteration of model (7); c is the equivalent susceptance after the capacitor is switched once in the period, with $\underline{c}$, $\overline{c}$ its lower and upper limits; b is the transformation ratio after the transformer tap is set once in the period, with $\underline{b}$, $\overline{b}$ its lower and upper limits; c and b are the discrete control variables of model (7); $\theta$ is the voltage phase angle of all nodes; $V_l$ is the load-node voltage amplitude, with limits $\underline{V}_l$, $\overline{V}_l$, generally 0.95 and 1.05; QG is the generator reactive power output, with limits $\underline{QG}$, $\overline{QG}$.
$QG_i^m$ is the reactive power margin of the i-th generator:

\[ QG_i^m = \frac{\overline{QG}_i - QG_i}{\overline{QG}_i - \underline{QG}_i} \]

If the generator reactive power output is within its limits, the maximum value of $QG_i^m$ is 1, indicating ample reactive power margin, and the minimum value is 0, indicating that the margin is exhausted and the upper limit has been reached.
The concrete form of the continuous control variable optimization model (6) at each time t in the period is as follows (omitting t):

\[
\max\ \Big[\, F_1 = -P_{loss}(V_g),\quad
F_2 = \sum_{i \in nG} QG_i^m(V_g),\quad
F_3 = -\sum_{i,j \in nG} QG_{i,j}^m(V_g) \,\Big]
\]

\[
\text{s.t.}\quad
\begin{cases}
f\big(\theta, V_l, V_g, u_d^k\big) = 0\\
\underline{V}_l \le V_l \le \overline{V}_l\\
\underline{V}_g \le V_g \le \overline{V}_g\\
\underline{QG} \le QG \le \overline{QG}
\end{cases}
\qquad (8)
\]
Where $u_d^k$ is the value of the discrete control variables set in the k-th iteration, i.e., the equivalent susceptance after capacitor switching and the transformer ratio after tap setting, which are given values; $V_g$ is the generator node voltage amplitude (including the PV nodes and the balance node) and is the continuous control variable of model (8), with $\underline{V}_g$, $\overline{V}_g$ its amplitude limits, generally 0.9 and 1.1; the other variables have the same meanings as above.

The discrete and continuous control variable optimization models are cross-iterated k times until

\[ \max\big|u_c^{k+1}(t) - u_c^k(t)\big| < \varepsilon_c, \qquad \max\big|u_d^{k+1} - u_d^k\big| < \varepsilon_d , \]

at which point the iteration ends; the optimal solution of the period reactive power optimization model is then the $u_c(t)$ and $u_d$ of the (k+1)-th iteration, with t each time in period T.
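A skeleton of this cross-iteration scheme with the two sub-solvers passed in as callables; the stub solvers in the demo are placeholders, not the genetic-algorithm solvers of Sect. 3:

```python
import numpy as np

def cross_iterate(solve_discrete, solve_continuous, uc0, ud0,
                  eps_c=1e-3, eps_d=1e-3, max_iter=20):
    """Alternate models (7) and (8) until both control variables converge."""
    uc, ud = np.asarray(uc0, float), np.asarray(ud0, float)
    for _ in range(max_iter):
        ud_new = solve_discrete(uc)           # one discrete action per period
        uc_new = solve_continuous(ud_new)     # one voltage setting per time t
        if (np.max(np.abs(uc_new - uc)) < eps_c and
                np.max(np.abs(ud_new - ud)) < eps_d):
            return uc_new, ud_new
        uc, ud = uc_new, ud_new
    return uc, ud

# Demo with trivial stand-in solvers that have fixed points at 0.1 and 1.0
uc, ud = cross_iterate(lambda uc: np.full(2, 0.1), lambda ud: np.full(4, 1.0),
                       uc0=np.ones(4) * 1.05, ud0=np.zeros(2))
```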

3 Solution of Reactive Power Optimization Model in Time Period


A genetic algorithm suits this optimization problem, in which the model is complex and some control variables are discrete, and it improves the chances of finding the optimum from a global perspective. The selection operation of the genetic algorithm in this paper adopts the mature roulette-wheel method; the crossover operation uses multi-point crossover, with the number of crossover points set to the number of control variables and the crossover rate chosen between 0.2 and 0.9; the mutation operation randomly mutates each individual at a set mutation rate, chosen between 0.01 and 0.1, to keep the population diverse (a roulette-selection sketch is given below).
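A minimal sketch of the roulette-wheel selection step, assuming non-negative fitness values (the population representation is illustrative):

```python
import numpy as np

def roulette_select(population, fitness, n):
    """Pick n individuals with probability proportional to their fitness."""
    fitness = np.asarray(fitness, float)
    p = fitness / fitness.sum()               # selection probabilities
    idx = np.random.choice(len(population), size=n, replace=True, p=p)
    return [population[i] for i in idx]
```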
This paper focuses on the coding of control variables and the calculation of fitness
function in genetic algorithm.

3.1 Control Variable Coding

Binary coding is used. Let the interval of the integer variable be [a, b] with length b − a, and divide the interval into b − a equal parts, where

\[ 2^{k-1} < b - a \le 2^k . \]

The binary string encoding this variable then needs at least k bits. A k-bit binary string $(b_{k-1} b_{k-2} \cdots b_0)$ is transformed into the corresponding integer in the interval [a, b] in two steps:

1) Convert the binary string into a decimal number:

\[ (b_{k-1} b_{k-2} \cdots b_0)_2 = \Big( \sum_{j=0}^{k-1} b_j \cdot 2^j \Big)_{10} = i' \]

2) Map i' to the integer i in the interval [a, b]:

\[ i = a + i' \cdot \frac{b - a}{2^k - 1} \]
Take the transformer ratio b in the discrete control variable optimization model (7) as an example. If the upper and lower limits of b are 1.1 and 0.9 and the adjustable step size is 0.02, then (1.1 − 0.9)/0.02 = 10 and $2^3 < 10 \le 2^4$, so one transformer ratio needs 4 binary bits. With n controllable transformer ratios, the code length is 4 × n:

\[ d_{n,3} d_{n,2} d_{n,1} d_{n,0} \,|\, d_{n-1,3} d_{n-1,2} d_{n-1,1} d_{n-1,0} \,|\, \cdots \,|\, d_{1,3} d_{1,2} d_{1,1} d_{1,0} \]

in which $d_{n,3} d_{n,2} d_{n,1} d_{n,0}$ is the binary code of one transformer ratio and each d is 0 or 1. The binary code is restored to a transformer ratio by

\[ (d_3 d_2 d_1 d_0)_2 = \Big( \sum_{j=0}^{3} d_j \cdot 2^j \Big)_{10} = i', \qquad i = 0 + \mathrm{int}\Big( i' \cdot \frac{10 - 0}{2^4 - 1} \Big), \]

giving the transformer ratio

\[ b = 0.9 + i \times 0.02 . \]

The coding of the capacitor susceptance c and the generator terminal voltage $V_g$ follows the same idea.
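A compact sketch of this decoding scheme, generic in the variable limits and step size; the int truncation mirrors the two-step mapping described above:

```python
def decode(bits, lower, upper, step):
    """Restore a control variable from its binary code (Sect. 3.1 mapping)."""
    k = len(bits)
    n_steps = round((upper - lower) / step)    # e.g. (1.1 - 0.9)/0.02 = 10
    i_prime = int(bits, 2)                     # step 1: binary -> decimal i'
    i = int(i_prime * n_steps / (2 ** k - 1))  # step 2: map i' into [0, n_steps]
    return lower + i * step

print(decode("1111", 0.9, 1.1, 0.02))          # -> 1.1  (i' = 15, i = 10)
```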

3.2 Fitness Function

Optimization models (7) and (8) are both multi-objective, and the key is how to determine the weight coefficient of each objective. In this paper, the adaptive weight genetic algorithm proposed in [9] is used: exploiting the characteristics of genetic evolution, the weight coefficients change continuously from generation to generation, and a strategy for determining the weights adaptively is given.

Take model (7) as an example for calculating the fitness function. The maximization of the three expectation objectives $F_{E,1}$, $F_{E,2}$, $F_{E,3}$ remains unchanged, while negative signs are attached to the three variance objectives $F_{D,1}$, $F_{D,2}$, $F_{D,3}$, i.e., $-F_{D,1}$, $-F_{D,2}$, $-F_{D,3}$, so that the minimization in model (7) is also transformed into maximization.
In genetic evolution, each generation produces a certain number of individuals, forming a population P. For the current population, the maximum and minimum of objective $F_{E,1}$ over all individuals are defined as

\[ z_{FE1}^{\max} = \max\{ F_{E,1}(V_{g,j}, c_j, b_j) \mid V_{g,j}, c_j, b_j \in P \}, \qquad z_{FE1}^{\min} = \min\{ F_{E,1}(V_{g,j}, c_j, b_j) \mid V_{g,j}, c_j, b_j \in P \}, \]

where $V_{g,j}$, $c_j$, $b_j$ are the terminal voltage settings, capacitor switching amounts, and transformer ratios corresponding to the coding scheme of any individual j in the population. Similarly, $z_{FE2}^{\max}$, $z_{FE2}^{\min}$ and $z_{FE3}^{\max}$, $z_{FE3}^{\min}$ are obtained.
Likewise, $z_{FD1}^{\max}$ and $z_{FD1}^{\min}$ are obtained as

\[ z_{FD1}^{\max} = \max\{ -F_{D,1}(V_{g,j}, c_j, b_j) \mid V_{g,j}, c_j, b_j \in P \}, \qquad z_{FD1}^{\min} = \min\{ -F_{D,1}(V_{g,j}, c_j, b_j) \mid V_{g,j}, c_j, b_j \in P \}, \]

and similarly $z_{FD2}^{\max}$, $z_{FD2}^{\min}$ and $z_{FD3}^{\max}$, $z_{FD3}^{\min}$.
In the current generation, the adaptive weight of each expectation objective is

\[ w_{Ek} = \frac{1}{z_{FEk}^{\max} - z_{FEk}^{\min}}, \qquad k = 1, 2, 3. \]

For a given individual j of population P with $V_{g,j}$, $c_j$, $b_j$:

\[ z_{FE1} = F_{E,1}(V_{g,j}, c_j, b_j), \quad z_{FE2} = F_{E,2}(V_{g,j}, c_j, b_j), \quad z_{FE3} = F_{E,3}(V_{g,j}, c_j, b_j). \]

In the current generation, the adaptive weight of each variance objective is

\[ w_{Dk} = \frac{1}{z_{FDk}^{\max} - z_{FDk}^{\min}}, \qquad k = 1, 2, 3, \]

and for the individual j:

\[ z_{FD1} = -F_{D,1}(V_{g,j}, c_j, b_j), \quad z_{FD2} = -F_{D,2}(V_{g,j}, c_j, b_j), \quad z_{FD3} = -F_{D,3}(V_{g,j}, c_j, b_j). \]

Then, for model (7), the adaptive-weight objective function of the multi-objective optimization for each individual j of each generation in the genetic evolution is

\[ z(V_{g,j}, c_j, b_j) = \sum_{k=1}^{3} w_{Ek}\big(z_{FEk} - z_{FEk}^{\min}\big) + \sum_{k=1}^{3} w_{Dk}\big(z_{FDk} - z_{FDk}^{\min}\big) \qquad (9) \]

In the evolution process, useful information from the current population continuously refines the target preference, and the weights adjust adaptively to generate search pressure toward the positive ideal point, unlike traditional multi-objective optimization with fixed weights [9].
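A compact sketch of the adaptive-weight score (9) over a population of objective vectors; all objectives are assumed already oriented for maximization, as described above:

```python
import numpy as np

def adaptive_weight_score(Z):
    """Score each individual per formula (9); Z has shape (individuals, objectives)."""
    z_min, z_max = Z.min(axis=0), Z.max(axis=0)
    w = 1.0 / np.maximum(z_max - z_min, 1e-12)   # adaptive weight per objective
    return ((Z - z_min) * w).sum(axis=1)         # z(V_g,j, c_j, b_j)

# Example: 4 individuals, 6 objectives (F_E,1..3 and -F_D,1..3)
print(adaptive_weight_score(np.random.rand(4, 6)))
```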
Inequality constraints are handled by an adaptive penalty function. For example, the inequality constraints in model (7) can be written uniformly as $g_i(V_g, c, b) \le b_i$, $i = 1, 2, \cdots, M$. For an individual $V_{g,j}$, $c_j$, $b_j$, the adaptive penalty function is

\[ p(V_{g,j}, c_j, b_j) = 1 - \frac{1}{M} \sum_{i=1}^{M} \frac{\Delta b_i(V_{g,j}, c_j, b_j)}{\Delta b_i^{\max}} \qquad (10) \]
in which

\[ \Delta b_i(V_{g,j}, c_j, b_j) = \max\big\{ 0,\; g_i(V_{g,j}, c_j, b_j) - b_i \big\}, \qquad \Delta b_i^{\max} = \max\big\{ \varepsilon,\; \Delta b_i(V_{g,j}, c_j, b_j) \mid V_{g,j}, c_j, b_j \in P \big\}. \]

Here $\Delta b_i(V_{g,j}, c_j, b_j)$ is the violation value of the i-th constraint for individual $V_{g,j}$, $c_j$, $b_j$; $\Delta b_i^{\max}$ is the maximum violation of constraint i over all individuals in the current population; and $\varepsilon$ is a small positive number used to avoid division by zero in the penalty function.
Combining (9) and (10), the fitness function with penalty for individual $V_{g,j}$, $c_j$, $b_j$ is

\[ Fit(V_{g,j}, c_j, b_j) = z(V_{g,j}, c_j, b_j)\, p(V_{g,j}, c_j, b_j) \qquad (11) \]
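The penalty (10) and fitness (11) can be sketched as follows; the violation matrix and scores are hypothetical inputs computed elsewhere:

```python
import numpy as np

def penalized_fitness(score, violations, eps=1e-9):
    """Apply formulas (10)-(11): score is z per individual, violations is an
    (individuals, M) array of constraint violation values."""
    v_max = np.maximum(violations.max(axis=0), eps)  # max violation per constraint
    p = 1.0 - (violations / v_max).mean(axis=1)      # penalty factor, formula (10)
    return score * p                                 # Fit, formula (11)

scores = np.array([2.1, 1.8, 2.4])
viol = np.array([[0.0, 0.1], [0.0, 0.0], [0.3, 0.2]])   # 3 individuals, M = 2
print(penalized_fitness(scores, viol))
```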

4 Simulation Examples
The IEEE 30-node system provides the node load data at one time point, the active output and node voltage of each PV generator node, and the node voltage of the balance node. Based on these single-time data, typical daily data are constructed for the simulation analysis. Without loss of generality, five typical daily load curves are assigned to the load nodes, giving each node a 24-h daily load curve. According to the load curves, the active output of each generator is evenly distributed, producing 24-h generator active-output curves that balance the load variation. The IEEE 30-node system has 6 generators available for dispatch, transformers with 4 adjustable taps, and two switchable capacitors at nodes 10 and 24; the related parameters are given in reference [10].
The time period consisting of 8, 9, 10, and 11 a.m. is taken as a typical "peak-climbing" working condition for the simulation verification. In this period, the discrete control equipment, i.e., capacitor switching and transformer taps, is adjusted only once, while the continuous control equipment, i.e., the generator terminal voltage settings, is adjusted at every moment. On this basis, the energy consumption, the generator reactive power margin, and the balance of the margin distribution across generators are optimized.
Table 1 gives the binary-coding parameters of the control variables for the genetic-algorithm solution of optimization models (7) and (8):

Table 1. Coding of control variables

Variable    | Adjustment step length | Lower limit | Upper limit | Single coding | Total length
Generator   | 0.01 | 0.90 | 1.10 | 5 | 5 × 6 = 30
Transformer | 0.02 | 0.90 | 1.10 | 4 | 4 × 4 = 16
Capacitor 1 | 0.01 | 0.00 | 0.20 | 5 | 5
Capacitor 2 | 0.01 | 0.00 | 0.10 | 4 | 4
Continuous variable coding length: 30
Discrete variable coding length: 25

Table 2. Parameter setting and iteration number of the genetic solution algorithm

Population size: 30 | Maximum number of generations: 50
Crossover rate: 0.2–0.9 | Mutation rate: 0.01–0.1
Iterations to convergence: 3

The discrete and continuous variable optimization models are solved by alternating iteration. Table 2 gives the other relevant parameters of the genetic algorithm and the number of iterations to convergence. After three rounds of coordinated optimization of the discrete and continuous variables, the iteration converges to the optimal solution; Table 3 gives the optimal values of the control variables:

Table 3. Optimal solution of control variables

Generator number | Voltage at 8:00 | 9:00 | 10:00 | 11:00 | Transformer number | Transformer ratio | Capacitor number | Susceptance
1  | 1.08 | 1.06 | 1.09 | 1.06 | 9–6   | 1.00 | 10 | 0.20
2  | 1.06 | 1.05 | 1.06 | 1.04 | 6–10  | 0.96 | 24 | 0.08
5  | 1.02 | 0.99 | 0.99 | 0.98 | 12–4  | 1.04 | –  | –
8  | 1.04 | 1.02 | 1.02 | 1.00 | 28–27 | 0.94 | –  | –
11 | 1.08 | 1.08 | 1.09 | 1.08 | –     | –    | –  | –
13 | 1.02 | 1.02 | 1.04 | 1.08 | –     | –    | –  | –
After optimization, the system energy consumption, the reactive power margin, and the equilibrium of the margin distribution are all improved over the time period. Figures 1(a)–(d) compare the reactive power margin of each generator before and after optimization at times 8, 9, 10, and 11 of the period:

Fig. 1. Comparison of generator reactive power margin (QGm per node) before and after optimization at each time in the period: (a) time 8; (b) time 9; (c) time 10; (d) time 11

The comparison shows that, with the expectation and variance objectives for the balanced distribution of reactive power margin included in the optimization, the generator reactive power margins are more evenly balanced at each moment after optimization.

Figure 2 compares the sum of the reactive power margins of all generators before and after optimization at each time. After optimization the sum increases at every time except time 11, where it decreases slightly; such a result can arise because the optimization balances the margin increase across all times in the period.
Fig. 2. Comparison of the sum of generator reactive power margins (ΣQGm) before and after optimization at each time (8–11) in the period

Figure 3 compares the system line loss before and after optimization at each moment of the period. After optimization, the line loss is reduced at every time, and because the expectation and variance terms appear in the objective function, the loss-reduction effect is balanced across the moments.

Fig. 3. Comparison of system line loss (Ploss) before and after optimization at each time (8–11) in the period

5 Conclusion
In this paper, the time-period reactive power optimization model is decomposed into an optimization model for the discrete control variables and an optimization model for the continuous control variables; the two models are solved separately and cross-iterated until convergence. The modeling of discrete control variable optimization over a period is studied, and objective functions based on the expectation and variance over the times of the period are put forward. The established model ensures that the optimization objectives are improved both overall and in a balanced way at each time after the discrete control equipment acts once. An adaptive weight genetic algorithm handling discrete control variables and multiple optimization objectives is proposed to solve the model; the weights are assigned adaptively during genetic evolution, making the solution more reasonable. Simulation examples verify the rationality of the proposed model and algorithm.

Research on Task Allocation Method of Mobile
Swarm Intelligence Perception Based on Hybrid
Artificial Fish Swarm Algorithm

Jianjun Li1,2,3 , Fangyuan Su1,2,3 , Yu Yang1,2,3(B) , and Junjun Liu1,2,3


1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Cultural Big Data Theory Application Research Center, Harbin 150028, China
3 Heilongjiang Key Laboratory of E-Commerce and Information Processing,

Harbin 150028, China

Abstract. In this study, the task allocation mechanism in mobile swarm intelligence perception is studied and analyzed. Considering how to allocate tasks under specified time constraints while optimizing for the stated goals, a hybrid artificial fish swarm algorithm based on the swarm intelligence paradigm is proposed. The inertia index of the particle swarm optimization algorithm is introduced into the typical artificial fish swarm algorithm. Simulation experiments show that the hybrid artificial fish swarm algorithm greatly improves the convergence speed, avoids the tendency of the artificial fish swarm algorithm to stall at local optima, and thereby achieves global optimization.

Keywords: Hybrid artificial fish swarm algorithm · Task assignment · Mobile perception · Swarm intelligence

1 Introduction

Science and technology are developing ever faster, the number of mobile smart terminals is constantly increasing, the GPS, sensor, and other capabilities of these devices have improved significantly [1], and the sensing abilities of mobile terminals have improved with them. Mobile swarm intelligence perception takes the mobile smart terminals carried by large numbers of users as the lowest-level units for acquiring sensing information [2]. It uses the terminals' perception functions to compute, store, and communicate the acquired information, realizes real-time perception of the surrounding environment, uploads the information to the network, and then forwards the relevant information to accomplish both the distribution of perception tasks and the collection of users' perception data. In the range of perception, the types of information covered [3], and the cost and speed of deployment, this mobile swarm intelligence perception method is far superior to traditional perception methods, and with it large-scale and complicated perception tasks can be completed easily [4].


Maximizing the validity and reliability of perception data is the key to mobile perception. This paper proposes a task assignment method for mobile swarm perception based on a hybrid artificial fish swarm algorithm (AFSA). The inertia index from the particle swarm algorithm is used as a knob to regulate the convergence speed and is combined with the AFSA. The combination improves the algorithm's ability to search globally, prevents it from stopping once it finds a local maximum, and its higher efficiency is verified through simulation experiments. The main research content and results of this paper are as follows:

(1) A task allocation optimization model is established.
(2) A task allocation method based on hybrid artificial fish swarm perception is put forward, which enhances the algorithm's ability to search globally.
(3) In simulation experiments, the hybrid AFSA and the standard AFSA are compared on the three optimization objectives. The results show that the proposed hybrid AFSA greatly improves the convergence speed of the AFSA and avoids falling into local optima.

2 Related Works
The research of relevant scholars on task allocation is as follows. Chen [5] proposed a general task allocation framework, surveyed the field from three aspects (worker model, task model, and task allocation algorithm), and outlined development trends in swarm intelligence collaborative task allocation research. Wang et al. [6] proposed a task allocation model combining a task-requirement feature extraction algorithm with a user-label classification method: the category keywords of perception tasks are extracted by the feature extraction algorithm, a classifier predicts each user's type label, users are screened according to the predictions, and tasks are distributed; simulation experiments show the model is feasible. Yang et al. [7] used fuzzy logic control to obtain the density of participants and then applied a global greedy algorithm to allocate tasks so as to maximize the total utility of all tasks; simulations showed that, for different numbers of tasks, numbers of participants, and maximum participant workloads, the global greedy algorithm outperforms the benchmark algorithms in maximizing total task utility. Wang et al. [8] studied sensor task allocation, proposed a heuristic greedy method as the baseline, and combined a greedy algorithm with a bee algorithm; simulation experiments prove the hybrid method better than the greedy algorithm alone.
Qiao et al. [9] proposed a particle swarm optimization method for the multi-objective task allocation problem in a distributed environment and used fuzzy cost selection to improve computational efficiency; the algorithm further realizes a Pareto improvement. Ma et al. [10] proposed a hybrid bionic self-organizing map neural network algorithm; grounded in the kinematics of individual AUVs, simulation experiments show the algorithm has wide application potential. Lu et al. [11] proposed PSO-GA-DWPA (a discrete wolf pack algorithm based on particle swarm optimization and a genetic algorithm) to solve the task allocation problem of UAV swarms; the improved algorithm has superb search quality in high-dimensional spaces. Deng et al. [12] introduced a multi-round allocation algorithm based on EMA prediction, which distributes the workload to the edge nodes of each time slot according to predicted node capacity and cost [13] to obtain the optimal allocation strategy, and also proposed an online task allocation algorithm based on Q-learning; the superiority and effectiveness of the algorithm are verified by simulation.
To sum up, most scholars optimize task allocation only for local objectives and lack multi-objective optimization. In this study, the task allocation problem in mobile swarm intelligence perception is studied and a hybrid AFSA is put forward: the inertia index of particle swarm optimization is introduced into the AFSA to increase the convergence speed and to avoid the AFSA's tendency to fall into local optima.

3 Problem Description and Modeling


3.1 Description of Perceptual Task Allocation Problem

The mobile swarm intelligence perception system mainly comprises task completers, task publishers [14], and the platform. A task publisher publishes task information in the system, including the task type, geographic location, completion time, and so on; a task is therefore created as M = {type, location, time, . . .}. Users can participate in tasks actively or passively. In active participation, users decide whether to perform a task according to their own abilities, interests, rewards, and other factors; in passive participation, users do not understand the task, and the quality of the final results may be poor. The set of users that finally participate in a task is determined accordingly. The task allocation optimization model of the perception system is shown in Fig. 1:

[Figure 1: task publishers release tasks to the platform and mobile terminal equipment reports user capabilities; the platform returns task assignment results to both sides and collects perceptual data.]

Fig. 1. Task allocation optimization model



3.2 Establishment of Perceptual Task Allocation Model

The perception task allocation model built in this paper is described as follows:

(1) In the task assignment in this paper, after a perception task is released, there is never a situation in which no user can be found for it.
(2) This paper aims to accomplish as many perception tasks as possible with as few users as possible. The question is how to allocate tasks reasonably so that the most tasks are completed in a short time and the users' movement distance is shortest, so that the perception cost is greatly reduced and the benefit is maximized.
(3) The optimization goal is mainly to reduce the perception cost while ensuring that the perception tasks are completed to the greatest extent. The corresponding constraint is that the time for a user to reach the task locations plus the total time to complete the tasks must be less than or equal to the specified completion time.

Suppose that within the specified t minutes the system publishes m perception tasks, with task set T = {t1, t2, t3, . . . , tm}, and that n users are determined to participate in the perception tasks, so that the task set assigned to user k is TWk = {tk1, tk2, tk3, . . . , tkn} and the distance user k must move during the tasks is D(TWk). In the model constructed in this paper, it is assumed that each user spends λ minutes at each perception task point, moves at a speed of v m/min, and that each perception task is completed by at most one user. The target values to be optimized are given by the following formula (1):

\begin{cases} \min \sum_{k=1}^{n} D(TW_k) \\ \max |TW_k| \end{cases} \qquad (1)

The constraints are:


|TW_k| \cdot \lambda + \dfrac{D(TW_k)}{v} \le t, \quad 1 \le k \le n \qquad (2)

1 \le |UT_m| \le \eta \qquad (3)

Here, the perception cost of a mobile terminal is directly proportional to the user's expenses, phone battery consumption, network fees, and so on. The goal of the optimization is therefore to reduce the task completion time and the perception cost so as to improve the platform's revenue, and then to optimize the quantity and quality of the completed tasks. When the same task is completed by multiple users, the results must be compared to find the best one; the number of users completing a task is therefore also limited to ensure load balance in task allocation.
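For concreteness, the following is a minimal Python sketch (not part of the original paper) of how the objectives in formula (1) and the time constraint (2) could be evaluated for one candidate assignment; the Euclidean distance metric, the dictionary data layout, and all helper names are illustrative assumptions.

```python
import math

def tour_distance(points):
    # D(TW_k): total moving distance along a user's ordered task locations.
    # `points` is an assumed list of (x, y) coordinates; the paper does not
    # fix a metric, so Euclidean distance is used here.
    return sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))

def feasible(assignment, lam, v, t):
    # Constraint (2): |TW_k| * lambda + D(TW_k) / v <= t for every user k.
    return all(len(tasks) * lam + tour_distance(tasks) / v <= t
               for tasks in assignment.values())

def objectives(assignment):
    # The two targets of formula (1): total moving distance (to minimize)
    # and the number of completed tasks (to maximize).
    total_dist = sum(tour_distance(tasks) for tasks in assignment.values())
    total_done = sum(len(tasks) for tasks in assignment.values())
    return total_dist, total_done
```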

4 Task Allocation Method Based on Hybrid Artificial Fish Swarm Algorithm

4.1 The Typical Artificial Fish Swarm Algorithm

The AFSA mainly simulates the foraging, swarming, and tail-chasing behaviors of fish by constructing artificial fish [15], and finds the global optimum through the step-by-step optimization of the fish swarm.

(1) Foraging behavior

When fish forage, they swim toward where there is more food. The artificial fish X_i randomly chooses a state point within its visible field of vision (visual). If the fitness value Y_j of that point is clearly better than its own value Y_i, the artificial fish X_i moves toward the point X_j; if not, X_i selects a new state point X_j within the visible range and judges again. If no better point is found after several repetitions, it advances one step to a new random state point X_j. The expression is shown in formula (4):

X_j = X_i + \mathrm{rand} \cdot \mathrm{visual}, \quad \mathrm{rand} \in (0, 1) \qquad (4)

(2) Clustering behavior

Fish usually move forward in groups [16]. The artificial fish X_i searches for the center position X_c and the number n_f of partner fish within its visible field of vision. If the center fitness Y_c divided by the partner count n_f indicates that the center is better than X_i's current position and not overcrowded, the artificial fish moves one step toward the center of its partners; if not, the artificial fish X_i performs the foraging behavior.

(3) Rear-end behavior

When a fish swarm moves, each fish chooses according to its surroundings; moving in the best observed direction is the tail-chasing behavior. The artificial fish X_i searches for the fish X_j with the highest fitness Y_j within its visible field of vision. When that position has a higher value, the artificial fish X_i moves one step toward it; if not, X_i performs the foraging behavior. The update is given by formula (5):

X_i^{t+1} = X_i^{t} + \dfrac{X_j - X_i^{t}}{\left\| X_j - X_i^{t} \right\|} \cdot \mathrm{Step} \cdot \mathrm{rand}, \quad \mathrm{rand} \in (0, 1) \qquad (5)
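For illustration, the following is a minimal sketch, not from the paper, of the foraging step of formula (4) and the one-step move of formula (5); the NumPy layout, the fitness callback, the maximization convention, and the retry count are assumptions.

```python
import numpy as np

rng = np.random.default_rng()

def move_toward(x_i, x_j, step):
    # One step toward X_j along the unit direction, as in formula (5):
    # X_i(t+1) = X_i(t) + (X_j - X_i(t)) / ||X_j - X_i(t)|| * Step * rand.
    d = x_j - x_i
    return x_i + d / (np.linalg.norm(d) + 1e-12) * step * rng.random()

def forage(x_i, fitness, visual, step, tries=5):
    # Foraging behavior: sample candidate points X_j = X_i + rand * visual
    # (formula (4)); move one step toward the first candidate with better
    # fitness.  If no candidate improves after `tries` attempts (an assumed
    # count), advance one random step instead.
    for _ in range(tries):
        x_j = x_i + rng.random(x_i.shape) * visual
        if fitness(x_j) > fitness(x_i):   # maximization assumed
            return move_toward(x_i, x_j, step)
    return x_i + rng.random(x_i.shape) * visual
```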

When simulating a fish swarm, the AFSA searches blindly, takes a long time, and is inefficient overall. Because of its limited field of vision, it often falls into local optima. The algorithm therefore needs to be improved to increase its efficiency.

4.2 Hybrid Artificial Fish Swarm Algorithm


The particle swarm algorithm has an inertia index [17] that affects the swarm's convergence speed and can improve the overall search ability: the larger the inertia index, the faster the convergence and the stronger the search. Over a run, it first increases and then decreases. This behavior is well suited to compensating for the shortcomings of the AFSA: it lets the artificial fish swarm's field of view change dynamically, improving the overall optimization ability early in the search, while later in the run the reduction of the inertia index improves the convergence accuracy and avoids local optima. This paper therefore combines the inertia index ω of the particle swarm algorithm with the AFSA to shorten the time the artificial fish swarm needs to determine the direction of its next search move and to raise the efficiency of the algorithm. The expression is shown in Eq. (6):

\mathrm{visual}_{next} = \mathrm{visual}_i \cdot \omega \qquad (6)

ω is the inertia index of the particle swarm, visual_next is the next search range of the artificial fish swarm, and visual_i is the visible field of view at the current artificial fish's location. The improved expression for the next search is shown in formula (7):

X_{next} = X_i + \mathrm{visual}_{next} \cdot \mathrm{rand}, \quad \mathrm{rand} \in (0, 1) \qquad (7)

When the artificial fish swarm searches for the next best position and the inertia index is increasing, the visible field of the artificial fish grows, its moving step grows with it, and the breadth of the search increases greatly, effectively avoiding local optima. When the algorithm reaches its later stage, the inertia index gradually decreases to enhance the search accuracy and lock onto the optimum precisely; the visible field of the artificial fish swarm shrinks, and the convergence speed of the algorithm accelerates greatly, which makes up for the slow convergence of the AFSA.
The execution flow chart of the improved hybrid AFSA is shown in Fig. 2:

[Figure 2 flow chart: Start → initialize the size of the artificial fish swarm → calculate fitness values and compare to obtain the best position → perform the foraging, tail-chasing, and swarming behaviors → update the optimal position → calculate fitness values → if the termination conditions are not met, loop back; otherwise end.]

Fig. 2. Flow chart of the hybrid artificial fish swarm algorithm

5 Simulation Experiment

5.1 The Experimental Environment and Parameter Setting

The experimental hardware and software environment is: an AMD A4-7210 APU with AMD Radeon R3 Graphics @ 1.80 GHz, 8.00 GB of memory, Windows 10, and the MATLAB R2016b simulation platform. The simulated mobile swarm intelligence perception tasks are randomly distributed over a 500 * 500 km area, the maximum number of iterations is L = 200, c1 = c2 = 2, and the scale of the artificial fish swarm is 100. Assuming that every user has the same ability to complete perception tasks, the other parameter settings of the simulation experiment are shown in Table 1.

Table 1. Parameter setting

Parameter | Value
Inertia index | 0.7
Artificial fish swarm scale | 100
Visual | 1
Step | 0.1
δ | 0.618

5.2 Analysis of Experimental Results

The experiment compares the advantages and disadvantages of hybrid AFSA and typical
AFSA. The results and analysis of the experiment are as follows:

(1) Fitness changes of the AFSA and the hybrid AFSA

The maximum number of iterations of the algorithms is 200. The fitness curves of the AFSA and the hybrid AFSA are shown in Fig. 3.

Fig. 3. Algorithm fitness curve

The experimental results show that, thanks to the incorporated inertia index, the hybrid AFSA converges faster than the typical AFSA, searches over the global range, avoids local optima, and has better stability and convergence.

(2) Number of tasks completed and task completion time

A reasonable task allocation mechanism can greatly increase the number of tasks completed simultaneously and reduce the actual time spent by users. Figure 4(a) shows the number of tasks completed by the algorithm before and after the improvement. As the number of iterations increases, the standard AFSA often falls into a local optimum and stops searching, while the hybrid AFSA searches globally and achieves global optimization. Figure 4(b) compares the average task completion time of the standard AFSA and the hybrid AFSA. The average time consumed by the improved hybrid algorithm is much lower, showing that task distribution under the improved algorithm is better load-balanced. Its convergence is not good at the very start, but as the iterations continue the convergence improves. The hybrid artificial fish swarm algorithm therefore has clear advantages.

[Figure 4: panels (a) and (b); legend: AFSA, mixed AFSA.]

Fig. 4. Total number of tasks completed and comparison of average time for the two algorithms

(3) Perceived cost

The main optimization goal of this paper is to reduce the perception cost, since a lower perception cost increases the platform's revenue. In the experiment, the standard AFSA and the hybrid AFSA each complete the same numbers of tasks (100, 200, 300, 400, and 500), and the perception costs of the two algorithms are compared. As the experimental results in Fig. 5 show, as the number of tasks grows, the perception cost of the hybrid AFSA is remarkably lower than that of the typical AFSA.

[Figure 5: legend: AFSA, mixed AFSA.]

Fig. 5. Comparison of perceived cost between the two algorithms

6 Conclusion
Firstly, to reduce the task completion time and the perception cost and to raise the platform's revenue and the number of completed tasks, this paper limits the number of users completing the same task and constructs a task allocation model for mobile perception. Secondly, based on the task allocation model and the swarm intelligence paradigm, a hybrid AFSA is put forward: a randomly generated task allocation scheme is regarded as an "artificial fish", and, combined with the inertia index of the particle swarm optimization algorithm, the task allocation scheme that best matches the optimization goals is selected. Finally, simulation experiments compare it with the original AFSA and show that the hybrid AFSA effectively improves the convergence speed of the AFSA, avoids the AFSA's tendency to fall into local optima, and enables the algorithm to search globally for the optimal task allocation scheme.

Acknowledgment. This work is partly supported by the National Social Science Foundation (16BJY125), the Heilongjiang philosophy and social sciences research planning project (19JYB026), a key topic in 2020 of the 13th five-year plan of Educational Science in Heilongjiang Province (GJB1320276), the undergraduate teaching leading talent training program of Harbin University of Commerce (201907), a key project of teaching reform and teaching research of Harbin University of Commerce in 2020 (HSDJY202005(Z)), the innovation and entrepreneurship project for college students of Harbin University of Commerce (202010240059), a school-level scientific research project of Heilongjiang Oriental University (HDFKY200202), and key entrusted projects of higher education teaching reform in 2020 (SJGZ20200138).

References
1. Wang, L., Zhang, D., Wang, C., Chen, C., Han, X., Abdallah, M.: Sparse mobile crowdsensing:
challenges and opportunities. IEEE Commun. Mag. 54(7), 161–167 (2016)
2. Xiong, Y., Liu, W., Liu, Z.: The key technology of opportunity group intelligence perception
network. ZTE Technol. 21(06), 19–22 (2015)
3. Wang, L., Yu, Z., Guo, B., Xiong, F.: Community task distribution of group intelligence
perception based on mobile social network. J. Zhejiang Univ. (Eng. Sci. Ed.) 52(09), 1709–
1716 (2018)
4. Li, S.: Research on identifying traffic congestion based on group intelligence perception.
Wirel. Internet Technol. 2016(04), 128–139 (2016)
5. Chen, B., Wang, L., Jiang, X., Yao, H.: Summary of research on task allocation of group
intelligence. Comput. Eng. Appl., 1–15 (2021)
6. Wang, X., Liao, Y., Zhao, G., Wang, J., Xie, B.: A task assignment model of group intelligence
perception based on task requirements. Comput. Eng. Sci., 1–10 (2021)
7. Yang, G., Zhang, Y., He, X.: Fuzzy logic control-oriented mobile group intelligence perception
multi-task assignment. Small Microcomput. Syst. 41(10), 2068–2074 (2020)
8. Wang, Z., Huang, D., Wu, H., Deng, Y., Aikebaier, A., Teranishi, Y.: QoS-constrained sensing task assignment for mobile crowd sensing. In: IEEE Global Communications Conference, pp. 311–316 (2014)
9. Qiao, N., You, L., Sheng, Y., Wang, J., Deng, Y.: An efficient algorithm of discrete particle swarm optimization for multi-objective task assignment. IEICE Trans. Inf. Syst. E99-D(12), 2968–2977 (2016)
10. Ma, X., Chen, Y., Bai, G., et al.: Path planning and task assignment of the multi-AUVs
system based on the hybrid bio-inspired SOM algorithm with neural wave structure. J Braz.
Soc. Mech. Sci. Eng. 43, 28 (2021)
11. Lu, Y., Ma, Y., Wang, J., et al.: Task assignment of UAV swarm based on wolf pack algorithm.
Appl. Sci. 10(23), 8335 (2020)
12. Deng, X., Li, J., Liu, E., et al.: Task allocation algorithm and optimization model on edge
collaboration. J. Syst. Arch. 110, 101778 (2020)
13. Song, Z., Li, Z., Chen, X.: Mobile group intelligence perception task distribution mechanism
based on compressed sensing. Comput. Appl. 39(01), 15–21 (2019)
14. Wu, G., Chen, Z., Liu, J., et al.: Task assignment for social-oriented crowdsourcing. Front.
Comp. Sci. 15(2), 1–11 (2021)
15. Huang, J., Zeng, J., Bai, Y., et al.: Layout optimization of fiber bragg grating strain sensor
network based on modified artificial fish swarm algorithm. Optical Fiber Technol. 65(1),
102583 (2021)
16. Zhang, L., Fu, M., Li, H., et al.: Improved artificial bee colony algorithm based on damping
motion and artificial fish swarm algorithm. J. Phys: Conf. Ser. 1903(1), 9 (2021)
17. Wu, G., Wan, L.: Research on particle swarm optimization algorithm for robot path planning.
Mech. Sci. Technol., 1–7 (2021)
Research on Task Allocation Model of Takeaway
Platform Based on RWS-ACO Optimization
Algorithm

Li Jianjun1,2,3 , Xu Xiaodi1,2,3 , Yang Yu1,2(B) , and Yang Fang4


1 School of Computer and Information Engineering,
Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Cultural Big Data Theory Application Research Center, Harbin 150028, China
3 Heilongjiang Key Laboratory of E-Commerce and Information Processing,
Harbin 150028, China
4 East University of Heilongjiang, Harbin 150066, China

Abstract. This paper studies the task distribution of the takeaway platform, builds a task distribution model for the platform, and proposes the roulette-wheel ant colony algorithm (RWS-ACO), which combines the roulette-wheel selection algorithm with the ant colony algorithm. Simulation experiments are then conducted to solve the model on simulated data. The resulting distribution plan verifies the effectiveness of the model and algorithm and achieves a win-win situation for the multiple parties in the distribution.

Keywords: Takeaway platform · Task allocation · RWS-ACO optimization algorithm

1 Introduction
With the rise of Internet technology and the popularization of mobile smart devices, online platforms have become deeply integrated with offline businesses. Many industries have seized the opportunity and grown rapidly; the takeaway industry is a typical example. By making full use of "Internet+" and integrating deeply with traditional industries, takeaway platforms use the Internet to allocate social resources and have developed rapidly, with the numbers of orders and users increasing quickly. Task allocation has become a key point in the study of takeaway platforms.
As consumers' consumption habits have formed, the takeaway industry has matured. Courier capacity falls far short of consumer demand, which has triggered conflicts between consumers and couriers: it is difficult for consumers to get satisfactory service, and couriers often violate traffic rules and cause accidents. Price is no longer the primary factor when users make choices; competition among takeaway platforms is now about service quality and delivery speed. During peak order periods in particular, many orders flood into the system; at such times the platform's order-processing capacity is weak, and the
order distribution becomes unreasonable, leaving consumers unable to obtain satisfactory service. This paper studies and builds a task allocation model for the takeaway platform; its practical significance is to optimize the task allocation process and to realize the sustainable and healthy development of the takeaway platform. The main contributions of this article include:

– For the participant selection and task assignment processes, a task allocation model is constructed under given constraints with the shortest delivery distance as the optimization goal.
– A roulette-wheel ant colony optimization algorithm is proposed; the model is solved and simulation experiments are performed on simulated data, yielding the task allocation plan and verifying the model's effectiveness.

2 Related Work
In terms of task allocation, the research results of related scholars are as follows. Fu et al. [1] hold that trust in task allocation is directly proportional to completion quality and proposed a greedy heuristic algorithm that allocates tasks to workers with low cost and high trust. Zhang et al. [2] proposed making the priority of task assignment depend on a worker's ability to complete the task and the likelihood of acceptance. Li et al. [3] proposed a Te matrix formulation and used a greedy algorithm to generate a vehicle allocation plan as the initial solution, which, combined with simulated annealing, yields a new heuristic algorithm. Alvarez et al. [4] proposed a BPC algorithm for the time-window and task assignment problems of multiple couriers, with experiments verifying its advantages. Wang et al. [5] proposed transforming the combinatorial optimization problem into a task allocation function with load capacity and cooperation constraints, solved it algorithmically, and verified its effectiveness by experiment. Pan et al. [6] added an adaptive threshold algorithm on top of the GH algorithm, with experiments verifying lower cost and higher utility. Chen et al. [7] proposed a dynamic grouping allocation method to resolve conflicts arising in a dynamic environment, which behaved well. Fu et al. [8] proposed a two-stage allocation model that distributes tasks fairly while minimizing cost, with experiments verifying that F-Aware is superior to the alternatives. Liang et al. [9] proposed TWD and designed a two-stage competition-and-cooperation model (competition optimization, then negotiation and cooperation) that maximizes personal income and obtains the best solution. Miao et al. [10] proposed an adaptive ant colony algorithm for the local-optimum problem; experimental verification shows that global optimization can be achieved. Xia et al. [11] proposed an improved tabu search algorithm to solve a planning model with the number of vehicles and cost as dual goals. Liu et al. [12] used a greedy algorithm with an optimized threshold (greedy-OT) to solve the task assignment problem. Song et al. [13] adopted an online greedy algorithm for the task assignment problem of spatial crowdsourcing, realizing distribution at the lowest cost. Li et al. [14] adopted a two-sided online matching method to match workers with tasks.
To sum up, for task allocation on takeaway platforms, most works adopt combinatorial optimization algorithms or swarm intelligence algorithms. This paper therefore integrates the roulette-wheel algorithm into the ant colony algorithm and uses the optimized RWS-ACO algorithm to solve the model.

3 The Establishment of the Task Allocation Model of the Takeaway Platform

3.1 Description of the Task Assignment Problem on the Takeaway Platform

A supply and demand relationship exists on the takeaway platform among couriers, merchants, and consumers. The problem is to arrange the couriers' delivery routes and times reasonably so that, under the given constraints (delivery load, volume, time, etc.), couriers deliver the goods consumers ordered from the merchants on time, while the objective function (the length of the delivery path) is optimal. This paper divides merchants into different business circles by region and proposes a task allocation model that addresses the delivery distance, cost, and overtime requirements of each business circle, so as to achieve the best match between consumer orders and couriers. The task allocation model of the takeaway platform is shown in Fig. 1:

Fig. 1. Task allocation model of the takeaway platform

3.2 Model Assumptions and Related Parameters

In real life, task assignment on a takeaway platform is a very complicated problem. To ease the construction and solution of the task assignment model, some aspects of the actual delivery process are simplified. The following assumptions are made about the task assignment problem studied here:

(1) The takeaway platform is divided into different business circles according to the locations of the merchants, and the platform selects a suitable courier to complete each consumer's order delivery task according to the order delivery distance and estimated delivery time published by the business circle.
(2) The goods required by each consumer are delivered by exactly one courier.
(3) A courier must first go to the corresponding merchant in the business circle to pick up the meal, and then go to the designated delivery area to deliver it.
(4) Each business circle has fixed couriers whose starting point lies at the center of the circle's merchants, approximately equidistant from each merchant. After completing the delivery tasks, couriers return to the starting point, but the return distance is not included in the total delivery distance.

Let d_i(n) denote the distance from the i-th customer to the n-th business circle, where n = 1, 2, ..., N and N is the total number of business circles, and let D_i(n) = {d_i(n)}, n = 1, 2, ..., N be the collection of distances from customer i to each business circle. If d_i(m) = min D_i(n), then the i-th customer is delivered by the courier of the m-th business circle.
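This nearest-circle rule is easy to state in code. The following one-function sketch (coordinates and Euclidean distance are assumed, as the paper does not fix a metric) returns the index m of the business circle that will serve customer i.

```python
import math

def assign_circle(customer, circles):
    # Rule d_i(m) = min_n d_i(n): pick the business circle nearest to the
    # customer; `customer` and `circles` are assumed (x, y) coordinates.
    return min(range(len(circles)), key=lambda n: math.dist(customer, circles[n]))
```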
Based on the above analysis, a corresponding order distribution model for the multi-business-circle takeaway platform is constructed. The relevant parameters are shown in Table 1 below:

Table 1. Parameter of takeaway platform task allocation model

Parameter | Meaning
n_i | The i-th business circle
P | Number of consumers
c | Distance between a courier's starting point and each merchant
L_i | Number of couriers in the i-th business circle
W_ij | Load of the j-th courier in the i-th business circle
T_ij | Maximum delivery time of the j-th courier in the i-th business circle
q_k | Weight of the k-th consumer's goods
t_k | Delivery time of the k-th consumer
d_ij | Distance from the i-th consumer to the j-th consumer
n_ij | Number of consumers served by the j-th courier in the i-th business circle
R_ij | Delivery route of the j-th courier in the i-th business circle
r_ijk | The k-th consumer delivered by the j-th courier in the i-th business circle

3.3 Model Construction

This paper takes the shortest delivery path as the objective function and establishes the following mathematical model:

\min Z = \sum_{i=1}^{N} \sum_{j=1}^{L_i} \Big[ \sum_{k=1}^{n_{ij}} d_{r_{ij(k-1)}\, r_{ijk}} + d_{r_{ij0}\, r_{ij n_{ij}}}\, \delta(n_{ij}) + cP \Big] \qquad (1)

\text{s.t.} \quad \sum_{k=1}^{n_{ij}} q_{r_{ijk}} \le Q_{ij} \qquad (2)

\sum_{k=1}^{n_{ij}} t_{r_{ijk}} \le T_{ij} \qquad (3)

\sum_{i=1}^{N} \sum_{j=1}^{L_i} n_{ij} = P \qquad (4)

R_{ij} \in \{ r_{ijk} \mid r_{ijk} \in [1, 2, \cdots, P],\ k = 1, 2, \cdots, n_{ij} \} \qquad (5)

\delta(n_{ij}) = \begin{cases} 1, & n_{ij} \ge 1 \\ 0, & \text{otherwise} \end{cases} \qquad (6)
In the above model, (1) is the objective function: the shortest total delivery route. (2) means the total weight of the consumers' goods must not exceed the courier's delivery load; (3) means the courier's delivery time must not exceed the maximum delivery time; (4) means every consumer's demand is met; (5) describes the composition of consumers on each path; and (6) expresses whether the j-th courier in the i-th business circle serves any consumers: when n_ij ≥ 1 the courier participates in the delivery and δ(n_ij) = 1, while when n_ij < 1 the courier does not participate, so δ(n_ij) = 0.
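To illustrate one reading of the model, the minimal sketch below (not the paper's code) evaluates objective (1) for a set of candidate routes: each active courier's path runs from its starting point through its ordered customers (the return leg is excluded per assumption (4)), plus the fixed pickup term cP. The data layout and helper names are assumptions.

```python
import math

def route_length(start, stops):
    # Delivery distance of one courier: start -> first stop -> ... -> last
    # stop; per model assumption (4), the return leg is not counted.
    pts = [start] + stops
    return sum(math.dist(pts[k], pts[k + 1]) for k in range(len(pts) - 1))

def total_objective(routes, c, P):
    # Objective (1): summed route lengths of the couriers that actually
    # serve customers (delta(n_ij) = 1), plus the fixed pickup term c * P.
    z = sum(route_length(r["start"], r["stops"]) for r in routes if r["stops"])
    return z + c * P
```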

4 Roulette Ant Colony Task Allocation Method


4.1 RWS-ACO Algorithm
The roulette-wheel selection algorithm is added on top of the ant colony algorithm. At the start of each iteration, every ant sets out from a business circle and selects the next customer to visit using roulette-wheel selection: the selection probabilities p_ij^k of the candidate customers are laid out cumulatively over [0, 1], a random number between 0 and 1 is generated, and the next customer j to visit is the one whose interval contains the random number. This ensures that each customer is selected with probability proportional to the calculated value, which helps avoid falling into local optima. After the next customer is determined, the total weight is computed; if it is larger than the courier's carrying capacity, the customer is rejected, otherwise it becomes the next customer. Here α represents the relative importance of the pheromone and β the relative importance of the heuristic factor.
The transition probability is as follows:

p_{ij}^{k}(t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}(t)]^{\beta}}{\sum_{s \in J_k(i)} [\tau_{is}(t)]^{\alpha}\,[\eta_{is}(t)]^{\beta}}, & j \in J_k(i) \\ 0, & j \notin J_k(i) \end{cases} \qquad (7)

J_k(i) = \{1, 2, \cdots, n\} \setminus tabu_k, \qquad \eta_{ij} = 1/d_{ij} \qquad (8)
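A minimal sketch of the roulette-wheel transition of formulas (7)–(8) follows (an assumed stand-alone helper, not the authors' implementation): the weights of the allowed customers are accumulated over [0, total), and a uniform random number selects the interval it falls into.

```python
import random

def next_customer(i, allowed, tau, eta, alpha, beta):
    # Roulette-wheel (RWS) choice of the next customer j in J_k(i) using the
    # transition probabilities of formula (7); eta[i][j] = 1 / d_ij as in (8).
    weights = [(tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed]
    r = random.random() * sum(weights)   # spin the wheel on [0, total)
    acc = 0.0
    for j, w in zip(allowed, weights):
        acc += w
        if r <= acc:
            return j
    return allowed[-1]                   # numerical-safety fallback
```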


When all ants have completed one iteration, the optimal solution is updated: the new route length after every pairwise customer exchange is calculated, and if it is better than before the route is updated, otherwise it is left unchanged. Next, the pheromone on each path is updated:

\tau_{ij}(t + n) = (1 - \rho)\,\tau_{ij}(t) + \Delta\tau_{ij} \qquad (9)


\Delta\tau_{ij} = \sum_{k=1}^{m} \Delta\tau_{ij}^{k}, \qquad \Delta\tau_{ij}^{k} = \begin{cases} Q / L_k, & \text{if ant } k \text{ passes customers } i, j \text{ in this iteration} \\ 0, & \text{otherwise} \end{cases} \qquad (10)

4.2 Algorithm Operation Steps

Step 1: Initialize: t = 0; NC = 0; τ_ij(t) = C; Δτ_ij(t) = 0; place the m ants on the n business circles.
Step 2: Set the tabu-list index s = 1 and add each ant's starting point to its tabu list.
Step 3: Determine whether the tabu lists are full. If full, calculate the tour length each ant has traveled and update the optimal path; otherwise set s = s + 1, let the m ants select the next customer by the roulette wheel according to p_ij^k, add the selected customer to the tabu list, and return to the fullness check until the lists are full.
Step 4: Calculate Δτ_ij^k and update the pheromone; t = t + n; NC = NC + 1.
Step 5: Determine whether the termination conditions are met; if so, output the result; otherwise clear the tabu lists and return to Step 2 until the termination conditions are met (Fig. 2).
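A compact sketch of Steps 1–5 follows, reusing the next_customer helper sketched above; the random starting points, the tour_length helper, and the parameter defaults (taken from Table 2) are illustrative assumptions, not the authors' implementation.

```python
import random

def tour_length(tour, dist):
    # Length of one ant's tour over the distance matrix `dist`.
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

def rws_aco(n, m, dist, max_gen=50, alpha=1, beta=1, rho=0.15, Q=15):
    # Steps 1-5: build complete tours with roulette-wheel transitions, then
    # evaporate and deposit pheromone per formulas (9)-(10).
    tau = [[1.0] * n for _ in range(n)]
    eta = [[1.0 / dist[i][j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    best, best_len = None, float("inf")
    for _ in range(max_gen):
        tours = []
        for _ in range(m):                       # one tour per ant
            tabu = [random.randrange(n)]         # assumed random start point
            while len(tabu) < n:                 # Step 3: fill the tabu list
                allowed = [j for j in range(n) if j not in tabu]
                tabu.append(next_customer(tabu[-1], allowed, tau, eta, alpha, beta))
            tours.append(tabu)
        for i in range(n):                       # evaporation, formula (9)
            for j in range(n):
                tau[i][j] *= (1 - rho)
        for tour in tours:                       # deposit, formula (10)
            length = tour_length(tour, dist)
            if length < best_len:
                best, best_len = tour, length
            for a, b in zip(tour, tour[1:]):
                tau[a][b] += Q / length
    return best, best_len
```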

Fig. 2. Flow chart of RWS-ACO algorithm operation



5 Simulation Experiment
5.1 Experimental Environment and Data Description

The effectiveness of the constructed task allocation model and the optimized ant colony algorithm is verified through simulation experiments on the MATLAB R2016a simulation platform. The heuristic factor Eta is set to the reciprocal of the distance, the initial pheromone matrix is the identity matrix, and the parameter settings are shown in Table 2:

Table 2. Parameter settings

Parameter name | Value
Pop | 60
MAXGEN | 50
Alpha | 1
Beta | 1
Rho | 0.15
Q | 15
W | 9
c | 5
T | 10

The simulated order experiment data is shown in Table 3 below:

Table 3. Parameter table of simulated experiment data

Number Coordinate X Coordinate Y Start time End time Load


1 105 89 300 318 2.5
2 105 88 205 217 1.5
3 105 92 59 83 1.8
4 104 87 103 157 2
5 102 86 35 71 0.8
6 108 88 97 109 1.5
7 101 89 117 141 1
8 101 88 199 253 2.5
9 106 87 127 157 3
10 106 88 36 78 1.7
11 106 92 220 256 0.6
12 108 93 42 54 0.2
13 102 89 113 137 2.4
14 107 89 71 125 1.9
15 106 86 112 142 2
16 107 88 201 243 0.7
17 107 90 37 67 0.5
18 106 85 149 179 2.2
19 102 91 29 47 3.1
20 104 88 124 136 0.1

5.2 Analysis of Experimental Results


Substituting the above data into the experiment, the results of the order distribution
experiment are shown in Fig. 3 below:

Fig. 3. Optimal order allocation experiment

In the above figure, the asterisk locations are the business circle locations, and the
circled place is where the order is requested to be delivered. Each closed path is the order
delivery path of the corresponding courier. Without overtime or overloading, achieving the goal of the shortest delivery distance requires four couriers to complete the tasks of the two business circles. The specific order task allocation plan is as follows:
Courier No. 1: responsible for the delivery tasks of business circle 7; the orders to deliver are 5, 8, 13, and 19.
Courier No. 2: responsible for the delivery tasks of business circle 10; the orders to deliver are 1, 2, 4, and 20.
Courier No. 3: responsible for the delivery tasks of business circle 10; the orders to deliver are 9, 15, and 18.
Courier No. 4: responsible for the delivery tasks of business circle 10; the orders to deliver are 3, 6, 11, 12, 14, 16, and 17.
The fitness comparison chart of the ACO algorithm and the RWS-ACO algorithm is shown in Fig. 4 below:

Fig. 4. Algorithmic fitness comparison chart

The simulation experiments verify that the model and algorithm can reasonably allocate the orders posted on the platform to the corresponding couriers and deliver the goods to consumers within the specified time under the given constraints. The delivery destinations are allocated near-optimally, and the algorithm's running time is reduced.

6 Conclusion
This article first builds a task allocation model that takes the shortest delivery distance as the optimization goal and the delivery load, volume, and time as constraints. Secondly, based on the ant colony intelligence algorithm, the roulette-wheel algorithm is combined with the ant colony algorithm, and the resulting RWS-ACO algorithm is used to solve the model. Finally, simulation experiments verify that the algorithm can effectively achieve task allocation, retain randomness so that every consumer has a chance of being selected, and avoid falling into local optima. Some shortcomings remain: in a follow-up study, the actual distance between the merchant and the courier should be considered further.

Acknowledgment. This work is partly supported by the National Social Science Foundation (16BJY125), the Heilongjiang philosophy and social sciences research planning project (19JYB026), a key topic in 2020 of the 13th five-year plan of Educational Science in Heilongjiang Province (GJB1320276), the undergraduate teaching leading talent training program of Harbin University of Commerce (201907), a key project of teaching reform and teaching research of Harbin University of Commerce in 2020 (HSDJY202005(Z)), the innovation and entrepreneurship project for college students of Harbin University of Commerce (202010240059), a school-level scientific research project of Heilongjiang Oriental University (HDFKY200202), and key entrusted projects of higher education teaching reform in 2020 (SJGZ20200138).

References
1. Fu, D., Liu, Y.: Trust-aware task allocation in collaborative crowdsourcing model. Comput.
J. 64(6), 929–940 (2021)
2. Zhang, X., Su, J.: An approach to task recommendation in crowdsourcing based on 2-tuple
fuzzy linguistic method. Kybernetes 47(8), 1623–1641 (2018)
3. Li, B., Yang, X., Xuan, H.: A hybrid simulated annealing heuristic for multistage heterogeneous fleet scheduling with fleet sizing decisions. J. Adv. Transp. 2019, 1–19 (2019)
4. Alvarez, A., Munari, P.: An exact hybrid method for the vehicle routing problem with time
windows and multiple deliverymen. Comput. Oper. Res. 83, 1–12 (2017)
5. Wang, J., Wang, J., Han, Q.: Multivehicle task assignment based on collaborative neurody-
namic optimization with discrete hopfield networks. IEEE Trans. Neural Networks Learn.
Syst. 1–13 (2021)
6. Pan, Q., Pan, T., Dong, H., Gao, Z.: Research on task assignment to minimize travel cost for
spatio-temporal crowdsourcing. Wireless Com Network 2021, 59 (2021)
7. Chen, X., Zhang, P., Du, G., Li, F.: A distributed method for dynamic multi-robot task
allocation problems with critical time constraints. Robot. Auton. Syst. 118, 31–46 (2019)
8. Fu, D., Liu, Y., Yan, Z.: Fairness of task allocation in crowdsourcing workflows. Math. Probl.
Eng. 2021, 1–11 (2021)
9. Liang, D., Cao, W., Xu, Z., Wang, M.: A novel approach of two-stage three-way co-opetition
decision for crowdsourcing task allocation scheme. Inf. Sci. 559, 191–211 (2021)
10. Miao, C., Chen, G., Yan, C., Wu, Y.: Path planning optimization of indoor mobile robot based
on adaptive ant colony algorithm. Comput. Ind. Eng. 156, 107230 (2021)
11. Xia, Y., Fu, Z.: Improved tabu search algorithm for the open vehicle routing problem with
soft time windows and satisfaction rate. Cluster Comput. 22(1), 1–9 (2018)
12. Liu, J., Xu, K.: Budget-aware online task assignment in spatial crowdsourcing. World Wide
Web 23(1), 289–311 (2020)
13. Song, T., Xu, K., Li, J., Li, Y., Tong, Y.: Multi-skill aware task assignment in real-time spatial
crowdsourcing. GeoInformatica 24(1), 153–173 (2020)
14. Li, Y., Fang, J., Zeng, Y., Maag, B., Tong, Y., Zhang, L.: Two-sided online bipartite matching
in spatial data: experiments and analysis. GeoInformatica 24(1), 175–198 (2020)
Utilizing Vias and Machine Learning
for Design and Optimization of X-band
Antenna with Isoflux Pattern
for Nanosatellites

Maha A. Maged1(B) , Ahmed Youssef2 , Fatma Newagy3 ,


Mohammed El-Telbany4 , Ramy A. Gerguis3 , Nayera M. Abdulghany3 ,
George S. Youssef3 , and Youssef Y. Zaky3
1 National Authority for Remote Sensing and Space Sciences, Cairo, Egypt
maha maged@narss.sci.eg
2 Egyptian Space Agency, Cairo, Egypt
ahmed.youssef@egsa.gov.eg
3 Ain Shams University, Cairo, Egypt
fatma newagy@eng.asu.edu.eg, ramyadel16@icloud.com
4 Sinai University, Cairo, Egypt
mohammed.elsaid@su.edu.eg
https://www.narss.sci.eg, https://www.egsa.gov.eg, https://www.asu.edu.eg/, https://www.su.edu.eg/

Abstract. Reducing antenna hardware, such as the physical structure, and generating isoflux radiation are two important factors in designing nanosatellite antennas. Machine learning (ML) techniques show great promise for antenna design and optimization. In this work, we report the design and simulation results of a novel circularly polarized antenna with isoflux radiation for nanosatellites which, through the use of cylinders of vias and genetic algorithms (GAs) as a machine learning optimization technique, closely satisfies the isoflux gain pattern requirements in the designed frequency band (8.2–8.3 GHz) and shows excellent circular polarization, with an axial ratio (AR) lower than 3 dB over the resonance frequency band at the limit of coverage. The antenna has been designed and successfully simulated.

Keywords: Machine learning · Genetic algorithms · Isoflux antenna · Vias · Nanosatellites

1 Introduction
Satellite antennas play an important role in Earth observation missions, where a high transmission rate is necessary to downlink sensed images in different spectral bands for an enormous number of remote sensing applications. A wide-coverage, or Isoflux, pattern is one of the requirements for the antenna
system in satellites. An Isoflux antenna on a satellite provides a uniform power density distribution over a well-defined portion of the Earth coverage, which is desired for long visibility times. The Isoflux pattern is advantageous for remote
sensing satellites because it improves antenna downlink data handling. However, this requirement is highly difficult to fulfill for many antenna topologies, and efficient low-mass, low-cost communication antennas with optimally shaped Isoflux beams are a major asset for space missions. Many antennas have been identified as good candidates for the needed Isoflux pattern, such as the microstrip antenna with cavity, the compact antenna with parasitic elements, and the helix antenna; the most used are the choke horn and choke ring antennas [1–3]. The choke ring antenna consists of a number of uniform-length conductive concentric cylinders/rings around a central radiator [1–3]. Unfortunately, their physical size and weight are not compliant with a nanosatellite application. The second solution for the Isoflux pattern, a beam-formed antenna such as an array of patches [4], is interesting but suffers from large dimensions and requires a complex feeding network. Both solutions are very difficult to implement in nanosatellite structures.
Machine learning (ML) techniques show great promise in antenna design and optimization [5]. Supervised learning techniques rely heavily on the availability of data, which can be used to find a mathematical model describing the input and output data. From an antenna design perspective, this data is acquired by simulating the desired antenna over a wide range of parameter values using computational electromagnetics simulation software. In this paper, we present a novel X-band Isoflux pattern antenna which, through the use of vias technology and genetic algorithms (GAs) as a machine learning optimization technique [6], closely satisfies the gain pattern requirements in the requested frequency band (8.2–8.3 GHz). In this contribution, vias are used to create equal-size choke rings that help generate the isoflux radiation pattern, and their diameters are optimized with a GA to learn and achieve the desired Isoflux pattern. This paper presents the design and simulation of a new X-band antenna with an Isoflux pattern for nanosatellites using vias technology and machine learning optimization. The contributions of this paper are as follows:

1. Design of a novel X-band antenna using cylinders of vias.
2. Learning and optimization of the X-band antenna performance for the Isoflux pattern using a GA.

The remainder of this paper is organized as follows. An isoflux X-band antenna description is given in Sect. 2. The antenna structure design optimization using machine learning is presented in Sect. 3. In Sect. 4, the simulation results for the antenna are presented. Finally, Sect. 5 concludes the paper.

2 Antenna Structure Design

The basic structure of the designed antenna, shown in Fig. 1, consists of two choke rings fed by a radiator in order to obtain the desired isoflux pattern. The antenna
architecture was optimized to comply with integration on top of a CubeSat. The calculated positions of the two concentric rings are optimized using a multi-objective optimization technique; the thickness of each ring is 1.5 mm. To obtain circular polarization, the driving patch antenna is excited by a compact sequential-rotation phase-feed microstrip circuit with a set of parasitic crossed dipoles. The feeding circuit consists of one oversized 180◦ hybrid ring coupler and two 90◦ hybrid couplers, which are folded to fit inside the 180◦ ring coupler [3].

Fig. 1. The X-band Isoflux antenna structure assembly.

The patch and the circuit are printed on two stacked RO4003C substrates, which creates an embedded ground plane between the patch and the feeding network board. The patch antenna is connected to the coupler through four via holes crossing the embedded ground plane. The resulting microstrip assembly is placed on a metal cylinder (4 mm high and 33 mm in diameter), which is surrounded by twelve parasitic crossed dipoles placed on a 45 mm-diameter circle, printed on both sides of a second 1.524 mm-thick RT5880 substrate. Figure 2 illustrates the structure of the X-band Isoflux antenna from the front and back views.

3 Antenna Structure Optimization


The optimization process of the isoflux antenna amounts to a synthesis methodology for shaping antenna patterns with wide-angle coverage when a large number of geometrical parameters are involved. The role of the set of concentric rings is to manipulate the propagation of surface waves on the ground plane so as to control the current distribution on the antenna aperture and to shape the radiation pattern. The current distribution is set by the positions and physical parameters of the rings, which play an important role in the generation of the Isoflux pattern. This relation can be cast as a nonlinear minimization problem [7]:

Fig. 2. Front and back views of the antenna showing the two cylinders of vias and the feeding network.

x^{*} \in \arg\min_{x} F(f(x)), \quad \text{subject to } g_i(x) \ge 0,\ i = 1, \ldots, m, \quad x \in [l, u]^{d} \subset \mathbb{R}^{n} \qquad (1)
where x is the vector of d design variables, X = [l, u]^d represents the feasible region of the design variable x, f denotes the response results of the antenna of interest, F(x) is the antenna optimization objective function, and g_i(x) are the constraints. The vector x^* contains the optimal parameters of the design. Normally, f is obtained through computationally expensive electromagnetic simulation. Since our main goal is synthesis of the Isoflux pattern, we apply machine learning concepts by learning the desired gain pattern at the X-band target frequency for the designed antenna. In this case, the objective function measures the distance between the achieved pattern and the desired one; it is commonly called the error function in supervised learning [8–10]. The objective function F corresponding to the design is given by:


x^{*} \in \arg\min_{x} F(x) = \sum_{i}^{N} \big( P_d(f_i(x)) - P_a(f_i(x)) \big)^{2}, \quad \text{s.t. } g(x) = v(x) - v_{\min} \ge 0, \quad x \in [l, u]^{d} \qquad (2)
where P_d is the desired pattern and P_a the actual pattern, both expressed in dB, and v(x) is the antenna volume; x_u and x_l are the upper and lower limits of the design variables, respectively.
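As a minimal illustration (not the authors' code), the error function of Eq. (2) could be evaluated as below; the simulate_pattern callback stands in for the CST run described in Sect. 4, and the penalty handling of the volume constraint is an assumption.

```python
import numpy as np

def pattern_error(radii, desired_db, simulate_pattern, volume=None, v_min=0.0):
    # Objective (2): sum of squared differences (in dB) between the desired
    # isoflux pattern P_d and the simulated pattern P_a for candidate radii
    # x = (r1, r2).  `simulate_pattern` and `volume` are assumed stand-ins
    # for the electromagnetic solver, not an actual CST API.
    actual_db = simulate_pattern(radii)            # P_a over N angular samples
    err = float(np.sum((np.asarray(desired_db) - np.asarray(actual_db)) ** 2))
    if volume is not None and volume(radii) < v_min:
        err += 1e6                                 # assumed infeasibility penalty
    return err
```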

4 Simulation Results
The designed X-band Isoflux vias choke ring antenna has been successfully designed and simulated using advanced EM simulation and the Python programming language. CST Microwave Studio [11], GA optimization, and Python library simulation tools are employed to efficiently evaluate the candidate antenna designs and determine their gain.

Fig. 3. GA-based machine learning algorithm framework for synthesizing the Isoflux pattern

CST2020 provides a Python programming interface that can execute Visual Basic (VBA) scripts from the Python environment. To generate antenna simulation data, the VBA language integrated in CST and Python scripts automatically run the simulation programs in batches; in this way, antenna pattern data acquisition and learning can be performed in Python. The GA-based antenna design framework that has been developed is shown in Fig. 3. The optimization and learning process settings, such as the parameters,
the objective functions, and the visualization of the performance, are preset in the main program as follows: maximum number of iterations = 20, number of chromosomes = 5, P_cross = 0.5, P_mu = 0.1, and range bounds of the two radii [l, u]^2 = [20, 22] × [26, 30].

Table 1. The optimized radii of the vias rings for synthesizing the Isoflux pattern

Parameter Value
r1 20.4 mm
r2 28.6 mm

The optimization parameters are predicted by the GA and are updated in CST MWS. The error functions f are calculated by the time-domain solver of CST2020. After calculation, the results are returned to the main program, where the evaluation, comparison of performance, and decision making are performed. After carrying out the simulations, the optimized radii of the rings of vias for learning the Isoflux pattern have been obtained; they are given in Table 1.
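As a sketch of how this parameter-update loop can be automated, the snippet below assumes the cst.interface Python module shipped with CST 2020; the project path and the parameter names r1 and r2 are hypothetical, and the exact VBA commands may differ between CST versions.

```python
# Sketch: pushing GA-predicted radii into CST MWS and starting the solver.
import cst.interface

de = cst.interface.DesignEnvironment()
prj = de.open_project(r"C:\work\isoflux_antenna.cst")   # hypothetical path

def update_radii_and_solve(r1_mm, r2_mm):
    vba = (
        'StoreParameter "r1", {:.3f}\n'
        'StoreParameter "r2", {:.3f}\n'
        'Rebuild\n'
        'Solver.Start'
    ).format(r1_mm, r2_mm)
    prj.schematic.execute_vba_code(vba)   # run the VBA commands inside CST
```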

Fig. 4. Return loss of the designed antenna with the best optimized radii

Using those dimensions, the simulated return loss and gain are presented in Fig. 4 and Fig. 5, respectively. At the operating frequency, the return loss is −20 dB and the gain is > 3 dB at the desired beam-width; the design also shows excellent cross-polarization performance. Figure 6 shows the search for the best learned Isoflux patterns, which nearly match the desired pattern, through the learning and optimization process.

Fig. 5. Gain pattern of the designed antenna with the best optimized radii at φ = 45◦

Fig. 6. The Isoflux gain patterns of designed antenna through learning of best radii
values at φ = 45◦ .

Also, Fig. 7 shows the learning curve of the GA for optimizing the radii of the rings of vias. The fitness value of the best chromosome stabilizes across the generations, and convergence to a solution is obtained in each of the five independent runs. As can be seen in Fig. 8, the axial ratio (AR) of the antenna is below 3 dB within the whole 65◦ range in the plane cut of φ = 45◦.

Fig. 7. The learning curve of the genetic algorithm for five runs.

Fig. 8. The AR of the X-band Isoflux antenna.

5 Conclusion
This paper presents a new design for an X-Band Isoflux choke ring antenna utilizing vias for nanosatellite applications. The design introduces supervised learning and optimization of the radii of the vias-based choke rings using a GA, which results not only in gain improvement but also in learning the Isoflux pattern.
The optimized X-band antenna has a wide impedance bandwidth of > 200 MHz, gain > 3 dB over the entire coverage range, return loss below −20 dB at the resonant frequency, axial ratio below 3 dB, and VSWR below 2 in the desired X-band frequency range. The final X-band antenna design will be used in the nanosatellites of the Egyptian Space Agency (EgSA).

Acknowledgments. The work described in this paper is a spin-off of a bigger project, namely the Egyptian Universities Education Satellites project, which is supported by EgSA.

References
1. Colantonio, D., Rosito, C.: A spaceborne telemetry loaded bifilar helical antenna for
LEO satellites. In: Proceedings of the International Microwave and Optoelectronics
Conference, Belem, Brazil, pp. 741–745 (2009). https://doi.org/10.1109/IMOC.
2009.5427481
2. Nawaz, W., Ali, A.: Improvement of gain in dual fed X band isoflux choke horn
antenna for use in LEO satellite mission. In: Fourth International Conference on
Aerospace Science and Engineering (ICASE), pp. 1–4. IEEE (2015)
3. Fouany J., et al.: New concept of telemetry X-band circularly polarized antenna
payload for CubeSat. IEEE Antennas Wirel. Propag. Lett. 17, 2987–2991 (2017).
https://doi.org/10.1109/LAWP.2017.2756565
4. Maldonadoa, A., Panduroa, M., del Rio Bociob, C., Mendez, A.: Design of concen-
tric ring antenna array for a reconfigurable isoflux pattern. J. Electromagn. Waves
Appl. 27(12), 1483–1495 (2013). https://doi.org/10.1080/09205071.2013.816877
5. El Misilmani, H., Naous, T., Al Khatib, S.: A review on the design and optimization
of antennas using machine learning algorithms and techniques. Int. J. RF Microw.
Comput. Aided Eng., 1–28 (2020). https://doi.org/10.1002/mmce.22356
6. Haupt, R.: Genetic Algorithms in Electromagnetics, 1st edn. Wiley, Hoboken (2013)
7. Koziel, S., Ogurtsov, S.: Simulation-driven design in microwave engineering: methods. In: Koziel, S., Yang, X.S. (eds.) Computational Optimization, Methods and Algorithms. SCI, vol. 356, pp. 153–178. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20859-1_8
8. Hagglund, A.: Using genetic algorithms to develop a conformal VHF antenna.
Master’s thesis, Stockholm-Sweden (2014)
9. Ledesma, S., Ruiz-Pinales, J., Cerda-Villafaña, G., Garcia-Hernandez, M.: A hybrid method to design wire antennas: design and optimization of antennas using artificial intelligence. IEEE Antennas Propag. Mag. 57(5), 23–31 (2015). https://doi.org/10.1109/MAP.2015.2453912
10. Smith, J., Baginski, M.: Thin-wire antenna design using a novel branching scheme
and genetic algorithm optimization. IEEE Antennas Propag. Mag. 67(5), 2934–
2941 (2019). https://doi.org/10.1109/TAP.2019.2902960
11. CST Team: CST Studio Suite: Computer Simulation Technology. https://www.
cst.com/products/cstmws
Research on Automatic Generation System
of the Secondary Safety Measures of Smart
Substation Based on Archival Data

Sun Liye(B)

Electric Power Science Research Institute of State Grid Heilongjiang Provincial Electric Power
Co., Ltd., Haerbin 150030, Heilongjiang, China

Abstract. This article summarizes and analyzes the secondary safety measures and regulations in the archival data of smart substation work tickets, establishes an electrical archives database of typical safety measures, and converts the relay protection language of the safety measures in the archives database into semantics identifiable by computer. These semantics are matched against the implementation and operation objects modeled for substation maintenance work by analyzing SCD file archives, realizing the automatic generation of operating, maintenance, and repair safety measures for the smart substation secondary system. With the aid of the archival data of safety measures, the approach effectively improves the efficiency and accuracy of safety measure compilation, provides a significant decision-making basis for safe operations, and improves the technical level of safe production.

Keywords: Archives · Safety measures · Relay protection · Semantic matching

1 Introduction

With the complete construction of smart substations and the continuous improvement of standardization in operating maintenance and repair, the adopted safety measures have similarities and can be taken as references. Currently, the secondary safety measures on the various on-site operation work tickets in substations are all handwritten and archived on paper with a storage period of one year, so the archival digitization of work tickets is low [1]. Stored this way, they lack digitization and systematic analysis and do not support quick searching and data analysis. Each operating maintenance and repair has to repeat manual editing, with low efficiency and high mistake rates, which easily causes safety accidents.
This article comprehensively summarizes the typical safety measures in smart substation work ticket archives, establishes a complete electrical archives database for typical safety measures of the secondary system and operating procedures, converts the relay protection language of the safety measures in the database into semantics identifiable by computer, analyzes smart substation SCD file archives, and models operating maintenance and repair work to complete the automatic matching between the computer semantics of relay protection and the modeled implementation and operation objects,
and realizes the automatic generation of typical safety measures. The system demonstrates the various implementation objects and operation objects via a graphical interface to form a step-by-step demonstration of safety measures [2]. Work ticket archival data will improve the accuracy and efficiency of safety measure compilation and provide a significant basis for safe operations.

2 Typical Relay Protection Safety Measures of Archival Data


Through data summary and analysis of work ticket archives, the typical safety measures of a smart substation are normally described by semantics such as "Unplug the direct optical fiber connecting the XX consolidation unit on the XX protection device" and "Exit the XX soft pressure plate of the XX protection" [3]. Deep research on typical safety measures shows that their abstract content descriptions are as follows: within a substation of a given voltage class and wiring mode, when some repair is carried out on one specific type of device in one specific piece of running equipment, certain safety measures are taken to isolate the interoperation relationship between the equipment to be repaired and the operating equipment, to ensure that the operating equipment is safe [4, 5]. In typical safety measures, this connection is defined as an "implementation object". In contrast, the specific implementation of an "implementation object" is accomplished through operations on one or more specific operation objects. In a smart substation, the operation objects fall into 4 categories: exiting the functional pressure plate of a protection device; putting in the maintenance pressure plate of a device; exiting the soft or hard pressure plate for communication between devices (or opening the loop connection); and unplugging the optical fiber that communicates between devices.

Table 1. The relationships between the object of implementation and the object of the operation

Equipment to be maintained: A
Implementation object: A → B
Operation objects:
➀ Output pressure plate from A to B
➁ Functional pressure plate of A
➂ Input pressure plate of B
➃ Maintenance pressure plate of A
➄ Optical fibre from A to B

The final complete safety measures are the orderly combination of operation objects.
Table 1 indicates the relationship between the implementation object and operation object
related to the equipment A to be maintained.
Hence, the typical safety measures in the archive database can be formalized into semantic rules understandable by the computer. When maintenance staff write specific safety measures, the system can combine the voltage class and wiring mode of the substation, the type of equipment to be maintained, the primary operational condition of the equipment, the maintenance type, and other elements to match the appropriate semantic rules, that is, to model the general implementation object and
operation object in semantic rules into specific implementation object and operation
object examples in the substation to automatically generate safety measures tickets that
are supposed to be carried out during maintenance operations.
The functional pressure plate, soft pressure plate, and maintenance pressure plate
in the specific examples of implementation object and operation object all have been
illustrated in the SCD file in line with standard methods, which can be easily computer
modeled and used to match the general semantic rules. Although optical fiber, the secondary connection, and other operation objects are not described in the SCD file and cannot achieve model matching, the optical fiber and secondary connection on site are physical, visible, and convenient for on-site verification, and can be marked during matching. In this way, they can be taken as items to be confirmed in the order of automatically generated safety measures, avoiding a large amount of supplementary modeling work while still guiding the on-site compilation and implementation of safety measure tickets.

3 The Automatic Generation of Safety Measures in Typical Safety Measures Semantics

3.1 The Formalization of Safety Measures for Typical Secondary Maintenance

The formalization of the typical secondary safety measures consists of 3 parts: defini-
tion of substation design scheme, the definition of maintenance mode, and formalization
of implementation object and operation object of safety measures. The design scheme
of substation mainly concerns the following dimensional elements: voltage class, main
wiring mode, protection configuration, sampling form (electronic voltage and current
transformer or general voltage and current transformer), whether adopting the consoli-
dation unit, and whether adopting the smart terminal. The maintenance mode considers the power-off scope of the primary equipment (power failure at the whole station, voltage-class power failure, interval power failure, or no power failure) and the maintenance type (initial inspection, routine test, defect elimination, or upgrading).
In the operation procedure, for each substation design scheme, every type of equipment (busbar protection, circuit protection, main transformer protection, consolidation unit, smart terminal, spare power automatic switching, etc.) has its detailed rules on safety measures under a determined maintenance mode. In combination with the operation procedure, the implementation objects and operation objects in the safety measures must be identified, and then modeled and objectified one by one.
The formalization of the implementation object can be expressed as:

[Primary Blackout Range] [Equipment to be Inspected] [Maintenance Type] . {[Start Interval Type] [Start Device Type] to [Target Interval Type] [Target Device Type]} [Implementation Object Semantics]     (1)

The formalization of the operation object can be expressed as:

[Implementation Object] . [Start Interval Type] [Start Device Type] . [Operation Object]
or [Implementation Object] . [Target Interval Type] [Target Device Type] . [Operation Object]     (2)

Implementation object semantics is the logic semantics of the implementation object and is a semantic matching keyword. Table 2 gives an example illustrating the implementation object semantics of busbar protection.

Table 2. The semantics of objects implemented

Equipment type to be inspected: Busbar protector
Semantics of implementation object: Current; Voltage; Breaker position; Tripping operation; Start failure; Failure of united jumping
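To make the formalized templates of Eqs. (1) and (2) concrete, a minimal Python sketch is given below; the field names paraphrase the template elements, and the example values are hypothetical, loosely mirroring the busbar protection example used later in Sect. 3.3.

```python
from typing import NamedTuple

class ImplementationObject(NamedTuple):
    blackout_range: str    # power-off scope of the primary equipment
    equipment: str         # equipment to be inspected
    maintenance_type: str
    start_device: str      # start interval/device type
    target_device: str     # target interval/device type
    semantics: str         # matching keyword, e.g. "Start failure"

class OperationObject(NamedTuple):
    impl: ImplementationObject
    device: str            # the start or target device the action applies to
    action: str            # e.g. "exit soft pressure plate"

# Hypothetical instance of the rule
# "[Circuit protection] to [Busbar protection] [Start failure]":
impl = ImplementationObject("interval power failure", "busbar protection",
                            "initial inspection", "circuit protection",
                            "busbar protection", "Start failure")
op = OperationObject(impl, "circuit protection",
                     "exit start-failure GOOSE transmit soft pressure plate")
```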

3.2 Substation Archive Data Object Modeling

The main process of substation archive object modeling is to analyze the SCD file archive and model the various data objects. The specific process is to determine the substation design scheme type, that is, to determine the substation voltage class, design scheme, protection configuration, sampling form (electronic voltage and current transformer or general voltage and current transformer), whether the consolidation unit is adopted, and whether the smart terminal is adopted [6]. Based on the current IED naming practice in SCD files followed by domestic integrators, the models that define the various types and intervals of secondary equipment are extracted from the SCD file. For example, PM stands for busbar protection, and PL stands for circuit protection.
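As an illustration, a minimal Python sketch of this prefix-based device typing is given below, using the standard IEC 61850 SCL namespace; the prefix table extends the PM/PL examples from the text and is otherwise an assumption.

```python
import xml.etree.ElementTree as ET

NS = {"scl": "http://www.iec.ch/61850/2003/SCL"}   # IEC 61850-6 namespace
TYPE_BY_PREFIX = {"PM": "busbar protection",        # from the text
                  "PL": "circuit protection",       # from the text
                  "ML": "consolidation unit"}       # assumed convention

def classify_ieds(scd_path):
    """Yield (IED name, inferred device type) pairs from an SCD file."""
    root = ET.parse(scd_path).getroot()
    for ied in root.findall("scl:IED", NS):
        name = ied.get("name", "")
        yield name, TYPE_BY_PREFIX.get(name[:2], "unknown")
```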
When analyzing SCD files, the communication relationships between any two devices, including GOOSE transceivers and SV transceivers, are defined as implementation objects between the devices. Similar virtual signals are then combined into groups, and each group of virtual signals is defined as an implementation object, with the grouping method based on matching keywords in the semantics; refer to Table 3.

3.3 The Semantic Rules and Semantic Matching with Case Objects in Substation

Firstly, for semantic rules and semantic matching with case objects in the substation, the equipment to be inspected and the maintenance mode need to be identified; this information is the basis for matching against the typical safety measure semantic rules in maintenance and repair scenarios. Then the typical secondary maintenance safety measure semantic rules are applied to match each implementation object and each operation object modeled in the substation. The specific process is as follows:

Table 3. Semantic rules for object extraction

Semantics of implementation object | Initial end to terminal end | Virtual signal type | Keywords for fuzzy matching
Current sampling | Circuit consolidation unit to circuit protection | SV | Current
Current sampling | Main transformer consolidation unit to main transformer protection | SV | (High voltage * Current) or (Medium voltage * Current) or (Low voltage * Current) or (Common winding * Current) or (Drivepipe * Current)
Start failure | Circuit protection to busbar protection | GOOSE | (Circuit branch * Failure) or (Main transformer * Failure)
Start failure | Main transformer protection to busbar protection | GOOSE | (Failure of united jumping) or (Failure of turn-in) or (Failure of start)
Start failure | Busbar protection to busbar protection | GOOSE | (Circuit branch * Failure) or (Buscouple * Failure)
…… | …… | …… | ……

1. Select a piece of equipment to be inspected in the substation and set the maintenance mode, that is, clarify one power failure condition and the type of equipment maintenance.
2. The primary equipment's power-off range, the maintenance type, the substation wiring mode, and other information are matched against the semantic rules used in typical secondary maintenance safety measures.
3. Semantic matching of the implementation objects is conducted based on the semantic rules. The matching is performed intelligently based on the logical semantics of the implementation objects in the semantic rules. For example, the typical safety measure "[Circuit protection] to [Busbar protection] [Start failure]" matches [Start switch 1 failure GOOSE transceiver soft pressure plate] from "[PL612A 220 kV × × Circuit 57211 circuit protection A set] to [PM220A 220 kV busbar protection A set]".
4. Semantic matching of the operation objects is carried out based on the semantic rules. That is, the operation objects associated with the case objects defined in the semantic rules are matched, based on keywords, with the soft pressure plates and virtual loops associated with the matched implementation objects of the substation. For example, the operation object associated with the implementation object in the semantic rule "Exit [Sample receiving pressure plate] on [Line Protection] connecting [line consolidation unit]" matches "[consolidation unit direct receiving soft pressure plate] exit [PL612A 220 kV × × Circuit 57211 Circuit protection A set] connecting [ML612A 220 kV × × Circuit 57211 Circuit consolidation unit A set]". A minimal sketch of this keyword-based fuzzy matching is given below.
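The following Python sketch shows one way the Table 3 rule syntax could be matched against signal descriptions, treating "*" as a wildcard and "or" as joining alternatives; the rule syntax interpretation and the sample signal description are assumptions, not the system's actual implementation.

```python
import re

def rule_to_regexes(rule):
    """Translate one fuzzy-matching rule, e.g. '(High voltage * Current)
    or (Low voltage * Current)', into a list of compiled regexes."""
    alternatives = [p.strip(" ()") for p in rule.split(" or ")]
    regexes = []
    for alt in alternatives:
        # Each "*" may stand for arbitrary intervening text.
        parts = [re.escape(tok.strip()) for tok in alt.split("*")]
        regexes.append(re.compile(".*".join(parts)))
    return regexes

def matches(rule, signal_description):
    return any(r.search(signal_description) for r in rule_to_regexes(rule))

# Invented SV signal description for illustration:
rule = "(High voltage * Current) or (Low voltage * Current)"
print(matches(rule, "High voltage side A-phase Current sampling"))  # True
```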

3.4 Editing, Adjustment, and Confirmation of Safety Measures Tickets


For the editing, adjustment, and confirmation of the safety measures ticket, the implementation objects and operation objects related to the device to be inspected are shown first, followed by the matching results against the typical semantic rules, so that the maintenance staff can see the pressure plates or optical fibers that can be opened or unplugged between the equipment to be inspected and the associated equipment as the executable safety measures, with a visual presentation demonstrated in the form of visible breaks. Matching of the operation objects has three possible results:

1. In the SCD file, the modeled operation object in the semantic rule base can match
with the data object in the SCD file, such as the trip outlet soft pressure plate.
The matching result directly replaces the general operation object with the specific
operation object case.
2. The object is modeled in the SCD file, but due to reasons such as a nonstandard description, the general operation object in the semantic rules fails to match an operation case object. Such cases should be marked as "to be confirmed" so that on-site staff can confirm whether the operation object to be matched exists. If the operation object exists, it is an option for the safety measures; if not, the optical fiber is unplugged for disconnection.
3. Because some general operation objects in the semantic rules are not modeled in the SCD file, they cannot undergo model matching. The automatically generated safety measure tickets will directly list the general operation objects defined in the typical safety measure semantics, identified as "to be confirmed", for on-site staff to confirm the operation case object, such as the optical fiber or the secondary connection.

4 Safety Measures Automatically Generated System


4.1 Basic Structure of System
The automatic generation system for safety measures is developed in C/S mode and applies a combination of centralized and distributed basic data maintenance. When the main server is maintained, the client data can be updated automatically, guaranteeing real-time synchronization of the centralized and distributed system data. When the server cannot be connected, the client can run independently offline. The server is configured within the smart substation configuration file management and control system, built on the Spring + G4Studio framework and developed in Java; HTTP + JSON realizes data access and file transfer with the client. A MySQL relational database is used as the server database.
The client database uses an SQLite file database, which makes it easy to upload and download data between client and server. The client is developed by integrating Web technology and Node technology, built on the NW.js framework, and provided in the form of PC client software. The overall structure is shown in Fig. 1:

Fig. 1. The overall structure of automatic generating safety measures system

The overall structure comprises the data, business analysis, service, human-computer
interaction, and management layers.

Data Layer. The data layer is divided into smart substation archive databases, including
public database, typical scheme database, SCD information database, semantic rule base,
maintenance rules, sequential verification rules, etc.

Business Analysis Layer. The business analysis layer mainly analyzes the association
between SCD files, substation data models, and typical schemes, and the generation of
safety measures.

Service Layer. The service layer mainly provides graphical services and sequential
calibration services.

Human-Computer Interaction Layer. The human-computer interaction layer mainly includes data instantiation adjustment, safety measure ticket compilation, graphical configuration of the pressure plate and virtual loop relationship, graphical preview of safety measures, and safety measure order checking. Data instantiation adjustment adds a human confirmation and adjustment function for cases where the current substation data model and typical matching rules are incomplete.
The graphical configuration of the relationship between the pressure plate and virtual
loop is to associate the pressure plate and virtual loop in the substation data model to
form a complete loop. The graphical rehearsal of safety measures is a demonstration of the safety measures on graphics to assist inspection personnel in checking whether the compilation of the safety measures is reasonable.
Management Layer. The management of safety measures data is mainly to realize the
upload and download of intermediate data of safety measures and the query, upload,
and download of safety measures tickets. At the same time, through the data interface
layer and smart substation configuration file management and control system for data
interaction to achieve unified data archiving and management of safety measures data.

4.2 Main Performance Indexes of the System


The automatic generation system for safety measures can run uninterrupted 7 × 24 h. In case of failure, it gives an alarm in time and has automatic and manual recovery measures to quickly restore normal operation after an error. The automatic recovery time is less than 15 min, and the manual recovery time is less than 2 h (Table 4).

Table 4. Main performance indexes of the system

Database capacity: ≥1000 GB
Maximum supported switched data unit length: 1000 MB
Annual system availability: ≥99.99%
MTBF of server, workstation, and network equipment: ≥10,000 h
Response delay of database switching: ≤3 s
Average response time of system operation: <3 s
Response delay of database condition retrieval interface: ≤1 s
Response time of multi-condition accurate query: ≤15 s
Response time of single-condition fuzzy query: ≤25 s
Response time of network visual interface: ≤30 s

5 The Automatically Generated Case of the Secondary Maintenance Safety Measures
The 220 kV busbar A set of protection devices shall first be inspected under the condition of no power failure of the primary equipment. According to practice, the initial inspection of busbar protection (A4 maintenance) shall be carried out under the condition of power failure of the primary equipment at all busbar intervals, and the single-device outage maintenance mode shall be adopted. As shown in Fig. 2, the corresponding typical tickets can be screened by selecting the devices to be inspected and the type of work (Figs. 3, 4 and 5).

Fig. 2. Busbar protection initial safety measures under the condition of power failure of the
primary equipment

In the case of 220 kV circuit protection, the completeness of the safety measures of the initial and secondary inspection can reach 100% under the condition of primary equipment power failure. The generated secondary safety measures
cover all operations such as soft pressing plate, functional pressing plate, maintenance
pressing plate, and pulling out optical fiber, and ensure that all operations are free from
omission. It can find the virtual circuit, soft pressing plate, maintenance pressing plate,
optical fiber, etc., contained in the typical ticket but not found in the SCD file. It can also identify virtual circuits that are in the SCD documents but not involved in the typical tickets. The recognition rate is more than 95%, and the system automatically prompts the operation and maintenance personnel to verify, supporting safe and stable work.

Fig. 3. Implementation object of busbar protection

Fig. 4. Specific operation steps



Fig. 5. Visual display of the initial inspection of busbar protection

6 Conclusion

This paper proposes semantic rules for object extraction to establish a complete safety measure database. The archival data define the division of responsibilities, the items, the implementation and recovery sequence, and the format and content of safety measures for on-site work on the secondary circuit. The implementation object and operation object models of the SCD file are automatically matched with the implementation objects and operation objects in the typical ticket, and the current operation safety measures are automatically generated based on the historical archival data. The efficiency and accuracy of preparing safety measures are about 20% higher than with the existing technology and references, a significant improvement.

References
1. Luo, Z.K.: Study on secondary safety measures of smart substation. Electr. Eng. 15, 144–146
(2019)
2. Sheng, H.H., Wang, D.L., Ma, W., Luo, S.J., et al.: Exploration of intelligent operation manage-
ment system of relay protection based on big data. Power Syst. Prot. Control. 47(22), 168–175
(2019)
3. Geng, S.B., Zhao, C.L., Gao, X., et al.: Research and application of the strategy for repair
security measures based on formalized description mechanism. Power Syst. Prot. Control.
46(22), 178–186 (2018)
4. Xu, J.Y., Song, F.H., Lu, Z., et al.: Design and realization of the secondary maintenance safety
measures visualization and one-touch operation system in smart substation. Power Syst. Prot.
Control. 45(16), 136–144 (2018)

5. Hou, K.Y., Shao, G.H., Wang, H.M., et al.: Research on practical power system stability analysis
algorithm based on modified SVM. Prot. Control Mod. Power Syst. 3(3), 119–125 (2018)
6. Bu, Q.S., Gao, L., Yan, Z.W., et al.: Anti-maloperation strategy and realization of soft-pressing board of relay protection in intelligent substation. Electr. Power Autom. Equipment 36(12), 156–160+168 (2016)
Decision-Making Framework of Supply Chain
Service Innovation Based on Big Data

Haibo Zhu1,2(B)
1 Harbin University of Commerce, Harbin 150028, China
2 Heilongjiang Provincial Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. Information and communication technologies related to big data have been widely used in various fields. Well-known global enterprises obtain various structured data through big data platforms and quickly find ways to increase profits in the data. Big data and related technologies are mainly used for business value-add, such as accelerating product and service innovation, optimizing and improving the production process in cooperation with the Internet of Things (IoT), improving the level of product quality management, and promoting the transformation of industry toward smart service. In the past, innovation mainly occurred within the enterprise, but open and networked innovation makes it possible for products to interact with users before mass production. Questions of how to develop products that meet customers' needs, which operation mode to choose, and which decision-making mechanism to adopt have therefore become particularly prominent. Based on the above, this paper briefly describes channel service innovation and its influencing factors under the application of big data and proposes a decision-making framework.

Keywords: Supply chain · Big data · Decision framework · Service innovation

1 Supply Chain Service Innovation and Its Related Factors


1.1 Service Innovation
Service Innovation Theory. Consider the service innovation of enterprises in the channel. In the current competitive production environment, facing many potential competitors, service innovation mainly includes four aspects: joint innovation with the network, adding service technology in the process of production and service, active linkage innovation with customers, and technological innovation. Service innovation is a new solution to attract VIP or senior customers in service [1]. However, many service innovations only consider the service aspect and ignore other influencing factors. Not only the professional knowledge of service but also technical knowledge is very important in service innovation.
Robert et al. established a service innovation model from the four dimensions of the
concept, interface, system delivery, and technology method. Each dimension is affected
by the comprehensive effect of various factors, so the innovation output is not the same
[2].


Individual and Whole Service Innovation in Supply Chain. The relationship between
the supply chain and enterprises is bidirectional. The supply chain is composed of enter-
prise nodes, and the supply chain affects enterprise performance and operation. There-
fore, service innovation cannot be realized by a single enterprise or a single channel; only through the joint action of all enterprises can it be successfully implemented. We should make full use of the knowledge reserves, profitability, and resources of all parties, and fully grasp the internal mechanism, to integrate and improve the supply chain.
Service Innovation in the Context of Big Data. The development of data and infor-
mation technology has triggered the global information revolution and become the core
competitive highland of the next global production revolution. The international compet-
itiveness of countries and regions with high levels of information technology is relatively
strong. In the current era, service innovation should be a new service model that can
increase comprehensive competitiveness. It also requires paying attention to customer
needs, using information technology and super network to explore the value of knowl-
edge service, and increasing enterprise profits. Chen et al. [3] proposed that big data
can be used for service innovation, such as analyzing customer behavior and preference,
predicting customer consumption. Wang et al. believe that improving innovation ability means mining the valuable customer information hidden in big data by constructing an analysis framework. From this, we can see that most past research focused on the innovation of a single enterprise, paid little attention to the customer experience, and rarely integrated and innovated across the whole supply chain [4].
Service Innovation Ability. Service innovation needs the joint action of individual and
team, and its ability is also composed of individual innovation ability and team inno-
vation ability. The individual component mainly refers to induction education, continuous training at work, accumulated work experience, and the interaction behavior among individuals, among teams, and between individuals and teams. Flexible use of individual innovation, thinking, and skills can form team innovation ability and improve service innovation. Compared with intangible service innovation, tangible service innovation
is more explicit and easier to code. In the process of transformation to team innova-
tion ability, the problems encountered are relatively small. According to Tobias et al.
[5], we should examine the factors conducive to improving the service innovation ability
from six aspects: perceived demand and technology, cultivating service thinking, flexibly
mobilizing service, cooperation, expanding service scope, and learning ability.
Service innovation is mainly divided into three stages: concept, development, and
introduction. The first stage of service value realization is the concept stage. The main
task is to build a conceptual framework and initially complete all aspects of this process,
such as conceptual creativity, preliminary development, analysis, and inspection, etc. The
second stage is the development stage, which mainly includes feasibility demonstration,
operation link design, training, etc. The third stage is the introduction stage, which is
the evaluation and estimation of service quality, the use of customers, and suggestions
for improvement. This is also the final stage. Across the three stages, service innovation involves integrating and coordinating innovation ability in six aspects: resource input, production, management, marketing, staff, and customers. At the same time, based on the characteristics of the development and evolution process of the service industry, new paths and strategies are formulated along four dimensions, namely service-related page interaction, informatization, lean, and inventory.

1.2 Omni-Channel Operation


In theory, many scholars do not have a unified concept, and everyone has their views,
but most agree with the concept of Omnichannel retail created by Darrell (2011) [6].
Based on the spatial classification, Zhang (2016) [7] constructed the evolution path map
of retail channels and believed that Omnichannel is formed from the combination of
multiple single channels and then from the cross evolution of single and multi-channel.

Operation Strategy. In recent years, with the development of the Internet, online pay-
ment and shopping play an increasingly important role in customers' consumption and greatly impact offline stores. By exploring this impact, this paper analyzes the mechanism of customers' online and offline choices, to improve online and offline sales strategies and consider how changing the means of payment or delivery can attract customers and improve customer satisfaction.

Price. Pricing is the core problem of the multi-channel supply chain, the main factor affecting customer consumption demand, and a research focus of scholars.

Channel Supply Chain Service Level. The service level of the channel supply chain
is usually measured by sales staff performance, customer service, and innovation. In
measuring the service level, the demand for the same product with the same price and
different channels is analyzed, and the relationship between service decision-making and
price is studied. It can be found that among many factors, the degree of effort of sales staff
is the most important. But in reality, it is difficult to evaluate the performance of sales
staff, and it is also difficult to monitor the sales efforts of retailers. Giri (2019) found that retailers and manufacturers compete for market share by advertising in specific regions, so advertising investment is an alternative factor [8]. Comparing retailers advertising in the same area shows that when the price across multiple channels is the same, the sales volume is lower and the unit sales price is higher. Compared with no advertising, advertising can alleviate the conflicts of interest between the channels in the supply chain and improve the profits of enterprises.

In summary, driven by e-shopping, enterprises will put more energy into improving
service level and innovation to obtain higher profits and occupy more market share. In
the past, in the traditional single-channel environment, offline interaction could provide
a good shopping experience and other value-added services to improve the attractive-
ness of customers. However, in today’s dual-channel model, price is no longer the only
competitive factor, and service level is also included in consideration of retailers. By
analyzing the equilibrium pricing strategies in different markets, it is found that service
is not considered as the influencing factor of decision-making. And the way of factory
direct sales can promote enterprises to improve service and carry out service innovation. Further analysis based on customer utility theory shows that a higher retailer service level, compared with that of manufacturers, can increase the profits of manufacturers, of all links in the channel, and of the whole supply chain; it is an effective means of alleviating contradictions among enterprises in the channel, and channel competition is very beneficial for improving their service level. However, in the current research, few scholars have addressed the service innovation and service level of the Omni-channel supply chain.

2 Service Innovation Decision Framework of Omni-Channel Supply Chain Driven by Big Data
Although the use-value of big data is huge in today's era, the data still cannot be used and analyzed in a timely and effective way. As a result, many processing technologies cannot play their role and present only scattered, meaningless information, with little valuable information for managers to refer to. Therefore, according to the characteristics of the Omni-channel supply chain and the requirements of customer demand for service level and service innovation, it is very necessary to construct a data analysis flow chart that can connect multiple data streams and analyze specific problems, which helps to mine truly valuable information in huge data streams.
By mining the existing data to improve the competitiveness of service providers, this
paper constructs a scientific and feasible analysis framework, which has the following
characteristics: firstly, it should be predictive and can meet the future expansion needs
of existing products and services to a certain extent; Secondly, we should be able to
combine the big data information and turn the demand into the information that can be
used in each sub-process; Finally, through the process and supply chain operation, the
service level can be continuously evaluated and improved. Next, the factors that affect Omni-channel supply chain service innovation are identified from the big data; analyzing these factors can improve the accuracy of the analysis, increase the possibility of success in service innovation, and then yield higher profits. A Bayesian network is a method that can
effectively judge consumer preferences by explaining the interaction between various
variables [9]. This paper uses this method to link various data streams and establish a
decision analysis framework to predict market service demand. The analysis framework
is shown in Fig. 1.
To facilitate the understanding and operation of enterprises, this paper uses the reasoning network analysis technique of David et al. (2016) [10] to construct a similar framework. The framework divides the decision-making process into four stages: the first stage
is to obtain the required big data. The second stage is based on big data information,
using certain information technology to predict product demand. The third stage is to
refine the requirements of the second stage to each process so that each channel can
carry out targeted optimization. The fourth stage is to cycle the products and services of
the Omni-channel supply chain, constantly find problems and carry out innovation and
improvement.

2.1 The Demand for Forecast Service


The main forecasting methods are linear regression, time series, etc., but these meth-
ods are based on the previous data. The current service demand is affected by many

Fig. 1. Decision framework of supply chain service innovation based on big data

factors, not limited to the past trend, such as marketing, after-sales, and policy, which are the missing dimensions of mainstream forecasting methods. However, we can make improvements by mining information from big data to improve the accuracy of market demand forecasting, improve retailers' competitiveness, help formulate marketing strategies that adapt to market development and changes, improve after-sales service and the shopping experience, and so on, to make customers satisfied.
This paper combines data mining and a Bayesian network: data mining is first used to obtain useful data, and the Bayesian algorithm then extracts refined data. The specific process is as follows:

Firstly, collect data, that is, obtain data related to service demand, including customer
purchase records, early consumer search information on related products, and post-use
evaluation of products, which can reflect consumer preferences. That is to say, in the
observed supply chain, the appropriate samples are selected according to the previous
experience and experts’ suggestions, and the influencing factors are identified to become
Bayesian network nodes, and each node is assigned a value.
Secondly, the sample data are preprocessed to facilitate the next step of data analysis. The original data should be processed by clustering analysis, hierarchical analysis, and other statistical methods to discretize their values, so that the algorithm can be modeled easily.
Thirdly, we design and construct a Bayesian network, namely a belief network: a directed acyclic graph composed of nodes and directed edges. Each node represents a random variable, and the relationship between variables is represented by the connection direction from parent node to child node. Conditional probability is used to express the connection strength, and the prior probability of nodes without parents is used to express the relationship between information. The uncertainty of a variable X is determined by P(X | π(X)), where π(X) represents the parent nodes of X. Therefore, under the relevant assumptions of the Bayesian network, the joint probability distribution can be expressed as follows:

P(X_1, X_2, …, X_n) = ∏_{i=1}^{n} P(X_i | π(X_i))     (1)
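As a minimal numeric sketch of the factorization in Eq. (1), consider a three-node chain A → B → C; all probability values below are invented for illustration.

```python
# Joint probability of a three-node Bayesian network A -> B -> C,
# computed per Eq. (1); all probability values are invented.
P_A = {True: 0.3, False: 0.7}
P_B_given_A = {(True, True): 0.8, (False, True): 0.2,    # (b, a): P(b|a)
               (True, False): 0.2, (False, False): 0.8}
P_C_given_B = {(True, True): 0.6, (False, True): 0.4,    # (c, b): P(c|b)
               (True, False): 0.3, (False, False): 0.7}

def joint(a, b, c):
    """P(A=a, B=b, C=c) = P(a) * P(b|a) * P(c|b)."""
    return P_A[a] * P_B_given_A[(b, a)] * P_C_given_B[(c, b)]

print(joint(True, True, False))   # 0.3 * 0.8 * 0.4 = 0.096
```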

A complete Bayesian network includes two parts: the network structure and the conditional probabilities [11]. Common mainstream algorithms are shown in Table 1.

Table 1. Algorithm of Bayesian network reasoning

Structure | Data integrity | Method
Known | Whole | Maximum likelihood
Known | Local | Maximum expectation (EM)
Unknown | Whole | Model space search
Unknown | Local | Scope contraction

According to the bi-directional reasoning characteristics of the Bayesian network, we can calculate the probability for each assignment and predict the demand
of products and services of manufacturers and retailers in the supply chain. Finally, sen-
sitivity analysis is carried out. This is used to determine the factors that affect consumer
preferences and is the cornerstone of measuring the relationship between variables and
decision-making. We can reduce the randomness of another variable by knowing the
information of one variable. The interaction information between the two variables is
expressed as follows:
T(A, B) = Σ P(A, B) log [ P(A, B) / (P(A) P(B)) ]     (2)
P(A, B) represents the joint probability distribution, P(A) represents the probability
distribution function of A, P(B) represents the probability distribution function of B,
and T(A, B) represents the relationship between the two variables. The larger the value
is, the closer the relationship is. Therefore, the degree of influence of variables and the
degree of attention to variables can also be judged by the value of T(A, B). In addition,
each supply chain link in the Bayesian network can detect the change of market demand
in time and reflect and improve the service innovation.
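For illustration, a minimal Python sketch of Eq. (2) is given below, estimating T(A, B) from discrete observed samples; the variable names and sample values are invented.

```python
from collections import Counter
from math import log

def mutual_information(pairs):
    """T(A, B) of Eq. (2), estimated from observed (a, b) samples."""
    n = len(pairs)
    p_ab = Counter(pairs)
    p_a = Counter(a for a, _ in pairs)
    p_b = Counter(b for _, b in pairs)
    t = 0.0
    for (a, b), count in p_ab.items():
        pab = count / n
        t += pab * log(pab / ((p_a[a] / n) * (p_b[b] / n)))
    return t

# Invented samples: (price sensitivity, channel choice) per customer.
samples = [("high", "online"), ("high", "online"), ("low", "offline"),
           ("low", "offline"), ("high", "offline"), ("low", "online")]
print(mutual_information(samples))   # larger value = closer relationship
```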

2.2 Demand Segmentation


Next, the requirement is subdivided with the help of a reasoning model. Among them,
we need to integrate the resources and capabilities of various companies, integrate the
needs of different companies, and refine them into each link, so that the whole process
of the product in the supply chain can be adjusted according to the needs and the optimal
adjustment can also be made by visualization.

2.3 The Building Decision Process of Product/Service Innovation


Collect Management Data. With the progress of the times, all kinds of informa-
tion sensing devices are constantly updated and iterated, such as cameras, GPS, two-
dimensional codes, sound waves, etc. These devices are distributed in the surrounding
environment, their distribution range is becoming wider and wider, and the scale of the data collected is becoming larger and larger. The cost of obtaining this information is very high, yet most of its internal value is ignored. Therefore, managers need to understand the Omni-channel supply chain's information to reduce the cost of obtaining information and improve work efficiency, and, in the primary stage of data management, to establish the storage requirements and scientific computing conditions for the many data sets.

Clean Up and Integrate Data. In the process of general data analysis, the scale of data
to be processed is often too large, and the data source is very complex. There are serious
inconsistencies in database information, data loss, and noise instability. Therefore, before
analyzing the data, we must preprocess the data collected in advance. We can choose
data cleaning, integration, conversion, restoration, and other technologies to solve data
instability or inconsistency. Using data mining-related technologies, we can find useful
information for decision-makers, such as helping enterprises find single or composite
skills in product and service requirements. We can also use visualization techniques to
get more information we want, such as integrating resources and capabilities of other
companies and the customer needs to be faced by enterprises. These pieces of information
are very important for the development of the analysis model in the next stage.

Data Analysis. The data preprocessed in the second stage are more suitable for analysis. In this paper, the network optimization reasoning model, combined with the capability sets of other enterprises, shows product and service innovation and expansion. For example, "E" represents the problem to be solved, "TR" represents the capability set needed to solve the problem, "SK" represents the capability set from which a solution is obtained, and "I" represents the mediation skills. The mediation skills "I" can then help managers obtain the real needs from the capability set. This method is based on the inference network graph over SK and uses integer programming to find the optimal solution from the beginning node to the end node.

Moreover, E, TR, SK, I, and related skills and data can be obtained from the data
processing in the second stage. The obtained information can be input into the reasoning
diagram to carry out logical reasoning processes. A model to solve a specific problem can
be constructed under the background of big data and Omni-channel supply chain service
innovation behavior evolution theory. Then, the expansion process of the reasoning
network is visualized by MATLAB and other programming software to help decision-
makers get the optimal solution [12].
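As a sketch of this search step, the snippet below assumes the capability network is a directed graph with non-negative edge costs and that "optimal" means the minimum-cost path from the problem node E to a solution node; the graph and its costs are invented, and the paper itself formulates the search as integer programming rather than as the shortest-path routine used here.

```python
import heapq

def best_path(graph, start, goal):
    """Minimum-cost path through a capability network (Dijkstra)."""
    queue, seen = [(0.0, start, [start])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph.get(node, []):
            heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []

# Invented network: problem E -> TR/I capability nodes -> SK -> solution.
graph = {"E": [("TR1", 2), ("TR2", 3)],
         "TR1": [("SK1", 4), ("I", 1)],
         "TR2": [("SK1", 2)],
         "I": [("SK1", 1)],
         "SK1": [("solution", 0)]}
print(best_path(graph, "E", "solution"))   # cheapest route: E, TR1, I, SK1
```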

Interpret Data and Make Decisions. First, collect data and preprocess it, then use a
reasoning network for data analysis, build a set of capabilities that can be continuously
improved, and obtain the best solution that Omni-channel managers want by seeking
the optimal solution. Then the best decision is obtained through analysis. The network
relationship obtained in this process is sorted and applied to the whole Omni-channel
supply chain service process. In a word, the above process can fully refine the demand of
classified products and services, build a specific analysis process, improve and coordi-
nate the Omni-channel supply chain, and help manufacturers and retailers continuously
improve and innovate in products and services.

3 Conclusion and Discussion


This paper proposed a specific analysis process that can fully refine the demand for classified products and services, improve and coordinate the Omni-channel supply chain, and help manufacturers and retailers continuously improve and innovate in products and services.
In the future, we will explore how to evaluate customer purchasing behavior from the perspective of the Omni-channel supply chain. Nowadays, the popular online channel mainly presents products to customers through text and picture descriptions, whereas the offline channel mainly provides customers with a better shopping experience through contact with physical objects and the service of sales staff. Compared with the offline channel, the credibility of the online channel is far lower, the customer experience is poorer, and the matching degree of products is lower. Under the background of big data, continuous research is needed on how to combine the characteristics of the times with the development trend of e-commerce, how to use the valuable information hidden in the data, how to help enterprises formulate the most suitable marketing plan for target customers when selling products and services online, and how to find the most suitable sales strategy and improve products and services so that customers get a better experience and enterprises obtain more profit.
How to evaluate the innovation behavior of the Omni-channel supply chain? When
other aspects of products are the same, channel service level plays a decisive role in
consumer behavior. Compared with the lack of flexible low service innovation, high
service innovation will continue to optimize and improve according to customer needs,
showing the advantage of better adapting to changes in customer needs and external
resources and policy environment. Therefore, according to the degree of innovation,
innovation behavior can be divided into three stages: differentiation, customization, and
bundling.
How to measure innovation capability and build an optimal decision mechanism?
In the Omni-channel supply chain operation mode, the data of every channel and link can be analyzed. Data sharing among different channels, enterprises, and links is an inevitable trend. Finding valuable information in the data, finding its internal relationships, exploring
the potential needs of customers, and carrying out targeted innovation is very important.
Customer demand is the core information, which is the source and trigger of service
innovation in the Omni-channel supply chain.

References
1. Feng, Z., Guo, X., Zeng, D.: Some frontier topics of business management research under
the background of big data. J. Bus. Rev. 16(01), 7–12 (2013)
2. Robert, A., Stephen, M.: The services science and innovation series. Eur. Manag. J. 66(02),
57–67 (2008)
3. Chen, G.: Research on business model innovation mechanism of supply chain enterprises. J.
Sci. Res. Manage. 26(12), 12–16 (2018)
4. Wang, J.: Research on channel optimization in supply chain of retail enterprises. J. Bus. Econ.
Res. 20(4), 26–32 (2018)

5. Tobias, S., Cheri, S.: Data science, predictive analytics, and big data in supply chain
management: current state and future potential. J. Bus. Logist. 33(01), 44–53 (2015)
6. Darrell, R.: The future of shopping. Harv. Bus. Rev. 89(12), 64–75 (2011)
7. Zhang, Z., Xue, C.: Research on supply chain optimization of retail enterprises under channel
mode. J. Ind. Econ. 21(4), 32–36 (2016)
8. Giri, B., Bardhan, S.: Sub-supply chain coordination in a three-layer chain under demand
uncertainty and random yield in production. Int. J. Prod. Econ. 56(09), 16–33 (2019)
9. Yang, B., Park, B., Chang, B.: Optimal pricing and advertising decisions in a dynamic multi-
channel supply chain. Int. J. Control Autom. 11(10), 119–134 (2018)
10. David, S., Nikolaos, T.: Operations management in multi-channel retailing. Oper. Res. 36(05),
66–78 (2016)
11. Pattnayak, P., Patra, S., Pradhan, J.: Optimizing the network flow in cloud supply chain
management. Int. J. u - and e-Serv. Sci. Technol. 10(5), 43–54 (2017)
12. Niranjan, T., Parthiban, P., Sundaram, K.: Designing an omnichannel closed loop green supply
chain network adapting preferences of rational customers. J. Oper. Res. Soc. 38(11), 89–99
(2019)
Determination of Headways Distribution
Between Vehicles on City Streets

Andrii Bieliatynskyi1(B) , Oleksandr Stepanchuk2 , and Oleksandr Pylypenko2


1 North Minzu University, Yinchuan, Ningxia, China
beljatynskij@ukr.net
2 National Aviation University, Kiev 03058, Ukraine

Abstract. This paper presents a study aimed at determining the optimal saturation of a city's street network elements by vehicles. The work is based on experimental and theoretical studies of the patterns of headways between vehicles on city arterial streets. The main approaches to determining the traffic capacity of roads and the carriageways of city streets are considered and analyzed. The lane capacity of city arterial streets with uninterrupted traffic is determined, taking into account the regularities of traffic flows. Regularities of vehicle distribution across the lanes of arterial streets are established, and the parameters of the headway distribution between vehicles moving in a dense traffic flow are found. Based on the obtained data, it has been found that the minimum headway differs for each lane; this is influenced by the speed and composition of the traffic flow and by the patterns of vehicle distribution between the lanes. The actual minimum headway between vehicles moving on city arterial streets is found.

Keywords: Arterial street · Traffic flow · Traffic intensity · Traffic capacity ·


Vehicle

1 Introduction
A modern city concentrates residential buildings, industrial enterprises, administrative, cultural, and medical institutions, etc., and many people in a relatively small area. A comfortable and efficient level of life in such a complicated urban system rests on the efficient interaction of all its components through the proper organization of the transport system, the main element uniting the other components into a single whole structure.
Nowadays, in multi-purpose cities with a well-developed division of social labor, the place of residence is less and less tied to the workplace. The need for trips is growing as mobility requirements increase, both in the number of trips (traffic volume) and in trip distance.
Urban traffic is a result of the need to interact between elements of the city’s planning
structure. Such interaction is the most effective when the modes of transport providing
traffic flows are most in line with the size of the city. Urban traffic in today’s major and
largest cities has grown into an important transport and urban planning problem. The

The correct solution of this problem determines people's living conditions and the
further territorial and economic development of the cities themselves.
Daily trips of thousands of people create high-density traffic and pedestrian flows
on the city street network, which leads to a significant increase in trip time
and forms the so-called "transport fatigue" caused by uncomfortable travel conditions.

2 Materials and Methods

Analyzing the traffic process on the carriageway lanes of city arterial streets, we
see that an uneven distribution of spacings between vehicles in the lanes is observed
in interrupted traffic flow. The distribution of spacings is significantly affected
by public transport stops, the distance between signalized intersections, the number of
lanes, intersections where changing the traffic direction is allowed, etc.
Accordingly, this determines how efficiently the carriageway is used. It should be noted
that, under these conditions, the main parameters of the carriageway and the factors that
affect the increase of its capacity are almost completely the same for all traffic lanes
on a given carriageway section: the geometric dimensions of each lane
(carriageway width, transverse and longitudinal slope, the radii of the vertical and
horizontal curves, etc.) are identical, as are the parameters and properties of the
pavement and the road structure itself, and the environmental conditions.
Based on the principles of headway distribution between vehicles moving on the
road network, it is possible to determine the range of application of certain theoretical
models for describing the patterns of traffic flow formation on the city's arterial streets
depending on the number of lanes. Surveying traffic patterns on multi-lane streets
and roads makes it possible to improve the theoretical basis of their design and to
develop specific measures for enhancing traffic conditions on them.
Many works by national and foreign scientists [1–6] are devoted to determining the
distribution of time headways between vehicles moving in a dense flow. However, these
works have paid no attention to the time headways between vehicles on the 4–8-lane
arterial streets of Ukrainian cities.
The purpose of the work is to improve the approaches and principles of determining
the lane capacity of arterial streets and roads with uninterrupted traffic, taking
into account their traffic flow patterns.

3 Results

Determining the capacity of urban streets and roads is the most important task among
all the urban transport problems of a city. It is well known that the city street network
capacity (SNC) is the main indicator characterizing the operating efficiency of the city
transport system.
The contradictions that arise in organizing traffic flows on city streets create
corresponding difficulties in their operation. Vehicle-to-vehicle interaction
and the need to ensure urban traffic safety require the introduction of restrictive and
prohibitive measures, which, in turn, affect the speed of traffic flows and the street
network capacity.
Each section of a city street has its own planning features that allow vehicles to move
at a speed determined by traffic rules. It is known that the lower the speed, the greater the
time spent, and it is the increase in travel time that almost always bothers citizens.
Speed is one of the most important traffic indicators because it characterizes the target
function of traffic. Any reduction of vehicle speed compared to the permitted one, and even
more so traffic delays, leads to economic losses. Therefore, efficient urban transport
operation requires organizational measures directed at minimizing traffic delays.
When researching the principles of traffic delays and the places where traffic congestion
forms [7–9], the main attention should be paid to improving traffic conditions
on city arterial streets, reducing delays, and thus increasing traffic speed. One of
the main tasks in ensuring continuous, comfortable, and safe vehicle traffic on the city
street network is to minimize vehicle delays under heavy traffic flows.
It is known that the theoretical maximum traffic capacity of a city road lane, taking
into account the maximum dynamic length of a platoon moving at a speed of 40 km/h,
is 1450 vehicles/hour. But today, this theoretical value does not correspond to the actual
number. As a result of the observations performed, we have found that the maximum
traffic intensity in one lane of the arterial streets of Kyiv is 1869 units/hour, which
is 29% higher than the theoretical indicator. This is because the theoretical calculation
takes into account all influencing factors with inflated safety margins intended to
ensure the safety of vehicles and pedestrians. According to works [10–12], the maximum
traffic capacity of a city road lane at optimal speed values, under uninterrupted traffic
conditions and at maximum traffic density, is about 2500 vehicles/hour.
Such a value can be achieved at a speed of 100 km/h, as is the case on highways in the
United States, England, and other countries [13]. It should be noted that the maximum
traffic capacity is almost impossible to achieve in real conditions on urban arterial streets.
When organizing city street traffic, it is clearly necessary to consider the
current road regulations, which define the maximum permitted speed on city
streets and roads. Therefore, real conditions impose certain restrictions on achieving the
maximum result.
It should be noted, however, that taking into account the real situation in each lane of
a street or road, together with an individual approach to traffic management, will provide
better operation of that particular carriageway, which, in general, will help to improve
traffic conditions within the whole city street network and increase its traffic capacity.
In our opinion, the proposal to use the indicator of specific traffic intensity is
very interesting today. In [13], it is noted that, according to the theory of saturated
traffic flows, it is advisable to determine the traffic capacity not in the cross-section
of the lane but in space, so it is necessary to determine the specific intensity
U (vehicles/h·km), which is distributed over the lane space and is the ratio of the
intensity N to the length of the lane L:

U = N/L   (1)

The specific traffic intensity makes it possible to compare city highways and roads
depending on the distribution of traffic flow intensity on them and thus to characterize
their efficiency in time and space.
Determination of Headways Distribution Between Vehicles 829

This indicator can enable the rational distribution [12, 14, 15] of traffic flows on the
city street network, thus ensuring its efficient operation.
Specific traffic intensity determines the headway with which vehicles pass the lane's
cross-section. At a headway of 1.44 s the intensity is 2500 vehicles/h [9], and at
a headway of 1 s it is 3600 vehicles/h. Thus, at the optimal density of 50
vehicles per kilometer and a headway of 2 s, the traffic capacity reaches 1800 vehicles/hour.
Hence, the minimum headway between vehicles moving in the lane is a decisive factor
in determining its maximum traffic capacity.
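These figures follow directly from the 3600 s available in an hour. The minimal sketch below (our illustration, not taken from [9] or [13]) reproduces them together with Eq. (1):

```python
# Lane capacity implied by a mean headway, and specific intensity per Eq. (1).

def capacity_from_headway(headway_s: float) -> float:
    """Vehicles per hour passing a cross-section at a given mean headway (s)."""
    return 3600.0 / headway_s

def specific_intensity(intensity_veh_h: float, lane_length_km: float) -> float:
    """Specific intensity U (veh/h*km): intensity N divided by lane length L."""
    return intensity_veh_h / lane_length_km

for h in (1.44, 1.0, 2.0):
    print(f"headway {h:.2f} s -> {capacity_from_headway(h):.0f} veh/h")
# headway 1.44 s -> 2500 veh/h; 1.00 s -> 3600 veh/h; 2.00 s -> 1800 veh/h
```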
For our further study, which determines the headways between vehicles moving in
different lanes on the city street network, we took sections of arterial streets of Kyiv
with uninterrupted traffic. The sections were chosen according to the following
requirements: absence of road intersections of any kind; absence of any road elements
causing lane changes or changes in the direction of traffic, or having a direct influence
on the traffic flow.
At this stage of the study, the main task was to determine the headways between
vehicles moving in the traffic flow. The fourth lane of the city arterial street was
chosen for the field observations because it is the most saturated lane on four-lane
highways. The data were recorded under heavy traffic, where the situation
was close to traffic congestion. Vehicles were moving in a dense uninterrupted flow. This
condition is characterized by the maximum density and an almost uniform distribution of
vehicles over the available lanes. The average traffic speed was 18.7 km/h.
The method of mathematical statistics was used to process the obtained data. The
observation data used to calculate the mean value and standard deviation are
represented in Table 1.

Table 1. Calculation of headways between vehicles on the arterial street for the fourth lane.

| Sequence number | Headway t_i, s | Average headway, s | Frequency n_i | n_i·t_i | t_i − t_cp | (t_i − t_cp)²·n_i |
| 1 | 0.8–1.2 | 1.0 | 3 | 3 | −2.01 | 12.13 |
| 2 | 1.2–1.6 | 1.4 | 8 | 11.2 | −1.61 | 20.75 |
| 3 | 1.6–2.0 | 1.8 | 11 | 19.8 | −1.21 | 16.12 |
| 4 | 2.0–2.4 | 2.2 | 14 | 30.8 | −0.81 | 9.20 |
| 5 | 2.4–2.8 | 2.6 | 16 | 41.6 | −0.41 | 2.70 |
| 6 | 2.8–3.2 | 3.0 | 20 | 60 | −0.01 | 0.00 |
| 7 | 3.2–3.6 | 3.4 | 11 | 37.4 | 0.39 | 1.67 |
| 8 | 3.6–4.0 | 3.8 | 9 | 34.2 | 0.79 | 5.61 |
| 9 | 4.0–4.4 | 4.2 | 8 | 33.6 | 1.19 | 11.32 |
| 10 | 4.4–4.8 | 4.6 | 6 | 27.6 | 1.59 | 15.16 |
| 11 | 4.8–5.2 | 5.0 | 5 | 25 | 1.99 | 19.79 |
| 12 | 5.2–5.6 | 5.4 | 3 | 16.2 | 2.39 | 17.13 |
| 13 | 5.6–6.0 | 5.8 | 1 | 5.8 | 2.79 | 7.78 |
| Total | | | 115 | 346.2 | | 139.35 |

To determine the average headway between vehicles in a dense traffic flow, we take
the accuracy Δ = 1 s.
Based on the calculated experimental indicators, it can be stated that the average
headway between vehicles under dense flow conditions ranges from 2.81 to 3.21 s.
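As a check of these figures, the grouped-data calculation behind Table 1 can be reproduced as follows. The class midpoints and frequencies are copied from the table; matching the 2.81–3.21 s range with a normal-approximation interval for the mean is our assumption about how that range was obtained:

```python
import numpy as np

t = 1.0 + 0.4 * np.arange(13)          # class midpoints 1.0, 1.4, ..., 5.8 s
n = np.array([3, 8, 11, 14, 16, 20, 11, 9, 8, 6, 5, 3, 1])

N = n.sum()                             # 115 observations
mean = (n * t).sum() / N                # ~3.01 s  (346.2 / 115)
var = (n * (t - mean) ** 2).sum() / N   # ~1.21 s^2 (139.35 / 115)
std = np.sqrt(var)                      # ~1.10 s

# 95% normal-approximation interval for the mean headway
half = 1.96 * std / np.sqrt(N)          # ~0.20 s -> roughly the 2.81...3.21 s range
print(round(mean, 2), round(std, 2), round(mean - half, 2), round(mean + half, 2))
```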
According to the obtained average headway, we determine the traffic intensity,
which is approximately 1200 vehicles/hour, or within the range (1120–
1280 vehicles/hour). It should be noted that this is the value of the design traffic intensity
of the lane (in the given units) that is used for designing arterial streets of the city with
uninterrupted traffic (ASCUT) in accordance with current regulations. Then, taking into
account the coefficients of traffic lane use (in the absence of public
transport stops on the road section or if they are arranged outside the carriageway), that is,
1.9 for two lanes, 2.7 for three lanes, and 3.5 for four lanes, the traffic capacity of city
arterial streets can be determined as 4200 vehicles/hour.
When designing arterial streets, we first consider their traffic capacity at the intensity
characteristic of a road condition close to the congestion state, where vehicle speed is
minimal. The latter, according to the theory of traffic flows, does not
provide maximum traffic intensity, which is exactly the parameter that characterizes the
effective operation of an arterial street network section.
To ensure the efficient operation of an arterial street, it is necessary to establish the
optimal headway between vehicles, which will enable the maximum number of vehicles
to pass through in a certain time while ensuring safe and convenient traffic conditions.
Hence, the task is to determine the maximum number of vehicles each lane
can pass in a partially uninterrupted flow where traffic is regulated by vehicle-vehicle
interactions and interactions between vehicles and the roadway.
The main task of the study was to establish the average and minimum possible
headways between vehicles moving in the lane and to establish the average traffic speed on
this section of the arterial street. Therefore, the relevant observations were performed on
four lanes of the arterial street. These observations involved determining the headway
between vehicles moving one after another. The observation was performed for 20 min
for each lane. The procedure of the experiment was to record the number of vehicles
that passed through the cross-section of the lane during each 10 s period. The obtained
data were grouped and listed in Table 2. The shape of the distribution curve depends on
the intensity and varies with the lane number (Fig. 1).
Fig. 1. Distribution of headways in time on four lanes of the city arterial streets with uninterrupted
traffic

As a result of monitoring vehicle traffic on the four-lane carriageway of the
arterial streets of Kyiv, it was found that the following headways between vehicles
dominated: for the first lane – 4.2 s (33% of vehicles), for the second lane – 2.3 s (29%),
for the third – 1.8 s (29%), and for the fourth lane – 1.5 s (27%). As can be seen, about a
third of all vehicles in the flow drove through each lane at such headways. Therefore,
it can be argued that the headway values mentioned above can be taken as the optimal
headway for each lane. The obtained headways for the fourth, third, and second lanes
are very close to the data obtained in the dissertation [9, 13], which gives the following
headways between cars on the road: 1.53 s for the fourth lane, 1.8 s for the third, and
2.17 s for the second. But for the first lane, the headway in [9] is 2.96 s, which differs
significantly (by 1.4 times) from the value obtained above for the city's arterial streets.
This happens because of the significant amount of public passenger transport that moves
in the first lane and significantly affects the choice of this lane by passenger cars.
Based on the obtained data, it has been found that the minimum headway for each
lane is different. This is influenced by the speed and composition of the traffic flow and
the vehicle distribution patterns across the lanes. Figure 2 shows the minimum, optimal,
and average headway between vehicles for each lane. The actual minimum headway between
vehicles moving on the corresponding arterial street was equal to 1.0–1.25 s at a speed of
92 km/h, which exceeded the permissible speed on the arterial street. The distance
between the vehicles was 32 m.

Table 2. The distribution of headways on the arterial streets of the city for uninterrupted traffic.
(Each lane cell gives: number of 10 s periods / number of vehicles / % of vehicles.)

| Vehicles per 10 s period | Headway between vehicles, s | Lane 1 | Lane 2 | Lane 3 | Lane 4 |
| 0 | >10.0 | 32 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 |
| 1 | 5.0–10.0 | 42 / 42 / 26 | 26 / 26 / 8 | 6 / 6 / 1 | 8 / 8 / 1 |
| 2 | 3.33–5.0 | 26 / 52 / 33 | 30 / 60 / 18 | 20 / 40 / 9 | 10 / 20 / 4 |
| 3 | 2.5–3.33 | 16 / 48 / 30 | 26 / 78 / 23 | 34 / 82 / 18 | 16 / 48 / 9 |
| 4 | 2.0–2.5 | 2 / 8 / 5 | 24 / 96 / 29 | 18 / 92 / 21 | 28 / 100 / 19 |
| 5 | 1.67–2.0 | 2 / 10 / 6 | 12 / 60 / 18 | 26 / 130 / 29 | 20 / 112 / 21 |
| 6 | 1.43–1.67 | 0 / 0 / 0 | 2 / 12 / 4 | 16 / 96 / 22 | 24 / 144 / 27 |
| 7 | 1.25–1.43 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 10 / 70 / 13 |
| 8 | 1–1.25 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 4 / 32 / 6 |
| Total | | 120 / 160 / 100 | 120 / 332 / 100 | 120 / 446 / 100 | 120 / 534 / 100 |

The difference between the average, actual, and minimum headway for each lane of
the carriageway is: 5 s for the first lane, 1.9 s for the second, 1.2 s for the third, and
1 s for the fourth. As far as the optimal headway is concerned, we see that the first
lane is underloaded 1.8 times, the second 1.6 times, and the third and fourth 1.5 times.

Fig. 2. Minimum and average headway between vehicles on each separate traffic lane

The difference between the average and optimal headways is explained by the fact
that when the distance between vehicles moving in the traffic flow is short and the speed
is over 60 km/h, it is very difficult for a driver to change lanes. In [9] it is noted
that the headway between vehicles that enables a lane change in the traffic flow
should be 2.5 s, and at headways of 1.6 s such a maneuver becomes dangerous. As can be
seen from Table 2, the percentages of vehicles moving in the corresponding lane with
headways of less than 2.5 s are: 11% for the first lane, 51% for the second, and 72% and
86% for the third and fourth lanes, respectively. Therefore, at this intensity, changing
from the current lane to the third and fourth lanes is almost impossible, which, in turn,
does not allow the optimal headway to be achieved for each lane.

4 Discussion
The traffic capacity of the lanes of city arterial streets carrying uninterrupted traffic,
including public passenger transport, is: 850 vehicles/hour for the first lane,
1550 vehicles/hour for the second lane, 2000 vehicles/hour for the third lane, and
2400 vehicles/hour for the fourth lane; the total intensity is 6800 vehicles/hour.
According to the data presented in [9], the traffic capacity of a four-lane street is
7235 vehicles/hour. The following data were obtained for two-lane streets: 850 vehicles/h
for the first lane and 2000 vehicles/h for the second lane. For three-lane streets, the
corresponding values are 850 vehicles/h for the first lane, 1600 vehicles/h for the second
lane, and 2000 vehicles/h for the third lane.
Based on the observations performed, the proportions of vehicle distribution across the
road lanes, depending on the number of lanes, and the headways between vehicles in each
lane were determined (Table 3).

Table 3. The proportions of vehicle distribution on the road lanes depending on the number of
lanes and the headways between vehicles.
(Each lane cell gives: distribution, % / headway, s.)

| Lanes in one direction | First lane | Second lane | Third lane | Fourth lane |
| Two | 37.1 / 5.26 | 62.9 / 2.44 | – | – |
| Three | 20.6 / 5.08 | 43.4 / 2.86 | 38.3 / 2.46 | – |
| Four | 7.7 / 7.5 | 19.5 / 3.6 | 33.2 / 2.7 | 39.6 / 2.3 |

It should be noted that vehicle speed has a significant effect on the headway
between vehicles moving in separate lanes of the roadway. The current task of the study
was to determine the traffic speed in the lanes and to determine the compliance of
the actual speed distribution with the normal distribution law. The average
actual speeds, obtained as a result of experimental studies on the arterial
streets of Kyiv, are shown in Fig. 3.
According to the survey performed, the following characteristic speeds were obtained:
43.2 km/h for the first lane, 54.2 km/h for the second, 65.1 km/h for the third, and
78.9 km/h for the fourth. The obtained data indicate that vehicles in the third and
fourth lanes permanently exceed the speed limit approved by traffic rules.
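The compliance check against the normal distribution law can be illustrated with a standard goodness-of-fit test. The sketch below uses synthetic speeds rather than the paper's observations, and a Kolmogorov–Smirnov test with estimated parameters as an approximation (strictly, estimating the parameters from the same sample calls for the Lilliefors correction):

```python
import numpy as np
from scipy import stats

# Synthetic per-lane spot speeds, km/h (illustrative only).
rng = np.random.default_rng(0)
speeds = rng.normal(loc=54.2, scale=6.0, size=200)   # e.g. second-lane speeds

# KS test against a normal law with sample mean and standard deviation.
stat, p = stats.kstest(speeds, "norm", args=(speeds.mean(), speeds.std(ddof=1)))
print(f"KS statistic {stat:.3f}, p-value {p:.3f}")   # large p: no evidence against normality
```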
One of the reasons for the speed decrease and its uneven distribution between the
lanes is the change of traffic lanes by vehicles, as well as the presence of freight and
public passenger transport vehicles on the third and even fourth lanes.
Fig. 3. Distribution of average speeds in the lanes of ASCUT in Kyiv

The results of data processing correspond to well-known principles, according to which
the speed in the first lane is always lower than in the others. For example, the average
speed in the first lane of a four-lane carriageway is lower than on a three-lane road:
43.2 and 45.2 km/h, respectively. This is significantly affected by the width of the first lane, the
safe gap, vehicles parked along the sidewalk, the composition of the traffic flow, and
in particular, the public passenger transport. It should be noted that the analysis of
the relationship between intensity and speed is very well described in [3], where the
corresponding regression equations are defined. Given that the lane intensities we
obtained as a result of observations almost coincide with the data obtained in [9], for
further calculations and generalization of the study results we will use the corresponding
regression equations proposed in [9].
Based on the observations discussed above, it can be argued that ensuring the effective
use of different sections of city arterial streets consists in organizing vehicle traffic on
them with the optimal headway, which is individual for each lane.

5 Conclusion
Comparing the obtained data, it can be noted that all lanes are, in fact, not fully saturated;
at the same time, it must be said that achieving maximum traffic capacity on the
arterial streets of Ukrainian cities is currently almost impossible. As already
mentioned, the traffic capacity is affected by many different external and internal factors
with a probabilistic effect. In our observations, we recorded the maximum traffic intensity
in the lanes of city arterial streets. The obtained results show that transport network
efficiency increases due to the rational distribution of vehicles across the lanes, and
that increasing traffic capacity by reducing the headway between vehicles is possible
only on specific sections of the arterial street network.

References
1. Semchenko, N.: The relationship of speed with the intensity and density of traffic flows on
multi-lane highways. Road Transport. 19, 38–40 (2006)
2. Krasnikov, A.: Regularities of distribution of time headways between vehicles on multi-lane
roads. Works of MADI 95, 74–83 (1975)
3. Petrov, E.: Modelling of high-intensity traffic flow. Omsk Sci. Bull. 21, 137–138 (2002)
4. Zaporozhtseva, E.: Traffic capacity of highways. Socio-economic problems of development
and functioning of transport systems of cities. 19, 225–229 (2010)
5. Jinhwan, J., Changsoo, P., Byunghwa, K., Namkuk, C.: Modeling of time headway distribution
on suburban arterial: case study from South Korea. Procedia Soc. Behav. Sci. 16, 240–247
(2011)
6. Rupali, R., Pritam, S.: Headway distribution models of two-lane roads under mixed traffic
conditions: a case study from India. Eur. Transp. Res. Rev. 10(1), 1–12 (2018)
7. Stepanchuk, O., Bieliatynskyi, A., Pylypenko, O., Stepanchuk, S.: Surveying of traffic
congestions on arterial roads of Kyiv city. Procedia Eng. 187, 14–21 (2017)
8. Nguyen-Phuoc, D.Q., Young, W., Currie, G., De Gruyter, C.: Traffic congestion relief
associated with public transport: state-of-the-art. Public Transp. 12(2), 455–481 (2020)
9. Zaporozhtseva, O.V.: Improving the Principles of Determining the Capacity of Multi-Lane
Highways. Candidate of Technical Sciences dissertation, 05.22.11. Kharkiv National
Automobile and Road University of the Ministry of Education and Science of Ukraine,
Kharkiv, 145 p. (2016)
10. Stepanchuk, O., Bieliatynskyi, A., Pylypenko, O., Stepanchuk, S.: Peculiarities of city street-
road network modelling. Procedia Eng. 134, 276–283 (2016)
11. Guk, V.I.: Elements of the Theory of Traffic Flows and Design of Streets and Roads:
Tutorial. UMK VO, Kyiv, 255 p. (1991)
12. Stepanchuk, O., Bieliatynskyi, A., Pylypenko, O.: Modelling the bottlenecks interconnection
on the city street network. In: Popovic, Z., Manakov, A., Breskich, V. (eds.) TransSiberia
2019. AISC, vol. 1116, pp. 889–898. Springer, Cham (2020). https://doi.org/10.1007/978-3-
030-37919-3_88
13. Guk, V.I., Shkodovsky, Y.M.: Transport Flows: The Theory of Their Application in Urban
Planning: Monograph. Golden Pages, Kharkiv, 232 p. (2009)
14. Stepanchuk, O., Bieliatynskyi, A., Pylypenko, O.: Regularities of City Passenger Traffic Based
on Existing Inter-district Links. In: Murgul, V., Pukhkal, V. (eds.) International Scientific Con-
ference Energy Management of Municipal Facilities and Sustainable Energy Technologies
EMMFT 2019. EMMFT 2019. Advances in Intelligent Systems and Computing, vol. 1258.
Springer, Cham (2019)
15. Hossain, M.T., Hasan, M.K.: Assessment of traffic congestion by traffic flow analysis in Pabna
Town. Am. J. Traffic Transp. 4(3), 75–81 (2019)
Does Digital Finance Promote the Upgrading
of Industrial Structure

Yi Qu, Hai Xie, and Xing Liu

Harbin Vocational College of Science and Technology, Harbin 150300, Heilongjiang, China
quy@hrbcu.edu.cn

Abstract. The growth of digital finance improves SMEs' financing conditions and
business circumstances and plays a crucial role in upgrading the industrial structure
of China. This paper takes China's 30 provinces from 2011 to 2019 as samples and
studies the effect of digital finance on upgrading China's industrial structure from
both theoretical and empirical perspectives. The results reflect that digital finance
stimulates the optimization and upgrading of China's industrial structure as a whole.
Furthermore, digital finance in the eastern region has a significant positive impact
on the industrial structure, whereas in the central and western regions it plays an
inhibitory role in industrial structure upgrading.

Keywords: Digital finance · Industrial structure upgrading · Digital inclusive finance

1 Introduction and Literature Review

The upgrading of industrial structure is a major measure to change the extensive mode
of economic growth and a symbol of modern social and economic development [1]. Since
the reform and opening up, China has rapidly developed from a backward agricultural
country into a manufacturing country. The three industrial structures have been constantly
adjusted, and industrial development has made great achievements. However, there is still
a big gap between the current industrial structure and the requirements of high-quality,
sustainable economic growth.
Financial development is a key way to promote the upgrading of industrial structures.
Still, there is a serious problem of financial exclusion in the development of
traditional finance, which prevents financial services from effectively supporting
real economic growth. With the development of the Internet, digital finance
has emerged as the times require. Digital finance reduces information asymmetry
and lowers the threshold of financial services through big data, so that SMEs can also
access financial services. Digital finance has significantly improved the coverage and
usage depth of financial services, optimized the resource allocation pattern, and promoted
the upgrading of industrial structures. Besides, digital finance can improve the credit
system and channel capital toward high-quality industrial projects to boost the upgrading
of industrial structure [2]. Therefore, this study explores the influence of digital finance
on industrial structure upgrading based on theory.

In academia, there are numerous studies on the impact of financial development on
industrial structure. As early as the last century, scholars discussed the effect
of financial development on industrial structure. Goldsmith (1969) constructed a
theoretical financial development system and studied the relationship between financial
development and industrial structure upgrading [3]. Wurgler (2000) found that financial
development plays a positive role in industrial investment and upgrading [4]. Da
Rin and Hellmann (2002) pointed out that financial services can advance the pattern
and efficiency of resource allocation in the market and boost financing and industrial
structure [5]. Aghion et al. (2005) believed that the rationalization of financial
development could narrow the gap in science and technology between regions, thus
converting the industrial structure from a low-level form to a high-level form [6].
Chava et al. (2013) found that financial development can promote economic development
and scientific and technological progress by resolving information asymmetry and
optimizing the industrial structure [7]. Sasidharan et al. (2014) discovered that digital
finance would promote technological innovation in the financial services industry and
achieve results in transforming and upgrading the industrial structure [8]. Some scholars
hold different opinions: Fan et al. (2017) pointed out that financial services have come
to play a restraining role in the transformation and upgrading of industrial structure [9].
To sum up, there is little research in the academic community on how digital finance
affects industrial structure upgrading. Therefore, this paper uses benchmark regressions
to evaluate the influence of digital finance on industrial structure upgrading and
discusses its regional heterogeneity. The marginal contribution of this paper lies in the
following: firstly, using the China digital inclusive finance index to study the
relationship between digital finance and industrial structure upgrading; secondly, using
the three sub-dimensions of digital inclusive finance to further explore which aspects of
digital finance affect industrial structure upgrading; finally, discussing the influence
of digital finance on regional industrial structure upgrading in the eastern and
central-western regions.

2 Model Construction and Data Description

2.1 Variable Selection

Dependent Variable. Industrial structure upgrading (ISU) is the selected dependent
variable. Upgrading of the industrial structure means that the proportion of the primary
industry is declining while the proportions of the secondary and tertiary industries,
especially the tertiary industry, are rising. Referring to Cai et al. (2017) [10], this
study takes the proportion of the value added of the tertiary industry in GDP as the
proxy index.

Independent Variable. This paper selects digital finance (DF) as the independent
variable. Referring to the research of Guo et al. (2020) [11], the digital inclusive
finance index is selected to measure digital finance. To further explore the effect of
digital finance on industrial structure upgrading, this paper also chooses the three
sub-dimensions of digital inclusive finance (coverage breadth, usage depth, and
digitization level) as independent variables.
Control Variable. This paper selects the economic development level, the fixed asset
investment level, the government intervention level, and the R&D investment level as
control variables to increase the robustness of the empirical model. The economic
development level is measured by regional per capita GDP; for the stability of the data,
the logarithm of regional per capita GDP is taken and recorded as lnPGDP. The growth rate
of fixed assets is used as the proxy index for the fixed asset investment level (FA). The
level of government intervention is measured by the natural logarithm of general
government budgetary expenditure (lnGov). The R&D level is assessed by the natural
logarithm of R&D investment (lnR&D).

2.2 Data Sources and Descriptive Statistics

This work selects panel data of 30 areas (including provinces, autonomous regions,
and municipalities) between 2011 and 2019, excluding Tibet, Taiwan, Hong Kong, and
Macao. The data of inclusive digital finance originates China digital finance research cen-
ter of Peking University, and the rest of the data comes from China Statistical Yearbook.
To eliminate the effect of price, the index data related to price factors are deflated. For
individual missing values, the interpolation method is used for information processing.
The descriptive statistics of each variable involved in this study are recorded in Table 1.

Table 1. Statistical description of variables

| Variables | N | Mean | Sd | Min | Max |
| ISU | 270 | 0.450 | 0.0872 | 0.297 | 0.810 |
| lnDF | 270 | 5.151 | 0.670 | 2.909 | 6.017 |
| lnBreadth | 270 | 4.995 | 0.827 | 0.673 | 5.952 |
| lnDepth | 270 | 5.133 | 0.646 | 1.911 | 6.087 |
| lnLevel | 270 | 5.458 | 0.716 | 2.026 | 6.136 |
| lnPGDP | 270 | 10.50 | 0.419 | 9.706 | 11.55 |
| lnGov | 270 | 8.000 | 0.536 | 6.559 | 9.175 |
| FA | 270 | 0.113 | 0.124 | −0.627 | 0.412 |
| lnR&D | 270 | 14.28 | 1.344 | 10.96 | 16.96 |

2.3 Benchmark Model

According to the above analysis, this paper constructs model (1) to explore the
relationship between digital finance and industrial structure upgrading.

ISU_{it} = β_0 + β_1 lnDF_{it} + β_2 X_{it} + ε_{it}   (1)


where X denotes the control variables, i indexes provinces, t indexes years, and ε is the
disturbance term.
The regression models of the three sub-dimensions of digital finance (coverage breadth,
usage depth, and digitization level) on industrial structure upgrading are similar to (1).
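To make the estimation concrete, the following minimal sketch shows how model (1) could be estimated as a fixed-effects panel regression in Python. The file name and column labels are hypothetical, and the linearmodels package is one possible choice, not the tooling the authors report:

```python
import pandas as pd
from linearmodels.panel import PanelOLS

# Hypothetical panel file with one row per province-year (2011-2019);
# column names mirror the variables of model (1).
df = pd.read_csv("provinces_2011_2019.csv").set_index(["province", "year"])

# ISU regressed on lnDF and the controls, with province fixed effects
# (the specification the Hausman test in Sect. 3 selects for lnDF).
exog = df[["lnDF", "lnPGDP", "FA", "lnGov", "lnRD"]]
res = PanelOLS(df["ISU"], exog, entity_effects=True).fit()
print(res.summary)
```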

3 Empirical Research

3.1 Empirical Analysis of Benchmark Regression

Firstly, this paper uses the Hausman test to judge which model should be used in the
benchmark regression. The results show that the p-value of the Hausman test is 0.0002
in the regressions of digital inclusive finance and coverage breadth on industrial
structure upgrading, so the fixed-effects model is selected. In the regressions
of usage depth and digitization level on industrial structure upgrading, the p-values
of the Hausman test are 0.1913 and 0.0887, respectively, so a random-effects model is
used for these two sub-dimensions.
Table 2 reports the regression results of digital finance and its three sub-dimensions
on the industrial structure. Columns (1), (3), (5), and (7) report the regressions of
digital finance and its sub-dimensions on industrial structure upgrading; columns (2),
(4), (6), and (8) report the results after introducing the control variables.
From column (1), it can be seen that digital finance promotes the upgrading of
industrial structure and passes the significance test at the 1% level: every 1% increase
in digital finance raises industrial structure upgrading by 0.032 units. After
introducing the control variables, the promotion effect is slightly weakened but remains
significant. This means that digital finance can improve the aggregation quality among
industries, promote the overall coordination and resource allocation of industrial
structures, and thus promote industrial structure upgrading.
As for the control variables, the regional economic development level promotes
industrial structure upgrading. Economic growth can promote capital accumulation, weaken
the resource constraints in structural adjustment, provide a broader market for
industrial structure transformation, and accelerate the formation of new industries. The
levels of fixed asset investment, government intervention, and R&D investment all have
inhibitory effects on industrial structure upgrading. The effect of fixed asset
investment is not significant, while the government intervention level and the R&D
investment level are significant at the 10% and 5% levels, respectively. Under market
economy conditions, there is large information asymmetry: the government cannot
comprehensively grasp and accurately analyze all the information in the market, which
distorts industrial policy in its implementation and does not help the upgrading of
industrial structure. In terms of scientific research and development, China may pursue
economic growth above all, so technological innovation leans toward projects with high
output value, which are not concentrated in the tertiary industry. Moreover, the
transformation rate of scientific results in China has always been low, which may lead
R&D investment to inhibit industrial structure upgrading.
From columns (3)–(8), it can be seen that the coverage breadth of digital finance
has no obvious impact on industrial structure upgrading. The usage depth of digital
finance significantly promotes industrial structure upgrading, passing the significance
test at the 1% level, whereas the digitization level significantly inhibits it at the
1% level. This shows that it is through usage depth that digital finance plays an active
role in upgrading the industrial structure.

Table 2. The influence of digital finance and its sub-dimensions on industrial structure upgrading
(dependent variable: ISU; key regressor: lnDF in (1)–(2), coverage breadth in (3)–(4), usage depth
in (5)–(6), digitization level in (7)–(8))

| Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| lnDF / Breadth / Depth / Digitization | 0.0320*** (0.0117) | 0.0309*** (0.0115) | 0.00609 (0.00482) | 0.00570 (0.00472) | 0.0385*** (0.00853) | 0.0343*** (0.00870) | −0.0205*** (0.00769) | −0.0200*** (0.00764) |
| lnPGDP | | 0.0712*** (0.0197) | | 0.0720*** (0.0200) | | 0.0625*** (0.0166) | | 0.0716*** (0.0173) |
| FA | | −0.0228 (0.0144) | | −0.0265* (0.0146) | | −0.0203 (0.0145) | | −0.0267* (0.0143) |
| lnGov | | −0.0424* (0.0235) | | −0.0397* (0.0237) | | −0.0488*** (0.0165) | | −0.0481*** (0.0176) |
| lnR&D | | −0.0170** (0.00744) | | −0.0162** (0.00753) | | −0.00813 (0.00574) | | −0.00985 (0.00602) |
| Constant | 0.282*** (0.0426) | 0.112 (0.254) | 0.378*** (0.0160) | 0.164 (0.257) | 0.255*** (0.0346) | 0.115 (0.199) | 0.474*** (0.0322) | 0.242 (0.206) |
| R² | 0.754 | 0.776 | 0.748 | 0.770 | 0.764 | 0.781 | 0.753 | 0.775 |
| Observations | 270 | 270 | 270 | 270 | 270 | 270 | 270 | 270 |
| Sample size | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 |

Notes: *, **, *** indicate significance at the 10%, 5%, and 1% levels. Standard errors of the
estimated coefficients are in parentheses.

3.2 Analysis of Regional Heterogeneity


There are obvious regional disparities in digital finance and industrial structure
development across the eastern, central, and western regions. Due to its location
advantages, the eastern region has better economic development conditions than the
central and western regions, and its industrial structure is more optimized. Therefore,
this section discusses whether the effect of digital finance on industrial structure
upgrading exhibits regional heterogeneity between the eastern region and the central and
western regions. The results are recorded in Table 3. Columns (1) and (2) report the
regression results of the impact of digital finance on industrial structure upgrading in
the eastern region; columns (3) and (4) report the impact in the central and western
regions. These results show that in the eastern region digital finance can advance
industrial structure upgrading, whereas in the central and western regions its influence
on industrial upgrading is negative. This may be because the eastern region has a
relatively advanced economy: the economic development promoted by digital finance can
further optimize resource allocation efficiency, boost the expansion of new industries,
and inject new vitality into industrial structure upgrading. In the central and western
regions, the development of digital finance is reflected more in promoting the growth of
the secondary industry; after the secondary industry receives this dividend, resources
may shift from the tertiary industry toward the secondary industry. This suggests that
policies should reasonably favor the central and western areas to avoid widening the
development gap between the eastern and western regions.

Table 3. Heterogeneity analysis of the impact of digital finance on industrial structure upgrading
(dependent variable: ISU; columns (1)–(2): eastern region, columns (3)–(4): central and western regions)

| Variables | (1) | (2) | (3) | (4) |
| lnDF | 0.0338* (0.0180) | 0.0298* (0.0165) | −0.0387* (0.0232) | −0.0441* (0.0238) |
| lnPGDP | | 0.0182 (0.0176) | | 0.0430 (0.0383) |
| FA | | −0.0126 (0.0168) | | −0.0110 (0.0195) |
| lnGov | | −0.0177 (0.0246) | | −0.0597* (0.0332) |
| lnR&D | | −0.0382*** (0.00957) | | −0.0275*** (0.00992) |
| Constant | 0.315*** (0.0720) | 0.834*** (0.248) | 0.493*** (0.0783) | 0.907* (0.495) |
| R² | 0.791 | 0.840 | 0.792 | 0.810 |
| Observations | 108 | 108 | 162 | 162 |
| Sample size | 12 | 12 | 18 | 18 |

Notes: *, **, *** indicate significance at the 10%, 5%, and 1% levels. Standard errors of the
estimated coefficients are in parentheses.

3.3 Robustness Test

To ensure the reliability of the empirical results, this work uses several methods to
check the robustness of the regression model.
First, the core explanatory variable is lagged by one period and the model is
re-estimated. As shown in columns (1) and (2) of Table 4, lagged digital finance still
plays a significant positive role in promoting industrial structure upgrading.
Secondly, because the municipalities directly under the central government have
advantages in economic growth, political policy, science and culture, and transportation
location compared with other provinces, the effect of digital finance on industrial
upgrading may be inconsistent there. Therefore, this paper excludes the data of the four
municipalities directly under the central government of China and runs the regression
again. The results, shown in columns (3) and (4) of Table 4, indicate that digital
finance still has a strong promoting effect on industrial structure upgrading.
In conclusion, the empirical results of this paper are robust.

Table 4. Robustness test results
(dependent variable: ISU; columns (1)–(2): independent variable lagged one period,
columns (3)–(4): excluding province-level municipalities)

| Variables | (1) | (2) | (3) | (4) |
| lnDF | 0.0397*** (0.0113) | 0.0388*** (0.0112) | 0.0318** (0.0132) | 0.0329** (0.0130) |
| lnPGDP | | 0.0622*** (0.0198) | | 0.0398*** (0.0137) |
| FA | | −0.0190 (0.0146) | | −0.0207 (0.0145) |
| lnGov | | −0.0457* (0.0243) | | −0.0365*** (0.0107) |
| lnR&D | | −0.0168** (0.00772) | | −0.0122*** (0.00401) |
| Constant | 0.267*** (0.0409) | 0.219 (0.262) | 0.263*** (0.0471) | 0.303** (0.141) |
| R² | 0.735 | 0.757 | 0.758 | 0.783 |
| Observations | 240 | 240 | 234 | 234 |
| Sample size | 30 | 30 | 26 | 26 |

Notes: *, **, *** indicate significance at the 10%, 5%, and 1% levels. Standard errors of the
estimated coefficients are in parentheses.

4 Conclusions and Policy Recommendations


According to the empirical results, the conclusions are as follows. First, digital
finance significantly promotes the upgrading of China's industrial structure. Secondly,
the usage depth promotes industrial structure upgrading, while the digitization level
inhibits it. Third, the development of digital finance in the eastern region can advance
industrial structure upgrading, while it has a restraining effect in the central and
western regions. On this basis, the paper puts forward the following policy
recommendations. First, speed up financial infrastructure construction: relevant
departments should improve the financial infrastructure. Second, implement the strategy
of regionally coordinated development: it is crucial to promote complementary advantages
between regions and to develop rich financial tools and products so as to expand the
coverage breadth of digital financial services and promote the coordinated development
of digital finance.

Acknowledgments. We thank the funding sponsored by the Key Project of Educational Science
Planning in Heilongjiang Province (ZJB1421217) and General Project of Teaching Reform of
Higher Vocational Education in Heilongjiang (SJGZY2019172).

References
1. Kuznets, S.S.: Modern Economic Growth: Rate, Structure, and Spread. Yale University
Press, New Haven (1966)
2. Guo, W.L., Chen, J.Y.: Research on the effect of digital inclusive finance development and
industrial structure upgrading in China. Mall Modern. 11, 146–148 (2020)
3. Goldsmith, R.W.: Financial Structure and Development. Yale University Press, New Haven
(1969)
4. Wurgler, J.: Financial markets and the allocation of capital. J. Financ. Econ. 58(1), 187–214
(2000)
5. Da Rin, M., Hellmann, T.: Banks as catalysts for industrialization. J. Financ. Intermed. 11(4),
366–397 (2002)
6. Aghion, P., Howitt, P., Mayer-Foulkes, D.: The effect of financial development on conver-
gence: theory and evidence. Q. J. Econ. 120(1), 173–222 (2005)
7. Chava, S., Oettl, A., Subramanian, A., Subramanian, K.V.: Banking deregulation and
innovation. J. Financ. Econ. 109(3), 759–774 (2013)
8. Sasidharan, S., Jijo Lukose, P.J., Komera, S.: Financing constraints and investments in R&D:
evidence from Indian manufacturing firms. Q. Rev. Econ. Finance 55, 28–39 (2015)
9. Fan, X.C., Xia, Y., Hao, Y.M.: Research on the impact of inclusive finance development on
industrial structure upgrading in China. Zhejiang Finance 5, 23–30 (2017)
10. Cai, H.Y., Xu, Y.Z.: Does trade openness affect the upgrading of China’s industrial structure.
Res. Quant. Econ. Tech. Econ. 34(10), 3–22 (2017)
11. Guo, F., Wang, J.Y., Wang, F., Kong, T., Zhang, X., Cheng, Z.Y.: Measuring the development
of digital inclusive finance in China: index compilation and spatial characteristics. Econ. Q.
19(4), 1401–1418 (2020)
Electricity Consumption Prediction Based
on Time Series Data Features Integrate
with Long Short-Term Memory Model

Jiaqiu Wang1, Hao Mou2, Hai Lin1, Yining Jin3, and Ruijie Wang1
1 Beijing China-Power Information Technology Co., Ltd., Haidian District Beijing, China
wangjiaqiu@alu.hit.edu.cn
2 State Grid Sichuan Electric Power Company, Gaoxin District, Chengdu, China
3 Harbin University of Commerce, Songbei District, Harbin, China

Abstract. The forecast of electricity consumption is of great significance for
adjusting the power supply dispatching scheme and optimizing the economic structure.
Electricity consumption prediction is in essence a time series prediction problem.
Most relevant work builds electricity consumption prediction models in terms of
economy, temperature, region, etc. However, few studies consider the three data
characteristics of trend, seasonality, and periodicity contained in time series.
Therefore, this paper proposes a Long Short-Term Memory model based on time series
feature fusion. This model considers the influence of the three data features of time
series on prediction results and can effectively integrate time-series features into
the Long Short-Term Memory model with its strong self-learning ability. Experimental
results show that the proposed method has higher accuracy than the basic LSTM model
and the Autoregressive Integrated Moving Average model (ARIMA).

Keywords: Electricity consumption prediction · Time series features · Long short-term memory

1 Introduction
It is of great significance to improve the accuracy of electricity consumption prediction.
First of all, electricity sales, as one of the most concerning indicators of power grid enter-
prises, can be accurately predicted to help power supply enterprises effectively adjust
the power supply plan, optimize the power supply structure, and improve the security
of power system operation. Secondly, as a barometer of economic development, the
electricity consumption also reflects economic development in a region. The govern-
ment, the power supply company, can adjust the region’s industrial structure according
to electricity consumption data [1] to promote economic development.
In essence, an electricity consumption prediction model forecasts future electricity
consumption based on historical electricity consumption data; the problem therefore
belongs to the class of time series prediction problems. Many studies have taken weather,
temperature, holidays,
and other factors into account when establishing electricity consumption prediction
models. But although this is a typical time series problem, few works take the trend,
seasonality, and periodicity of the series as the main factors in establishing the
prediction model, which makes the established models less persuasive.
At the same time, existing prediction methods, such as the traditional empirical
method, regression analysis, the grey prediction method, and so on, do not treat
different types of electricity consumption prediction in a unified and logical way, and
the stability of such models is poor. Although artificial neural network and SVM
prediction methods can effectively use historical data, their adaptive and self-learning
abilities are relatively insufficient, and the prediction effect is relatively
unsatisfactory.
Therefore, given the above problems, a Long Short-Term Memory model based on
time series feature fusion is proposed. The model takes into account the influence of the
three data features of time series on prediction results and effectively integrates
time-series features into the Long Short-Term Memory model with its strong self-learning
ability. The actual historical data of electricity consumption are used to verify its
effectiveness, and the proposed method is compared with other methods. The experimental
results show that the proposed electricity consumption prediction model based on the
fusion of time series features has higher accuracy, which indicates that fusing time
series features into the prediction model is applicable and obtains better prediction
results.
The organizational structure of this article is as follows. Section 2 introduces the
related work. Section 3 gives relevant concepts and formal expressions of the model.
Section 4 introduces the structure of the model proposed in this paper. Section 5 intro-
duces the use of data sets, experimental environment, and evaluation indexes of exper-
imental results, and compares and analyzes the experimental results obtained by the
proposed method. Section 6 summarizes this paper and looks into the future work.

2 Related Work

In essence, predicting future electricity consumption based on historical electricity
consumption data is a time series prediction problem. At present, electricity
consumption prediction models based on time series prediction methods can be divided
into models based on traditional algorithms, on machine learning, and on neural
networks.

2.1 Electricity Consumption Prediction Model Based on Traditional Methods

Many scholars have applied traditional methods to electricity consumption prediction
and achieved good results. Fan Jiao [2], Wang Tao [3], et al. took holidays,
temperature, and other factors into consideration in the electricity consumption time
series and used traditional models, such as combinations of the GM(1,1) model, the
ARIMA model, the Winters model, and the moving average model, to forecast monthly
electricity consumption; the results showed that the combination models improved the
prediction effect. Guefano Serge [4], Laxmi Rajkumari [5], et al. used a new
GM(1,1)–VAR(1) hybrid model based on the VAR and grey models, and the Holt–Winters
(non-seasonal) smoothing technique, to predict future electricity consumption in a
region, verifying the rationality and applicability of the models. However, traditional
prediction models are limited by their fitted curves and cannot capture changing trends
that go beyond the curve's law.

2.2 Power Consumption Prediction Model Based on Machine Learning Method


The power consumption prediction model based on machine learning is mainly built by
applying statistics. Cao Min et al. [6] addressed the problem of multiple factors
affecting electricity consumption by proposing a combined model of the autoregressive
moving average model and the SVM model to predict users' electricity consumption; the
results showed that the prediction accuracy of the combined model was higher than that
of a single model. Foreign scholars [7–9] applied integrated machine learning methods,
second-order fuzzy time series (FTS), and vector autoregressive (VAR) analysis to
predict electricity consumption in different regions and achieved good prediction
accuracy. However, since statistical learning theory performs well only on small
samples, it is not suitable for modeling large power consumption datasets.

2.3 Electricity Consumption Prediction Model Based on Neural Network


Although traditional electricity consumption prediction models can achieve relatively
satisfactory prediction accuracy, their precision is affected by human factors. Neural
network models can imitate the intelligent processing of the human brain and mine
different data features from large amounts of multidimensional data, which shows that
neural networks have strong self-learning and self-adaptive abilities. Ramos Daniel [10],
Shahid Ali [11], et al. used artificial neural networks (ANN) to improve the prediction
accuracy of power consumption. With the continuous development of neural networks, most
scholars studying electricity consumption prediction recognize LSTM for its strong
time-series learning ability and information selection ability. A. Jayanth Balaji
et al. [12–15] used the LSTM model to carry out related research on electricity
consumption and obtained high prediction accuracy.
In conclusion, compared with the many traditional power consumption prediction
models, neural network models based on deep learning can predict power consumption
more accurately. From the perspective of time series characteristics, this paper
quantifies the trend, seasonality, and periodicity characteristics of time series and
integrates them into the LSTM model.

3 Notions and Problem Definition


This section gives the problem statement and the definitions of related notions, and
presents the formal representation of the problem.
For large-scale industrial users who use high voltage grades in the power grid, the
daily electricity consumption time series of a past period is used to predict the daily
electricity consumption of a future period. It should be noted that the high voltage
level of large-scale industrial users differs from the low voltage level of residential
users. For a period of time in the future, what is predicted is not the total amount of
electricity but the daily electricity consumption. Therefore, the prediction problem can
be regarded as a multi-step time series prediction.
Definition 1. Time series of daily electricity consumption. This refers to the ordered
array of the electricity used by a user each day, arranged in the time sequence of its
occurrence. Let x_t represent a variable varying with time t (t ≥ 1), meaning that on
day t the electricity consumption is x_t. Over a period of time (t_1, t_2, ..., t_N),
t_1 < t_2 < ... < t_N, N > 2, the discrete sequence (x_{t_1}, x_{t_2}, ..., x_{t_N})
composed of the electricity measurements constitutes a time series of electricity
consumption.
Time series characteristics contain the trend, seasonality, and periodicity of daily
electricity consumption time series.
Definition 2. Trend characteristic: the daily electricity consumption shows a relatively
slow, long-term tendency to rise continuously, decline continuously, or remain at the
same level over time, although the magnitude of change may not be equal.
Definition 3. Periodicity characteristic: when the time series of daily electricity
consumption rises and falls with irregular frequency, that is, when the fluctuation of
daily electricity consumption is irregular, the series is periodic.
Definition 4. Seasonality characteristic: when the frequency of consumption fluctuations
in the time series of daily electricity consumption is constant and related to a period
of fixed length, the series exhibits seasonality.
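As an illustration of how these characteristics can be separated in practice, the sketch below applies a classical additive decomposition; the file name, column name, and the assumed weekly period are placeholders, and this is one common technique rather than the preprocessing the authors prescribe:

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical daily consumption series indexed by date.
series = pd.read_csv("daily_kwh.csv", parse_dates=["date"], index_col="date")["kwh"]

# Additive decomposition: trend (Definition 2) and seasonal (Definition 4)
# components; the residual carries the irregular fluctuations (Definition 3).
parts = seasonal_decompose(series, model="additive", period=7)  # weekly period assumed
trend, seasonal, resid = parts.trend, parts.seasonal, parts.resid
```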
Forecasting problem of power consumption of high-voltage users. For high-voltage
electricity users, given their historical time series of daily electricity consumption,
the values of the future daily electricity consumption series are predicted over
multiple time steps. Formally, given the time series (x_{t_1}, x_{t_2}, ..., x_{t_N})
of a user's historical daily electricity consumption over the past time interval
(t_1, t_2, ..., t_N), the time series (x_{t_{N+1}}, x_{t_{N+2}}, ..., x_{t_{N+H}}),
H > 2, of daily electricity consumption over the future time interval is predicted.
The formula is expressed as follows:

f(x_{t_1}, x_{t_2}, ..., x_{t_N}) → (x_{t_{N+1}}, x_{t_{N+2}}, ..., x_{t_{N+H}})   (1)
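A minimal sketch of how Eq. (1) translates into supervised training pairs (the window lengths n_in = N and n_out = H are free parameters here):

```python
import numpy as np

def sliding_windows(x: np.ndarray, n_in: int, n_out: int):
    """Turn one series into (input window of n_in days -> target window of n_out days) pairs."""
    X, Y = [], []
    for i in range(len(x) - n_in - n_out + 1):
        X.append(x[i:i + n_in])
        Y.append(x[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(Y)

X, Y = sliding_windows(np.arange(100, dtype=float), n_in=30, n_out=7)
print(X.shape, Y.shape)   # (64, 30) (64, 7)
```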

4 Time Series Features Fusion Long-Short Term Memory


A recurrent neural network (RNN) is a kind of recursive neural network that takes
sequence data as input, recurses along the evolution direction of the sequence, and
chains all nodes (recurrent units) together. The Long Short-Term Memory network (LSTM)
is a recurrent neural network and a widely used RNN model, usually composed of multiple
LSTM units in series. Each LSTM unit consists of a memory cell, a forgetting gate, an
input gate, and an output gate. The cell is the main body of the LSTM unit; each gate
lets information pass selectively, mainly through an activation function and a pointwise
product. LSTM is often used to capture the long-term and short-term time dependence of
data, and its gating units give it strong generalization ability and good performance
on time series data.
As RNN adopts the backpropagation algorithm in training when the sequence with
long interval. To find the gradient by chain rule, the phenomenon of gradient disap-
pearance will appear. The gradient disappearance will become more serious with the
longer period of time, making it difficult for RNN to train effectively for the sequence
with long-range. The memory unit structure of LSTM controls the memory of historical
information by three gate structures: forgetting gate, input gate, and output gate. The
calculation process of time t is as follows.

$$i_t = \mathrm{sigmoid}(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \tag{2}$$

$$o_t = \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \tag{3}$$

$$f_t = \mathrm{sigmoid}(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \tag{4}$$

$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{5}$$

$$h_t = o_t \tanh(c_t) \tag{6}$$

In the formulas, $i$, $o$, and $f$ denote the input gate, output gate, and forget gate, respectively; $c$ denotes the memory unit; $h$ denotes the hidden layer output, with the subscript denoting the time $t$; $W$ denotes a connection weight, with the subscript denoting the items the weight connects; and $b$ denotes a bias term. The values of $i$, $o$, and $f$, which lie in the range $[0, 1]$, control the proportion of historical information passing through the gate structures. Here tanh and sigmoid are the activation functions.
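To make the gate computations concrete, the following is a minimal NumPy sketch of a single LSTM step implementing Eqs. (2)-(6); the dictionary layout for the weights and biases is our own assumption, with keys mirroring the subscripts in the text, and the peephole weights are applied elementwise as is conventional.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step following Eqs. (2)-(6).

    W['xi'] plays the role of W_xi and so on; the peephole weights
    (W['ci'], W['cf'], W['co']) are diagonal, hence the elementwise '*'.
    """
    i_t = sigmoid(W['xi'] @ x_t + W['hi'] @ h_prev + W['ci'] * c_prev + b['i'])
    f_t = sigmoid(W['xf'] @ x_t + W['hf'] @ h_prev + W['cf'] * c_prev + b['f'])
    c_t = f_t * c_prev + i_t * np.tanh(W['xc'] @ x_t + W['hc'] @ h_prev + b['c'])
    o_t = sigmoid(W['xo'] @ x_t + W['ho'] @ h_prev + W['co'] * c_t + b['o'])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```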
This paper proposes a time-series features fusion long short-term memory network model to learn the fused multi-sequence feature information and to capture both the variation of the sequence features over time and the relationship between the multiple sequence feature signals at the same moment. The model is shown in Fig. 1. The power consumption time series is decomposed according to the three data characteristics of trend, seasonality, and periodicity, and each characteristic is vectorized with Word2vec [16]. The three serial feature signals are then serialized by time into multi-channel input data, and three serialized LSTM blocks extract deep features from the multi-channel input. The time step of each LSTM block is set to 1.
The third LSTM block returns only the output of the last moment, reducing the number of deep features and the number of parameters. After the three groups of LSTM blocks, the model uses a 4-layer fully connected network to classify the deep features.


Fig. 1. The proposed time-series feature fusion long-short time memory network.
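A hedged sketch of this architecture in TensorFlow/Keras is given below. Only the overall layout follows the text: a 3-channel input (trend, seasonality, periodicity), three serialized LSTM blocks with the last one returning only its final output, a 4-layer fully connected head, and a sigmoid output as in Fig. 1. The sequence length, unit counts, and FC widths are our assumptions, since the paper does not report them.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_fusion_lstm(seq_len, n_units=64):
    # Multi-channel input: one channel per time series characteristic.
    inp = layers.Input(shape=(seq_len, 3))
    # Three serialized LSTM blocks; the third keeps only the last moment.
    x = layers.LSTM(n_units, return_sequences=True)(inp)
    x = layers.LSTM(n_units, return_sequences=True)(x)
    x = layers.LSTM(n_units)(x)
    # 4-layer fully connected head (three hidden layers plus the output).
    for width in (64, 32, 16):                      # hypothetical widths
        x = layers.Dense(width, activation='relu')(x)
    out = layers.Dense(1, activation='sigmoid')(x)  # sigmoid head as in Fig. 1
    return Model(inp, out)

model = build_fusion_lstm(seq_len=30)
model.compile(optimizer='adam', loss='mse')
model.summary()
```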

5 Experiment

In this chapter, we verify the proposed prediction method. To prove the effectiveness of our algorithm, we introduce four aspects: the data set, the experimental environment, the evaluation criteria, and the experimental results. Finally, we analyze the experimental results.

5.1 Experimental Setup

We adopted the electricity consumption time series of 27 provincial companies of the State Grid Corporation of China for 12 consecutive months of 2020 to train and verify the electricity consumption prediction model. The time series of electricity consumption contains 100,000 pieces of electricity data, which we divide into a training set and a test set at a ratio of 8:2.
The attribute of the data set includes total electric quantity, wave electric quantity,
peak electric quantity, valley electric quantity, and average electric quantity.
The peak electricity price is the highest and the valley electricity price the lowest, and the total active electric quantity is the sum of the peak and valley electric quantities. Table 1 shows the data content form of the real electricity consumption.
Experimental setting: the paper uses the real historical data of unit electricity consumption in each province within one year to establish the electricity consumption model. The data are divided into a training set and a verification set, accounting for 80% and 20%, respectively. The proposed model is trained on the training set and verified on the verification set.
Experimental environment: in our experiment, we used an Nvidia Titan XP GPU server; the CPU was an Intel Xeon E5-2680 V4, the memory size was 64 GB, and Google TensorFlow was used as the software platform.

5.2 Evaluation

To verify the validity of the method proposed in this paper for electricity consumption forecasting, we compare the experimental results using error indexes commonly used internationally: the root mean square error (RMSE) and the mean absolute percentage error (MAPE). In addition, the mean squared log error (MSLE) is an important index for verifying the power consumption prediction results. The formulas are expressed as follows:

Table 1. The sample of real-electricity consumption.

Attributes\No No. 1 No. 2 No. 3 No. 4 No. 5


Pro_Org_no 10001 10001 10001 20001 20001
YMD 20200303 20200304 20200305 202003 202004
THIS_PQ 28780 21285 24183 31817 32344
HIS_T_PQ 3900 3421 2876 5761 5690
THIS_P_PQ 7460 5643 5091 7864 8653
THIS_N_PQ 11760 8765 9465 11098 10876
THIS_V_PQ 5640 3456 6751 7094 7125
UserID BJ0980832 BJ0980832 BJ0980832 TJ654987 TJ654987

Root mean square error (RMSE):

$$\mathrm{RMSE}(y, \hat{y}) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{7}$$

Mean absolute percentage error (MAPE):

$$\mathrm{MAPE}(y, \hat{y}) = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{8}$$

Mean squared log error (MSLE):

$$\mathrm{MSLE}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} \left( \log(1 + y_i) - \log(1 + \hat{y}_i) \right)^2 \tag{9}$$

where $y_i$ represents the actual value and $\hat{y}_i$ the predicted value.
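For reference, Eqs. (7)-(9) can be computed directly in NumPy; the sketch below uses toy values, not the paper's evaluation code.

```python
import numpy as np

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))             # Eq. (7)

def mape(y, y_hat):
    return 100.0 * np.mean(np.abs((y - y_hat) / y))       # Eq. (8)

def msle(y, y_hat):
    return np.mean((np.log1p(y) - np.log1p(y_hat)) ** 2)  # Eq. (9)

y     = np.array([28780.0, 21285.0, 24183.0])   # toy actual values
y_hat = np.array([28100.0, 21900.0, 23800.0])   # toy predictions
print(rmse(y, y_hat), mape(y, y_hat), msle(y, y_hat))
```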

5.3 Results and Analysis

We compared the proposed time series features fusion long short-term memory model with the basic LSTM model and the Auto-Regressive Integrated Moving Average (ARIMA) model on the evaluation metrics. The experimental results are shown in Fig. 2, Fig. 3, and Fig. 4. Compared with the other methods, our proposed algorithm makes smaller errors in predicting electricity consumption.

Fig. 2. The comparison between predicted results and the actual results.

Fig. 3. The comparison between the proposed method and other methods on MSLE and RMSE.

Figure 2 shows the comparison between the predicted and actual electricity consumption from August to December 2020. The trend of electricity consumption predicted in this paper is consistent with the actual trend in Fig. 2. The experimental results in Fig. 3 and Fig. 4 compare the electricity consumption prediction model established in this paper with the basic long short-term memory network model and the autoregressive integrated moving average model (ARIMA) under the three evaluation indexes. From the experimental results in the figures, we can see that the electricity consumption prediction model established in this paper achieves better results in mean absolute percentage error, root mean square error, and mean squared log error.

Fig. 4. The comparison between the proposed method and other methods on MAPE.

6 Conclusions and Future Work


The paper proposes integrating the three features of time series into the Long Short-Term
Memory Model with strong self-learning ability and builds an LSTM model based on
the fusion of time series features. Real historical data of electricity consumption verify
the model, and the proposed method is compared with other methods. The experimental
results show that the proposed power consumption prediction model based on the fusion
of time series features has higher accuracy, which indicates that the fusion of time series
features in the power consumption prediction model is applicable and can obtain better
prediction results.
In future work, we intend to design and implement a feedback learning mechanism to improve the prediction accuracy. In addition, we will improve the generalization and efficiency of our model to accommodate more multi-source heterogeneous data.

Acknowledgment. The authors thank the editor and the anonymous reviewers for their helpful comments and suggestions, which have led to the improved version of this paper.

References
1. Wu, Q., Lu, S.F., Yang, S.H., et al.: Research on household electricity management under
corresponding conditions on demand side. Electric Power Eng. Technol. 35(5), 28–31 (2016)
2. Fan, J., Feng, H., Niu, D.X., Wang, X.Y., Liu, F.Y.: Monthly electricity consumption prediction
based on wavelet analysis and GM-ARIMA model. J. North China Electric Power Univ. Nat.
Sci. Ed. 42(4), 101–105 (2015)
3. Wang, T., Zhang, W., Lu, M.M., Wang, T.M.: Research on prediction method of power
consumption of regional power grid based on time series. Electr. Technol. 11, 9–12 (2010)
4. Guefano, S., Tamba, J.G., Azong, T.E.M., Monkam, L.: Forecast of electricity consumption
in the Cameroonian residential sector by Grey and vector autoregressive models. Energy 214
(2021)

5. Laxmi, R.: Relation between electricity consumption and economic growth in Karnataka,
India: An aggregate and sector-wise analysis. Electricity J. 33(5), (2020)
6. Cao, M., Ju, J., Bai, Z.Y., Liu, J.T.: Prediction method of electricity consumption based on
ARMA-SVM combination model. Energy Environ. 1, 49–51 (2021)
7. Banik, R., Das, P., Ray, S., Biswas, A.: Prediction of electrical energy consumption based
on machine learning technique. Electr. Eng. 103(2), 909–920 (2020). https://doi.org/10.1007/
s00202-020-01126-z
8. Tay, K.G., Sim, S.E., Tiong, W.K., Huong, A.: Forecasting electricity consumption using the
second-order fuzzy time series. IOP Conf. Ser. Mater. Sci. Eng. 932(1), 12056 (2020)
9. Yasir, H.A., Gurudeo, A.T.: The relationship between electricity consumption, peak load and
GDP in Saudi Arabia: a VAR analysis. Math. Comput. Simul. 175 (2020)
10. Ramos, D., Faria, P., Vale, Z., Mourinho, J., Correia, R.: Industrial facility electricity con-
sumption forecast using artificial neural networks and incremental learning. Energies 13(18),
4774 (2020). https://doi.org/10.3390/en13184774
11. Ali, S.H., Zhang, J.R., Azeem, A., Asif, M.: Impact of electricity consumption on economic
growth: an application of vector error correction model and artificial neural networks. J. Dev.
Areas 54(4), (2020)
12. Jayanth, B.A., Harish Ram, D.S., Binoy, B.N.: A deep learning approach to electric energy
consumption modeling. J. Intell. Fuzzy Syst. 36(5), 4049–4055 (2019)
13. Hakan, Y., Umut, U., Oktay, T.: A long-short term memory application on the Turkish intraday electricity price forecasting. Press Academia Procedia (1) (2018)
14. Manowska, A.: Using the LSTM network to forecast the demand for electricity in Poland.
Appl. Sci. 10(23), 8455 (2020)
15. Ya’u, G.A., Lawal, A.M., Jimmy, N.J., Nannin, R.E., Chidi, E.O., Jafaru, B.: LSTM network
for predicting medium to long term electricity usage in residential buildings (Rikkos Jos-City,
Nigeria). Comput. Sci. Eng. 9(2), (2019)
16. Rong, X.: Word2vec Parameter Learning Explained. Comput. Sci. (2014)
Intelligent Search Method in Power Grid Based on the Combination of Elasticsearch and Knowledge Graph

Jiaqiu Wang1(B) , Xinhua Yang1 , Yining Jin2 , Xueyong Hu1 , and Le Sun1
1 Beijing China-Power Information Technology Co., Ltd., Haidian District, Beijing, China
wangjiaqiu@alu.hit.edu.cn
2 Harbin University of Commerce, Songbei District, Harbin, China

Abstract. The accurate search of professional knowledge in power dispatching systems, enterprise technical standards, and other power business documents is significant for power dispatching personnel to make decisions and improve work efficiency. The existing methods for power search focus on paragraph search, and few of them consider locating the precise answer to a question. This
paper proposes an intelligent search method based on the combination of Elastic-
search and Knowledge Graph for power dispatching and equipment technology,
which improves the accuracy of searching enterprise technical standard docu-
ments. This method is based on Elasticsearch to prefilter the contents of paragraphs
and reduce the search scope.
Further, within the small range of paragraph content, the triples of the knowledge graph are matched with the question at fine granularity to achieve an intelligent and accurate search of power enterprise technical standard documents. A real-world data set is
used in the experiment, and the results verify the effectiveness of the proposed
method. Finally, a prototype system for knowledge search of power grid dispatch-
ing system is constructed. (Lucene is a set of open-source libraries for full-text indexing and search, supported and provided by the Apache Software Foundation. Lucene provides a simple but powerful application interface for full-text indexing and searching.)

Keywords: Intelligent search · Power grid · Elasticsearch · Knowledge graph

1 Introduction

With the advent of big data, the power system, with its large and complex assets and its knowledge-intensive and consumption characteristics, has introduced smart devices and dispatchable networks, which together constitute a powerful smart grid. Smart grid operation inevitably generates a large amount of electric power business data, so it is not easy to search the data by traditional manual means. Therefore, a fast and accurate method is needed to search target data from massive data to improve the production efficiency of the power industry.


At present, the dispatching work of the power grid is mainly decided and operated
by dispatchers. The dispatcher needs to analyze the state and parameter changes of the
power network and take measures based on experience. This artificial scheduling mode is
subjective [1]. With the rapid development of power systems, the structure and operation
mode of the power grid become more and more complex, and the difficulty of power grid
disposal is increasing [2]. Although the traditional text search method based on keyword search can return paragraph-level results, the search results lack fine granularity and may even be irrelevant answers, which reduces work efficiency [3]. For electric power enterprises, as manufacturing enterprises, the low efficiency of work scheduling directly affects the benefits of the enterprise. At present, most existing methods focus on paragraph search over the technical standard documents of electric power enterprises, and few can give a precise answer to a question.
Power business document data includes power dispatching systems, enterprise tech-
nical standards, etc. To accurately find the dispatch knowledge, equipment knowledge,
equipment troubleshooting problems, and equipment maintenance history information,
this paper proposes an intelligent search method based on the combination of Elasticsearch and knowledge graph to help the dispatcher make decisions and improve the
operation efficiency of the power grid. The Elasticsearch method is used to select the
contents of paragraphs in advance to reduce the search scope. Secondly, in a small range
of paragraph content, the triples in the knowledge graph are used to match the problems
in fine granularity to realize the intelligent and accurate search of the technical stan-
dard documents of electric power enterprises. The experimental results of the proposed
method are compared with those of other methods in terms of evaluation index by using
the real-world data set of power enterprises’ technical standard documents. Experimen-
tal results show that the precision and recall of the proposed method are better than those of the other comparison methods, which indicates that combining knowledge graph search with the Elasticsearch engine can achieve a more accurate search.
The organizational structure of this article is as follows. Section 2 introduces the related work, and Section 3 describes the proposed method in detail. Section 4 presents the experimental setup, the data set and experimental environment, the evaluation indexes, and the experimental results obtained by the proposed method. Section 5 introduces the functional logic structure and functions of the prototype knowledge retrieval system of the power grid dispatching system. Section 6 summarizes this paper and looks into future work.

2 Related Work

This chapter introduces the relevant work based on Elasticsearch and Knowledge Graph
search methods and explains the differences between the proposed method and the related
work.

2.1 Search Method Based on Elasticsearch

At present, compared with other search methods, Elasticsearch, an open-source real-time distributed search and analysis engine built on Lucene, is an effective way to comprehensively and quickly browse large amounts of data [4]. For example, in the electric

power enterprise field, Yang et al. [5] used the Elasticsearch search engine, the development technology of front-end and back-end separation, and the design pattern of microservice architecture to establish a specific data application aimed at the difficulty of locating targets in massive data, and the results showed that this method could effectively improve work efficiency. In campus networks, Xue et al. [6] and Qin et al. [7] took full-text search as the
research problem. They proposed combining Elasticsearch full-text search engine with
Scrapy framework and optimizing Elasticsearch Chinese index, respectively. Experi-
ments show that applying the Elasticsearch engine in the campus network to search the
full text can get better results. The Elasticsearch search engine is also widely recognized
in other areas. Aleksei, V. et al. [8] proposed a system based on the Elasticsearch engine
and MapReduce model to solve the problem of user authentication. Prashant, K. et al.
[9] applied the Elasticsearch engine to the search application of the e-book portal.
In the field of the smart grid search, search results are required to be more accurate.
Compared with the related work, the proposed method does not rely on Elasticsearch
alone but only uses the ES method to prefilter the relevant paragraph content. This
approach is expected to improve the accuracy of search results.

2.2 Search Method Based on Knowledge Graph


As a graph-based data structure, the knowledge graph is designed to describe various
entities or concepts and their relationships in the real world, forming a huge semantic
network graph [10]. Applying the knowledge graph provides a better way to organize,
manage, and utilize vast amounts of information. Many researchers have applied the
knowledge graph method to intelligent semantic search, deep question-answering sys-
tems, and other practical applications. Xu [11], Wang [12], and Liu [13] et al. took
the intelligent analysis and management of data resources of power enterprises as the
research problem. They proposed the power data search method based on knowledge
graph, and the significance, development, and application of knowledge graph in smart
grid enterprises are explained and analyzed. Wang [14] and Hu [15] et al. improved the
search speed by changing the query mode. Kamdar Maulik R [16] et al. mapped triples
to medical text search to find matching sentences in reference text. Although knowledge
graph is widely used in knowledge search, the automatic acquisition of knowledge, the
automatic fusion of multisource knowledge, and the knowledge-oriented representation
learning in the power domain pose many challenges to improve the search accuracy.
To sum up, this paper proposes an intelligent search method based on Elasticsearch
combined with Knowledge Graph. This method selects the contents of the search para-
graphs in advance and greatly reduces the search scope. In the small range of paragraph
content, triples in the knowledge graph can match the problem in fine granularity, which
improves the precision of the accurate search of standard documents.

3 Intelligent Search Method Based on the Combination of Elasticsearch and Knowledge Graph

This chapter introduces the Intelligent Search Method based on the Combination of
Elasticsearch and Knowledge Graph (SEK) proposed to realize the intelligent and accu-
rate search of technical standard documents of electric power enterprises. This method

combines the Elasticsearch search method with the knowledge graph search method,
which can quickly and accurately locate the target content in the standard documents of
electric power enterprises. It can not only browse large-scale text data comprehensively
and quickly in the full text of standard documents of electric power enterprises, but also
make up for the low search precision of single target words by traditional methods, and
improve the precision and efficiency of accurate search of standard documents. The steps
for this method are described below.

Step 1: Building a Knowledge Graph. In the unit of paragraph content, the enterprise
technical standard document is split. Each paragraph’s content is indexed and parsed
into a knowledge graph representation. According to the corresponding content of the
technical standard documents of electric power enterprises, the word segmentation algo-
rithm of POS (Part-of-speech) [17] and part of speech tagging algorithm are firstly used
for word segmentation and tagging processing of the text content of each paragraph.
Secondly, a regular extraction method is used to extract numbers and units. Finally,
the dependency syntax analysis method is used to identify the dependency relations of
context relations, analyze and extract the data of different power grid business fields,
and build the power enterprise data knowledge graph with the extracted entities and relations.
The knowledge graph elements corresponding to the paragraph contents are composed of triples, as shown in Table 1. Each row represents a triple, and each triple consists of an entity, a relationship, and an entity or attribute value. For example, the triple with ID 2010000 is (scope, contains, excludes certain small and special transformers).
Step 2: Paragraph Segmentation. In this process, a power domain word divider converts each text into a series of words. The word segmentation consists of three components: the Character Filter, the Tokenizer, and the Token Filter. The Character Filter component removes the HTML tags from each triple, the Tokenizer component divides words according to the rules of the word splitter, and the Token Filter converts the segmented words to lowercase, deletes stop words, and adds synonyms. After being processed by the three components, the paragraph content is broken down into words.
For example, suppose the entity in a triple is “Excludes certain small and special transformers”. If the entity contains HTML markup, the Character Filter component removes it. The result of the Tokenizer component's word segmentation is the word sequence “Excludes certain small and special transformers”, and the result after processing by the Token Filter component is “excluding small special transformers”.
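As a hedged illustration of this three-component pipeline (not the paper's actual configuration), the sketch below registers a custom analyzer through the Python Elasticsearch client. The endpoint, index name, analyzer name, and synonym rule are assumptions, and a production setup for power-domain Chinese text would likely substitute a domain word splitter for the `standard` tokenizer.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # hypothetical endpoint

body = {
    "settings": {
        "analysis": {
            "filter": {
                "power_synonyms": {                        # Token Filter: synonyms
                    "type": "synonym",
                    "synonyms": ["excludes, excluding"],   # toy synonym rule
                }
            },
            "analyzer": {
                "power_analyzer": {
                    "type": "custom",
                    "char_filter": ["html_strip"],         # Character Filter
                    "tokenizer": "standard",               # Tokenizer
                    "filter": ["lowercase", "stop", "power_synonyms"],
                }
            },
        }
    },
    "mappings": {
        "properties": {"content": {"type": "text", "analyzer": "power_analyzer"}}
    },
}

es.indices.create(index="epetsd", body=body)
```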
Step 3: Build the Elasticsearch Inverted Index File. An inverted index is a mapping that looks up document IDs by words [18]. The corresponding paragraph content in an enterprise technical standard document can be found by searching keywords from the electric power field. The inverted index records four statistical indexes over the enterprise technical standard documents: the Doc id, the word frequency TF, the Position, and the Offset. For example, suppose the sentence after paragraph segmentation is “Excludes small special transformers”. The inverted index entry for the word “small” is shown in Table 2.

Table 1. Samples of triples of the knowledge graph.

Triple id Entity Relationship/Attribute Entity/Attribute value


2010000 Scope Contains Excludes certain small and
special transformers
2010013 There is no standard content This section may be applied in
transformer whole or in part when there is
no corresponding standard
time for certain types of
transformers (especially
special industrial transformers
where all winding voltages are
not higher than I 000 V). This
section does not cover the
requirement that transformers
be installed in public places
2010097 Auto-connected Content Series windings and common
windings windings in autotransformer

Table 2. Example of the word “small” inverted index composition.

Doc id Doc content TF Position Offset


1 Excludes small special transformers 1 1 <4,6>
2 Includes special small transformer 1 2 <5,7>
3 Small special transformer 1 0 <0,2>
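A toy sketch of such an inverted index, recording the four statistical indexes of Table 2 (Doc id, TF, Position, Offset), might look as follows; the data structures are our own illustration, not the Elasticsearch internals.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word to postings recording the doc id, term frequency,
    word positions, and character offsets, as in Table 2."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        postings, cursor = {}, 0
        for pos, word in enumerate(text.split()):
            start = text.index(word, cursor)   # character offset of this word
            cursor = start + len(word)
            entry = postings.setdefault(
                word, {"doc_id": doc_id, "tf": 0, "positions": [], "offsets": []})
            entry["tf"] += 1
            entry["positions"].append(pos)
            entry["offsets"].append((start, start + len(word)))
        for word, entry in postings.items():
            index[word].append(entry)
    return index

docs = {1: "excludes small special transformers",
        2: "includes special small transformer",
        3: "small special transformer"}
print(build_inverted_index(docs)["small"])
```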

Step 4: Search for Paragraph Content Based on Elasticsearch. The question is entered into the search service and segmented by paragraph segmentation. The word segmentation result is passed to the retrieval service to find the matching enterprise technical standard document IDs in the inverted index file. According to the similarity measurement method of term frequency-inverse document frequency (TF-IDF) [19], the relevance score of each word in the question is calculated, and the relevance score of the whole question is obtained by adding them together. Finally, all enterprise technical standard documents are sorted by their relevance from high to low.
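A hedged sketch of this step through the Python client is shown below, reusing the endpoint and `epetsd` index assumed earlier; the `content` field name is also an assumption. Note that recent Elasticsearch versions rank with BM25 by default, which refines the TF-IDF idea described here.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # hypothetical endpoint

# A match query analyzes the question, scores each indexed paragraph by
# relevance, and returns the hits sorted from high to low.
body = {"query": {"match": {"content": "radiator oil passage thickness"}},
        "size": 10}
hits = es.search(index="epetsd", body=body)
for hit in hits["hits"]["hits"]:
    print(hit["_id"], hit["_score"], hit["_source"]["content"][:60])
```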
Step 5: Accurate Search Based on the Knowledge Graph. To search the target answer more accurately, an accurate search is carried out within the range of the paragraph search results by combining the power enterprise data knowledge graph. After information extraction, entity linking, and knowledge fusion, the question is represented as a question triple. The question triple is matched exactly against the power enterprise data knowledge graph to achieve the accurate search of the target answer.

For example, consider the question “What should the thickness of the oil passage of the radiator be no less than?” and the paragraph “The thickness of the oil passage of the radiator should not be less than 9 mm. The distance between plates of the self-cooling type should not be less than 45 mm.” In the power enterprise data knowledge graph, this is expressed as (oil channel thickness, less than, 9 mm) and (distance between slices, less than, 45 mm). According to the CYPHER retrieval statement over the knowledge graph, the correct answer is 9 mm.
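The paper does not give the CYPHER statement itself, so the following is only a hedged sketch using the official neo4j Python driver; the node labels, relationship type, and property names are hypothetical.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))  # hypothetical

# Assumed property-graph layout: (:Entity {name})-[:REL {name}]->(value node).
cypher = """
MATCH (e:Entity {name: $entity})-[r:REL {name: $relation}]->(v)
RETURN v.name AS answer
"""

with driver.session() as session:
    record = session.run(cypher,
                         entity="oil channel thickness",
                         relation="less than").single()
    print(record["answer"])   # expected: 9 mm
driver.close()
```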

4 Experiment

In this chapter, we verify the effectiveness of our proposed method through experiments.
The experimental data set, experimental setting, experimental environment and exper-
imental evaluation are introduced. Finally, the experimental and analytical results are
given.

4.1 Data Sets

In the experiment, the proposed method is verified by electric power enterprises’ real-
world technical standard document data set. The Electric Power Enterprise Technical Standard Document (EPETSD) data set comprises 200 documents of national standards, enterprise standards, and rules and regulations of the electric power industry. Each
standard technical document of electric power enterprise contains several paragraphs,
and the sample data set after splitting each standard technical document of electric power
enterprise is shown in Table 3. Table 3 contains several paragraphs of a document, and
the corresponding split inverted index and knowledge graph triples.

4.2 Experimental Setup and Environment

Experimental Setup. All data sets are taken as validation data sets, and the proposed
method is run on the same data set with other comparison methods. The search results of
all the methods were compared with the real content, and the values of all the methods
in the evaluation indexes were calculated. The other comparison methods include the traditional Elasticsearch keyword search (ESKS) and the knowledge graph based semantic search (KGSS) [20].

To ensure the stability of the search results and avoid contingency in the results, we perform 10-fold cross-validation [21] for all the methods.

Experimental Environment. The programming environment is Python 3.7.2, Elasticsearch 7.13.13, and Neo4j v4.1.0; the Integrated Development Environment (IDE) is
Pycharm 2020. The computing configuration environment is Intel Core (TM) i5-6200U
CPU 3.10 GHz, 8 GB RAM, 500 GB hard disk space, and Window 10 64-bit operating
system.

Table 3. The contents of several paragraphs of a document and the corresponding split inverted
index and knowledge graph triples.

ID Content of the paragraph Inverted index Triple


1 For transformers and reactors Series: 2010014 (Transformers and reactors
with relevant standards, this Name: 1 GBT 1094.1-2013 with relevant standards,
section applies only to those contains, reactor
areas of reference which are (GB/T1094.6);)
explicitly mentioned in their
product standards. What’s
Included: 1.3.1 reactor
(GB/T1094.6);
2 Same as above Series: 2010015 (Transformers and reactors
Name: 1 GBT 1094.1-2013 with relevant standards,
contains, Dry-type
transformer (GB1094.11);)
3 A winding whose effective Series: 2010087 (Tapped winding, content, A
number of turns can be Name: 1 GBT 1094.1-2013 winding whose effective
changed step by step number of turns can be
changed step by step.)

4.3 Evaluation Indicators

Precision and recall were used to evaluate the effectiveness of the proposed method and to compare it with the other methods on the same indexes.
Precision is the ratio of the number of retrieved paragraphs containing the real answer to a question to the number of all retrieved paragraphs, averaged over all questions:

$$\mathrm{Precision} = \frac{1}{n} \sum_{i=1}^{n} \frac{T_i}{R_i} \tag{1}$$

where $T_i$ is the number of real answers retrieved for question $i$, $R_i$ is the number of all retrieved paragraphs, and $n$ is the number of questions.
Recall is the ratio of the number of real answers retrieved to the number of relevant paragraphs:

$$\mathrm{Recall} = \frac{1}{n} \sum_{i=1}^{n} \frac{T_i}{AN} \tag{2}$$

where $AN$ is the number of relevant paragraphs.
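A minimal sketch of these two averages is given below; reading AN as a per-question count of relevant paragraphs is our interpretation.

```python
def precision_recall(results):
    """results: one (T_i, R_i, AN_i) tuple per question,
    following Eqs. (1) and (2)."""
    n = len(results)
    precision = sum(t / r for t, r, _ in results) / n
    recall = sum(t / an for t, _, an in results) / n
    return precision, recall

# Toy example with three questions.
print(precision_recall([(2, 3, 2), (1, 2, 1), (3, 5, 4)]))
```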

4.4 Experimental Result and Analysis

We compare our proposed method SEK with ESKS and KGSS on the evaluation metrics. The experimental results are shown in Fig. 1 and indicate that the proposed method SEK achieves better results in terms of precision and recall. This is mainly because the combination of the Elasticsearch search and knowledge graph search methods selects the contents of the paragraphs in advance and narrows the search scope to a finer granularity. Within the small range of paragraph content, the triples in the knowledge graph can match the question at fine granularity, which improves the precision. Therefore, the method proposed in this paper effectively searches for answers to questions in the data set of technical standard documents of electric power enterprises.

Fig. 1. Comparison results between different methods.

5 Prototype System for Knowledge Search of Power Grid Dispatching System

This chapter introduces the functional logic structure and functional module description
of the prototype system for knowledge search of power grid dispatching system.
The functional logic structure of the system can be divided into three layers: text
parse layer, data storage layer, and application layer from bottom to top. In the text
parse layer, paragraph segmentation is carried out for the dispatching documents and
technical documents of the power grid enterprise. Then the Elasticsearch inverted index and the knowledge graph are constructed, and these data are output to the data storage layer. In the data storage layer, the Neo4j graph database and the MySQL relational database are used to store the knowledge graph and the Elasticsearch inverted index. When a question is entered, the system queries the data in the databases and outputs the relevant paragraph content IDs and knowledge graph triples to the stored content. Data from the stored content is then output to the application layer for display. The application layer includes the information retrieval, knowledge graph accurate question-and-answer, and professional recommendation functions. The functional logic structure of the system is shown in Fig. 2.
The information retrieval function of the application layer returns documents related to the user's search question. The accurate question-and-answer function of the knowledge graph returns the answer to the question more precisely within the scope of the relevant documents. The answer is a specific string or number, rather than the most relevant paragraphs.

Fig. 2. The functional logic structure of the system.

Fig. 3. The main interface of the prototype system for knowledge search of power grid dispatching
system.

The professional recommendation function recommends documents associated with the search question. It is important to note that a recommended document is not the same as the paragraph content returned by the information retrieval but is related to that paragraph content in some other way.
In Fig. 3, the question is what the thickness of the radiator oil passage should not be less than. The information retrieval returned three paragraphs, and the accurate question-and-answer of the knowledge graph returned 9 mm. The knowledge graph panel showed the
knowledge graph relationships of the radiator, and the professional recommendation function recommended relevant documents.

6 Conclusion and Future Work


This paper proposes an intelligent search method based on Elasticsearch and Knowledge
Graph and constructs a prototype knowledge search system for power grid dispatching
systems. In the experiment, the power enterprise technical standard document data set
is used for verification, and the proposed method is compared with other methods in
terms of evaluation indexes. The experimental results show that our proposed method
has higher accuracy in precision and recall, which proves that our proposed method is
effective.
In future work, we intend to construct a word segmentation dictionary for the electric power domain to reduce the segmentation errors of electrical professional nouns caused by the use of a universal word segmentation dictionary, and thus improve the accuracy of the proposed method.

References
1. Guo, R., Yang, Q., Liu, S.H., Li, W., Yuan, X., Huang, X.H.: Research and application on
construction of knowledge graph of power system fault handling. Power System Technol.
45(6), 2092–2100 (2021)
2. Qiao, J., Wang, X.Y., Min, R., et al.: Framework and key technologies of knowledge-graph-
based fault handling system in power grid. Proc. CSEE 40(18), 5837–5848 (2020)
3. Gao, H.X., Lu, M., Liu, J.N., et al.: Review on knowledge graph and its application in power
systems. Guangdong Electric Power 33(9), 66–76 (2020)
4. Quan, L.X., Ma, P.: Talk about a kind of intelligent search engine technology for big data
industry. Comput. Technol. Autom. 39(2), 170–176 (2020)
5. Yang, Q.: Research and practice of innovative data application based on elasticsearch in
Tianwan nuclear power plant. Power Big Data 24(2), 41–46 (2021)
6. Xue, T., Zhang, T.C., Zhuang, X.F., He, X.: Research and implementation of campus network
search engine based on scrapy framework and elasticsearch. Northeastern University, Editorial
Department of Control and Decision, 6 (2020)
7. Qin, J.C., Shen, H.L.: Research and implementation of elasticsearch based full-text search
platform on campus. Mod. Comput. Prof. Ed. 34, 96–100 (2018)
8. Aleksei, V., Aleksei, S., Shamil, M.I.: Big data processing for full-text search and visualization
with elasticsearch. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 8(5) (2017)
9. Prashant, K., Aishwarya, S., Komal, D., Manaswini, M., Meghna, M.: Search engine for
Ebook portal. Int. J. Sci. Technol. Res. 4(8) (2014)
10. Liu, Q., Li, Y., Duan, H., Liu, Y., Qin, Z.G.: A review of knowledge graph construction. J.
Comput. Res. Dev. 53(3), 582–600 (2016)
11. Xu, H., Hong, Q., Yao, X.M., Li, X.L., Lu, S.Y.: Semantic search algorithm based on
knowledge graph in smart grid. Lab Res. Explor. 40(4), 71–74+86 (2021)
12. Wang, Q., Wei, J., Yan, R.Z., Qian, X.D., Yang, B.: Application of knowledge graph in smart
grid. Electron. Compon. Inf. Technol. 4(1), 135–137+147 (2020)
13. Liu, J., et al.: Application and research of knowledge graph in electric power field. Electric
Power Inf. Commun. Technol. 18(1), 60–66 (2020)

14. Wang, M., Chen, W., Wang, S., Jiang, Y., Yao, L., Qi, G.: Efficient search over incomplete
knowledge graphs in binarized embedding space. Future Gener. Comput. Syst. 123, 24–34
(2021). https://doi.org/10.1016/j.future.2021.04.006
15. Xin, H., Duan, J., Dang, D.: Natural language question answering over knowledge graph: the
marriage of SPARQL query and keyword search. Knowl. Inf. Syst. 63(4), 819–844 (2021).
https://doi.org/10.1007/s10115-020-01534-4
16. Maulik, K., et al.: Text snippets to corroborate medical relations: an unsupervised approach
using a knowledge graph and embeddings. In: AMIA Joint Summits on Translational Science
proceedings 2020, vol. 5 (2020).
17. Mou, Q.L.: Research and Application of Deep Transfer Learning Algorithm for Text
Classification Problem. Northwest Normal University (2020)
18. Qu, Z.J., Fan, M.M., Zhou, R.L., Wang, H.L., Zhu, D.: Inverse index query technology of non-
primary row key for mass distribution network dispatching monitoring information. Power
Syst. Prot. Control 46(23), 162–168 (2018)
19. Fu, T.Y.: Application of Semantic Relevance in Patent Text. Jiangsu University of Science
and Technology (2020)
20. Xu, H., Hong, Q., Yao, X.M., Li, X.L., Lu, S.Y.: Semantic search algorithm based on
knowledge graph in smart grid. Lab. Res. Explor. 40(4), 71–74+86 (2021)
21. Liang, Z.C., Li, Z.W., Lai, K., Lin, Z.C., Li, T.G., Zhang, J.X.: Evaluation of general-
ization capability of predictive models using 10-fold cross validation and its R software
implementation. Chin. J. Hosp. Stat. 27(4), 289–292 (2020)
Longitudinal Guidance Control of Landing Signal Officer Based on Variable Universe Fuzzy Logic

Ming Zhao1,2 , Yang Liu1 , Hui Li1,2 , Yun Cao1(B) , Jian Xu1,2 , and Guilin Yao1,2
1 Harbin University of Commerce, Harbin 150028, China
hrbcu_lh@163.com
2 Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing,

Harbin 150028, China

Abstract. To ensure the landing safety of carrier-based aircraft, a guidance control system of the Landing Signal Officer (LSO) based on variable universe fuzzy logic is presented in this paper. After analyzing the influence factors during the longitudinal
landing process, the glideslope deviation and sink rate deviation are determined
as the safety factors. The LSO landing guidance control system characteristic and
structure are discussed. Considering the nonlinearity, complexity and fuzziness
of decision-making behavior, a variable universe fuzzy system is designed to realize the LSO prediction process. Simulation results show that the improved LSO guidance prediction model presented in this paper can simulate the actual decision-making characteristics of the LSO, and the output results of the system conform to the deviation correction effect in the real environment.

Keywords: Variable universe fuzzy controller · Longitudinal loop · Guidance control · Landing signal officer

1 Introduction
Landing Signal Officer (LSO) guidance and evaluation technology plays an important role in ensuring the landing safety of carrier-based aircraft [1–3]. Due to the complexity of the landing environment, the manual landing of aircraft needs LSO guidance and supervision. With the control of the LSO, the success rate of landing can be increased significantly and the incidence of accidents reduced [4, 5]. Taking the LSO as the research object, LSO guidance technology, wave-off decision technology, and evaluation technology are widely analyzed and researched at present [6–8].
In order to comprehensively analyze the influencing factors of carrier-based aircraft landing safety, after discussing the landing guidance characteristics of the LSO, the guidance prediction models of the longitudinal loop are established. Based on variable universe fuzzy logic, a coupling fuzzy guidance model is implemented according to the priority principle, and an LSO landing guidance control system is realized.


2 Analysis of Safety Factors During Longitudinal Loop Landing


In the longitudinal loop, two factors have an impact on safety: glideslope deviation and
sink rate deviation.

2.1 Glideslope Deviation


The glideslope deviation is the longitudinal real-time element of most concern to the LSO. By judging the glideslope position of the carrier-based aircraft, the flight state of the aircraft can be monitored at all times, and the ramp clearance can also be calculated.
As shown in Fig. 1, in the final landing stage, if the aircraft holds the correct glideslope, it can hook the arrester wire without LSO signals [7–12]. If the position of the aircraft is higher than the ideal landing trajectory, it may be unable to complete the hook action, and the wave-off maneuver would be adopted to escape for safety. Therefore, the LSO should not accept an aircraft with a tendency higher than the ideal track. When the aircraft's position falls below the desired path, it could strike the carrier or the sea. As a result, the LSO can never accept a landing tendency of the aircraft below the desired one [5].

(The figure depicts seven glideslope positions, from Very High through OK to Very Low, relative to the desired touchdown point on the carrier.)

Fig. 1. Glideslope position deviations

2.2 Sink Rate

The ideal landing longitudinal sink rate is constant. However, there is pitch, yaw and roll
motion during the landing process; the pilot sometimes changes the attitude by altering
the sink rate or throttle.
When the sink rate remains too low, the aircraft’s position could gradually higher than
the ideal glideslope, and the wave-off maneuver would be employed to ensure landing
safety. If the sink rate continues to be higher, the aircraft may be below the glideslope,
and there is a risk of falling into the sea or the ramp, which should be avoided.

3 Landing Guidance Control System of LSO


The operating principle of the LSO landing guidance system is as follows: during the landing process of the carrier-based aircraft, the glideslope deviation and sink rate deviation of the longitudinal loop and the centering deviation and drift rate deviation of the lateral loop are obtained from the current position and attitude of the aircraft. Both longitudinal and lateral guidance control systems are established to transmit LSO signals to guide the aircraft. The structure of the LSO landing guidance system is shown in Fig. 2.

LSO control loop

Predicion
information Coupling fuzzy Longitudinal fuzzy
guidance system guidance system

Aircraft control loop

Trajectory control loop Attitude control loop


Inner-loop Cockpit
Outer-loop Inner-loop controller
command
control control Airframe
Outer-loop technique technique Outer-loop
Engine state
command
Pilot vision sensitivity
APCS
Position Attitude

Fig. 2. The landing guidance system of LSO

4 Control Approach of Variable Universe Fuzzy Logic


4.1 Fuzzy Set Definition
Let $A_i$ $(i = 1, 2, \ldots, N)$ be fuzzy sets on the universe $U \subset R$, and let $\mu_{A_i}(x)$ be the fuzzy membership function corresponding to $A_i$.
Definition 1. If $\forall x \in U$, $\exists A_i$ s.t. $\mu_{A_i}(x) > 0$, then $\{A_i\}$ is a complete division of the universe $U \subset R$, and $A = \{A_i\}$ is a complete fuzzy set.
Definition 2. Let $A_1$ and $A_2$ be two arbitrary fuzzy sets on the universe $U \subset R$. If $hgh(A_1) > hgh(A_2)$, then $A_1 > A_2$, where $hgh(A_i) = \{x \in U \mid \mu_{A_i}(x) = \sup_{x' \in U} \mu_{A_i}(x')\}$.
Definition 3. For fuzzy sets $A_i$ of the universe $U \subset R$, if $\forall x_0 \in U$, $\exists A_i$ such that $A_i(x_0) = 1$ and $A_j(x_0) = 0$ $(j \neq i)$, then $A_i$ is said to be consistent in the universe $U \subset R$, and $A = \{A_i\}$ is a consistent fuzzy set.
868 M. Zhao et al.

4.2 Variable Universe Fuzzy Control Algorithm

The initial input universe of the controller at moment $k = 0$ is $U_i^0 = [-\eta C_{Ii}, C_{Ii}]$, the initial fuzzy sets on $U_i^0$ are $A_{ij}^0$ $(j = 1, 2, \ldots, m)$, and the initial output universe is $V^0 = [-\xi Y_I, Y_I]$.
At moment $k$, the center of $A_{ij}^k$ is $x_{ij}^k$, with $-\eta C_{Ii}^k = x_{i1}^k < x_{i2}^k < \cdots < x_{im_i}^k < C_{Ii}^k$, and the center of $B_j^k$ is $y_j^k$, with $-\xi Y_I^k = y_1^k < y_2^k < \cdots < y_q^k < Y_I^k$. Meanwhile, the relationship between $A_{ir_i}^k$ $(r_i = 1, 2, \ldots, n_i)$ and $A_{ij}^k$ $(j = 1, 2, \ldots, m)$ is shown in (3), and the input and output universes change to

$$U_i^k = [-\eta C_{Ii} \alpha_i(x_i^k), \; C_{Ii} \alpha_i(x_i^k)] \tag{1}$$

$$V^k = [-\xi Y_I \beta(y^k), \; Y_I \beta(y^k)] \tag{2}$$

Through the effect of the scaling factors, the universe shrinks as the variables decrease. A controller whose universe changes online in this way is called a variable universe fuzzy controller, and the change process is shown in Fig. 3 [13–17].

Fig. 3. Shrinkage and expansion of universe
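As a hedged sketch of Eqs. (1) and (2): the contraction-expansion factor is not specified in this paper, so the proportional-exponent form used below is only a common choice from the variable universe literature, not the authors' design.

```python
def alpha(x, C, tau=0.5, eps=1e-6):
    """Contraction-expansion factor; shrinks toward eps as |x| -> 0."""
    return (abs(x) / C) ** tau + eps

def scaled_universe(x, C, eta=1.0):
    """Variable universe U^k = [-eta * C * alpha(x), C * alpha(x)], Eq. (1)."""
    a = alpha(x, C)
    return (-eta * C * a, C * a)

C = 1.5                                 # initial glideslope deviation bound
for x in (1.5, 0.75, 0.15):
    print(x, scaled_universe(x, C))     # the universe shrinks as |x| falls
```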

From the perspective of applicability, a triangular fuzzy membership function that is standard, complete, and consistent is usually selected in engineering applications, as shown in (3).


$$A_{i1}^k(x_i^k) = \begin{cases} (x_i^k - x_{i2}^k)/(x_{i1}^k - x_{i2}^k), & x_{i1}^k \le x_i^k \le x_{i2}^k \\ 0, & \text{else} \end{cases}$$

$$A_{il}^k(x_i^k) = \begin{cases} (x_i^k - x_{i(l-1)}^k)/(x_{il}^k - x_{i(l-1)}^k), & x_{i(l-1)}^k \le x_i^k \le x_{il}^k \\ (x_i^k - x_{i(l+1)}^k)/(x_{il}^k - x_{i(l+1)}^k), & x_{il}^k \le x_i^k \le x_{i(l+1)}^k \\ 0, & \text{else} \end{cases} \quad (l = 2, 3, \ldots, n_i - 1)$$

$$A_{in_i}^k(x_i^k) = \begin{cases} (x_i^k - x_{i(n_i-1)}^k)/(x_{in_i}^k - x_{i(n_i-1)}^k), & x_{i(n_i-1)}^k \le x_i^k \le x_{in_i}^k \\ 0, & \text{else} \end{cases} \tag{3}$$
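A minimal sketch of the triangular partition of Eq. (3) is given below; the evenly spaced centers are our assumption for illustration.

```python
import numpy as np

def tri_membership(x, centers):
    """Membership degrees of x in the triangular partition of Eq. (3):
    each set peaks at its own center and falls linearly to zero at its
    neighbouring centers."""
    mu = np.zeros(len(centers))
    for l, c in enumerate(centers):
        left = centers[l - 1] if l > 0 else None
        right = centers[l + 1] if l < len(centers) - 1 else None
        if left is not None and left <= x <= c:
            mu[l] = (x - left) / (c - left)
        elif right is not None and c <= x <= right:
            mu[l] = (x - right) / (c - right)
    return mu

centers = np.linspace(-1.5, 1.5, 7)   # seven-scale partition, NB..PB
print(tri_membership(0.4, centers))   # two neighbouring sets fire, summing to 1
```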

5 Guidance Prediction System of LSO


5.1 Fuzzy Rule of Longitudinal Loop
The input values of the LSO longitudinal fuzzy inference system are glideslope deviation
and sink rate deviation, and the output signal is the longitudinal discrete instruction. The
fuzzy judgment language of the prediction controller adopts the seven-scale classifica-
tion, according to [2], the initial range of glideslope deviation is [−1.5, 1.5], and the
initial range of sink rate deviation is [−1, 1].
The design principles of the LSO longitudinal fuzzy guidance prediction system are
as follows:

(1) When the longitudinal input deviation is large, the output signal should reduce the deviation as soon as possible. In particular, when there is a large upward longitudinal deviation of the carrier-based aircraft, the LSO should send the “NB” instruction of the downward direction to the pilot.
(2) When the longitudinal input deviation is small, besides eliminating the deviation, the stability of the system should be maintained as far as possible to avoid excessive overshoot and oscillation.

Table 1 is the fuzzy control rule of the longitudinal loop.

5.2 Fuzzy Guidance System of Longitudinal Loop


A sinusoidal signal is input into the longitudinal fuzzy LSO decision-making system, and the simulation curve of the guidance instruction is obtained, as shown in Fig. 4. The simulation results show that when the glideslope deviation is PB and the sink rate is PB, the longitudinal output of the LSO is NB; when the glideslope deviation is NB and the sink rate is PB, the LSO longitudinal output is AZ. The output results satisfy the LSO longitudinal instruction requirements.
The input signals of the lateral loop are the centering deviation and the drift rate deviation, and the output signal is the discrete lateral instruction.

Table 1. Fuzzy control rule of the longitudinal loop.

Glideslope deviation    Sink rate
                        PB    PM    PS    AZ    NS    NM    NB
PB                      NB    NB    NM    NM    NM    NS    AZ
PM                      NB    NB    NM    NM    NM    AZ    AZ
PS                      NM    NM    NS    NS    AZ    AZ    PS
AZ                      NS    NS    NS    AZ    PS    PS    PM
NS                      NS    NS    AZ    PM    PM    PM    PM
NM                      AZ    AZ    PM    PM    PB    PB    PB
NB                      AZ    AZ    PM    PM    PB    PB    PB
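As a hedged sketch of how Table 1 could drive the guidance output (the paper does not list its inference code), the following applies Mamdani-style max-min inference over the seven linguistic terms and picks the strongest output term; the crisp selection of a single term is our simplification of the fuzzy output.

```python
TERMS = ["PB", "PM", "PS", "AZ", "NS", "NM", "NB"]

RULES = {  # RULES[glideslope][sink_rate] -> longitudinal instruction (Table 1)
    "PB": dict(zip(TERMS, ["NB", "NB", "NM", "NM", "NM", "NS", "AZ"])),
    "PM": dict(zip(TERMS, ["NB", "NB", "NM", "NM", "NM", "AZ", "AZ"])),
    "PS": dict(zip(TERMS, ["NM", "NM", "NS", "NS", "AZ", "AZ", "PS"])),
    "AZ": dict(zip(TERMS, ["NS", "NS", "NS", "AZ", "PS", "PS", "PM"])),
    "NS": dict(zip(TERMS, ["NS", "NS", "AZ", "PM", "PM", "PM", "PM"])),
    "NM": dict(zip(TERMS, ["AZ", "AZ", "PM", "PM", "PB", "PB", "PB"])),
    "NB": dict(zip(TERMS, ["AZ", "AZ", "PM", "PM", "PB", "PB", "PB"])),
}

def infer(mu_glide, mu_sink):
    """mu_glide, mu_sink: dicts of term -> membership degree in [0, 1],
    e.g. produced by the triangular partition sketched above."""
    fired = {}
    for g, mg in mu_glide.items():
        for s, ms in mu_sink.items():
            out = RULES[g][s]
            fired[out] = max(fired.get(out, 0.0), min(mg, ms))  # max-min
    return max(fired, key=fired.get)

print(infer({"PB": 0.8, "PM": 0.2}, {"PB": 1.0}))  # -> NB, per Table 1
```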

(a) Input signals of the longitudinal system: glideslope deviation and sink rate versus time t/s. (b) Output signals of the longitudinal system: the longitudinal instruction amplitude versus time t/s.

Fig. 4. Input/output curves of the longitudinal system



The initial range of the centering deviation is [−3.5, 3.5], and the initial range of the drift rate deviation is [−1, 1]. A sinusoidal signal with a period of 6.3 s for the centering deviation and a signal with a period of 3.2 s for the drift rate deviation are input into the lateral controller. The output curves are shown in Fig. 5.

(a) Input signals of the lateral system: centering deviation and drift rate versus time t/s. (b) Output signals of the lateral system: the lateral instruction amplitude versus time t/s.
Fig. 5. Input/output curves of lateral system

6 Conclusions
In the landing process of carrier-based aircraft, based on the characteristics of LSO deviation instructions, the glideslope deviation and sink rate deviation of the longitudinal loop are determined as the influencing factors of carrier-based aircraft landing safety. These deviations are input into the landing guidance model of the LSO, and an intelligent guidance prediction system based on variable universe fuzzy logic is established. Simulation results show that the guidance prediction model established by fuzzy logic conforms to the operating characteristics of the real LSO, and the output

results accord with the deviation correction effect in the real environment. In the meantime, establishing the system model also provides an effective solution for objects with uncertainty, nonlinearity, and task environment complexity, especially when the object is an individual.

Acknowledgments. This work is supported by the Natural Science Foundation of Heilongjiang Province of China (No. YQ2020G002), the University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (No. UNPYSCT-2020212), and the Science Foundation of Harbin Commerce University (No. 18XN064).

References
1. Rudowsky, T., Cook, S., Hynes, M.: Review of The Carrier Approach Criteria for Carrier-
Based Aircraft. Techinical Report NAWCADPAX/TR-2002/71, 52–58, 124–165, USA (2002)
2. Li, H., Jiang, H.T., Su, X.: Modeling landing signal officer instruction associated with
operation guide system. Int. J. Control Autom. 8(2), 373–382 (2015)
3. Wang, J., Wu, W., Jia, L.: Study and simulation analysis on wave-off capability of carrier-based
airplane. Aircr. Des. 30(4), 13–18 (2010)
4. Duan, Z., Wang, W., Geng, J., He, D.: Precision trajectory manual control technologies for
carrier-based aircraft approaching and landing. Acta Aeronaut. Astronaut. Sin. 40(4), 22328
(2019)
5. Zhu, Q., Yang, Z.: Dynamic recurrent fuzzy neural network-based adaptive sliding control
for longitudinal automatic carrier landing system. J. Intell. Fuzzy Syst. 37(1), 53–62 (2019)
6. Zuo, Z., Wang, L., Liu, H., Wang, Y.: Similarity for simulating automatic carrier landing
process of full-scale aircraft with scaled-model. Acta Aeronaut. Astronaut. Sin. 40(12), (2019)
7. Zhou, J., Jiang, J., Yu, C., Xiao, D.: Carrier aircraft dynamic inversion landing control based
on improved neural network. J. Harbin Eng. Univ. 39(10), 1649–1654 (2018)
8. Hess, R.: Simplified approach for modeling pilot pursuit control behaviour in multi-loop flight
control task. Inst. Mech. Eng. 220(2), 85–102 (2006)
9. Wang, L., Zhu, Q., Zhang, Z., Dong, R.: Modeling pilot behaviors based on discrete-time
series during carrier-based aircraft landing. J. Aircr. 53(6), 1922–1931 (2016)
10. Li, H.: Modeling landing signal officer instruction associated with operation guide system.
Int. J. Control Autom. 8(2), 373–382 (2016)
11. Shi, M., Cui, H., Qu, X.: Modeling landing signal officer for carrier approach. J. Beijing Univ.
Aeronaut. Astronaut. 32(2), 135–138 (2016)
12. Li, H.: Integrated evaluation technology of landing signal officer for carrier-based aircraft.
Int. J. Multimedia Ubiquit. Eng. 11(1), 169–178 (2016)
13. Shi, P., Xu, Z., Wang, S.: Variable universe adaptive fuzzy PID control of active suspension. Mech. Sci. Technol. Aerospace Eng. 38(5), 713–720 (2019)
14. Du, E., Wang, S., Chang, L.: Variable universe fuzzy controller design of missile mark
trajectory with feed-forward compensation. J. Acad. Armored Force Eng. 31(2), 84–89 (2017)
15. Yang, Z., Wang, H.: Maximum power point tracking for photovoltaic power system based on
asymmetric fuzzy control. Mech. Autom. 41(2), 153–156 (2012)
16. Liu, J., Zhang, Y.: Variable universe fuzzy PID control method in piezoelectric ceramic
precision displacement system. Autom. Instrum. 32(2), 45–49 (2017)
17. Li, D., Shi, Z., Li, Y.: Sufficient and necessary conditions for Boolean fuzzy systems as universal approximators. Inf. Sci. 178(2), 14–24 (2008)
Author Index

A G
Abdulghany, Nayera M., 796 Gao, Mengyu, 643
Alieksieiev, Volodymyr, 594 Geng, Wenli, 643
Gerguis, Ramy A., 796
Guo, Min, 152, 181, 214
B
Bai, Yu, 169 H
Bi, Huolong, 653 Han, Ping, 3, 61
Bieliatynskyi, Andrii, 826 Han, Tongtong, 152, 181, 214
Han, Wang, 70
Han, Xuena, 344, 441
C Han, Yong, 141
Cao, Yun, 482, 865 He, Kaiwen, 527
Heiden, Bernhard, 594
Chen, Ming, 537
Hong, Peng, 247, 321
Hu, Liangjin, 461
Hu, Wen, 537
D
Hu, Xueyong, 276, 854
Daizhi, Jin, 70
Huadong, Sun, 366, 392
Deyu, Chen, 761
Huang, Chongli, 15, 288
Dong, Xiaohong, 547, 567 Huang, Yao-yao, 661
Du, Yong, 247, 321
J
Jiang, Fan, 15, 288
E Jiang, Shu-bo, 88, 661
El-Telbany, Mohammed, 796 Jiang, Zaiyu, 276
Jianjun, Li, 786
Jin, Hao, 482
F Jin, Yining, 276, 844, 854
Fan, Jingtao, 257, 501 Jincheng, Su, 754
Fan, Ying, 267 Jing, Fan, 160
Fan, Zhipeng, 537 Jinpeng, Jiao, 98
Fang, Yang, 786 Junji, Liu, 51


K R
Kang, Chengwen, 703 Ren, Wanli, 78
Kewen, Liu, 51 Rong, Xin, 557
Kong, Leilei, 141
S
L Shen, Yang, 190
Lei, Shuying, 703 Song, Rong, 401
Li, Ang, 511 Song, Yi, 653
Li, Dengping, 734 Stepanchuk, Oleksandr, 826
Li, Hui, 234, 309, 441, 482, 865 Su, Fangyuan, 670, 775
Li, Jianjun, 557, 670, 775 Su, Xiaodong, 332, 379
Li, Jiao, 61 Sun, Huadong, 415
Li, Juan, 613 Sun, Jianming, 344
Li, Lin, 25 Sun, Le, 854
Li, Shizhou, 332, 356, 379 Sun, Ming, 461
Li, Yanlai, 106 Sun, Xu, 141
Li, Zeming, 234, 441 Sun, Yutong, 141
Liang, Hongyu, 332, 379
Lin, Hai, 844 T
Liu, Junjun, 670, 775 Tan, Xue, 234, 309, 441
Liu, Kewen, 106 Tang, Meng, 257, 501
Liu, Qiuming, 577 Tang, Zheqing, 247
Liu, Wei, 537 Tian, Linlin, 689
Liu, Xiangbin, 577 Tian, Shaoqing, 15, 288
Liu, Xing, 836 Tonino-Heiden, Bianca, 594
Liu, Yang, 482, 865
Liye, Sun, 805 W
Lu, Shangkun, 681 Wang, Chaoyu, 379
Wang, Hongxin, 309
Wang, Jiaqiu, 276, 461, 844, 854
M
Wang, Jiaying, 511
Ma, Jian, 415
Wang, Peng, 276
Ma, Ling, 301
Wang, Ruijie, 844
Ma, Zhiwei, 429
Wang, Shiyu, 247
Maged, Maha A., 796
Wang, Tao, 131
Miao, Xiufeng, 715
Wang, Wenjing, 152, 181, 214
Mou, Hao, 844
Wang, Xiaoyu, 35
Wang, Yamin, 471
N Wang, Yanchun, 461
Newagy, Fatma, 796 Wang, Yang, 321
Ning, Hui, 141 Wang, Yu, 78, 247, 321, 450
Ning, Shi Yong, 206 Wei, SiFan, 450
Ning, Shiyong, 152, 181, 214 Wei, Sifan, 78
Niu, Chaoqun, 703 Wei, Tianshi, 247
Wen, Yan, 527, 734
P Wu, Jinpeng, 344
Pan, Wei, 321 Wu, Shirui, 119, 332, 356
Pang, Shi-kun, 605
Peng, Ezhen, 744 X
Pengfei, Zhao, 366, 392 Xiao, Yuanhao, 567
Pylypenko, Oleksandr, 826 Xiaodi, Xu, 786
Xie, Hai, 836
Q Xie, Jie, 689
Qiu, Zeguo, 511 Xin, HaiTao, 491
Qu, Yi, 653, 836 Xin, Haitao, 744

Xu, Jian, 865 Yue, Xin, 78


Xu, Nan, 471 Yunxi, Xie, 632
Xu, Xiaoxia, 3 Yuxuan, Zhang, 632
Xu, Yaoqun, 257, 450, 501, 715
Z
Y Zaky, Youssef Y., 796
Yang, Caixia, 106 Zhai, Lili, 106
Yang, Fang, 557 Zhang, Genlin, 689
Yang, Guang, 527, 734 Zhang, Hui, 461
Yang, Huixin, 119, 356 Zhang, Jinping, 267
Yang, Hui-ying, 605 Zhang, Lizhi, 415
Yang, Jing, 225, 723 Zhang, Tian, 613
Yang, Xinhua, 854 Zhang, Yanrong, 401
Yang, Yu, 557, 670, 775 Zhang, Yu, 78, 450
Yanrong, Zhang, 632 Zhang, Yurong, 332, 379
Yao, Fengge, 25, 35, 623 Zhang, Yuru, 190, 482
Yao, Guilin, 119, 332, 356, 429, 865 Zhao, Ming, 482, 865
Ye, Jingming, 78 Zhao, Na, 623
Ying, Bai, 98 Zhao, Wei, 234, 441
Yingjing, Zhang, 366, 392 Zhao, Xiao, 206
Yong, Baishuo, 225, 723 Zhao, Xiao-Han, 88
Youssef, Ahmed, 796 Zhao, Zhijie, 415, 511
Youssef, George S., 796 Zheng, ChenYang, 547
Yu, Bin, 321 Zheng, Dequan, 225, 723
Yu, Hao, 491 Zhou, Bin, 471
Yu, Jintao, 169 Zhou, Wanting, 511
Yu, Yang, 786 Zhu, Haibo, 817
