Cheng Wang

Anti-Fraud Engineering for Digital Finance
Behavioral Modeling Paradigm

Cheng Wang
Department of Computer Science and Engineering
Tongji University
Shanghai, China

ISBN 978-981-99-5256-4 ISBN 978-981-99-5257-1 (eBook)


https://doi.org/10.1007/978-981-99-5257-1

Jointly published with Tongji University Press Co., Ltd.


The print edition is not for sale in China (Mainland). Customers from China (Mainland) please order the
print book from: Tongji University Press Co., Ltd.

© Tongji University Press 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publishers, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publishers nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publishers remain neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore

Paper in this product is recyclable.


Contents

1 Overview of Digital Finance Anti-fraud
1.1 Situation of Anti-fraud Engineering
1.2 Challenge of Anti-fraud Engineering
1.3 Strategies of Anti-fraud Engineering
1.4 Typical Application in Financial Scenarios
1.5 Outline of This Book
References

2 Vertical Association Modeling: Latent Interaction Modeling
2.1 Introduction to Vertical Association Modeling in Online Services
2.2 Related Work
2.2.1 Composite Behavioral Modeling
2.2.2 Customized Data Enhancement
2.3 Fine-Grained Co-occurrences for Behavior-Based Fraud Detection
2.3.1 Fraud Detection System Based in Online Payment Services
2.3.2 Experimental Evaluation
2.4 Conclusion
2.4.1 Behavior Enhancement
2.4.2 Future Work
References

3 Horizontal Association Modeling: Deep Relation Modeling
3.1 Introduction to Horizontal Association Modeling in Online Services
3.1.1 Behavior Prediction
3.1.2 Behavior Sequence Analysis
3.2 Related Work
3.2.1 Fraud Prediction by Account Risk Evaluation
3.2.2 Fraud Detection by Optimizing Window-Based Features
3.3 Historical Transaction Sequence for High-Risk Behavior Alert
3.3.1 Fraud Prediction System Based on Behavior Prediction
3.3.2 Experimental Evaluation
3.3.3 Enhanced Anti-fraud Scheme
3.4 Learning Automatic Windows for Sequence-Form Fraud Pattern
3.4.1 Fraud Detection System based on Behavior Sequence Analysis
3.4.2 Experimental Evaluation
3.5 Conclusion
3.5.1 Behavior Prediction
3.5.2 Behavior Analysis
3.5.3 Future Work
References

4 Explicable Integration Techniques: Relative Temporal Position Taxonomy
4.1 Concepts and Challenges
4.2 Main Technical Means of Anti-fraud Integration System
4.2.1 Anti-fraud Function Divisions
4.2.2 Module Integration Schemes
4.2.3 Explanation Methods
4.3 System Integration Architecture
4.3.1 Anti-fraud Function Modules
4.3.2 Center Control Module
4.3.3 Communication Architecture
4.4 Performance Analysis
4.4.1 Experimental Set-Up
4.4.2 Implementation
4.4.3 Evaluation of System Performance
4.4.4 Exemplification of CAeSaR’s Advantages
4.5 Discussion
4.5.1 Faithful Explanation
4.5.2 Online Learning
4.6 Conclusion
References

5 Multidimensional Behavior Fusion: Joint Probabilistic Generative Modeling
5.1 Online Identity Theft Detection Based on Multidimensional Behavioral Records
5.2 Overview of the Solution
5.3 Identity Theft Detection Solutions in Online Social Networks
5.3.1 Composite Behavioral Model
5.3.2 Identity Theft Detection Scheme
5.4 Evaluation and Analysis
5.4.1 Datasets
5.4.2 Experiment Settings
5.4.3 Performance Comparison
5.5 Literature Review
5.6 Conclusion
References

6 Knowledge Oriented Strategies: Dedicated Rule Engine
6.1 Online Anti-fraud Strategy Based on Semi-supervised Learning
6.2 Development and Present State
6.2.1 Anti-fraud in Online Services
6.2.2 Graph Neural Networks
6.2.3 Weak Supervision
6.3 Risk Prediction Measures in Online Lending Services
6.3.1 Preliminary
6.3.2 Graph-Oriented Snorkel
6.3.3 Heterogeneous Graph Neural Network
6.3.4 Loss Function
6.4 Risk Assessment and Analysis
6.4.1 Datasets and Evaluation Metrics
6.4.2 Baseline Methods
6.4.3 Implementation Details
6.4.4 Performance Comparison
6.4.5 Ablation Study
6.4.6 Parameter Sensitivity
6.5 Conclusion
References

7 Enhancing Association Utility: Dedicated Knowledge Graph
7.1 Gang Fraud Prediction System Based on Knowledge Graph
7.2 Related Work
7.3 Recovering-Mining-Clustering-Predicting Framework
7.3.1 Recovering Missing Associations
7.3.2 Mining Underlying Associations
7.3.3 Clustering and Predicting
7.4 Experimental Evaluation
7.4.1 Dataset Description and Experiment Settings
7.4.2 On Model Comparison
7.4.3 On Address Disambiguation
7.4.4 On Network Embedding
7.5 Conclusion
References

8 Associations Dynamic Evolution: Evolving Graph Transformer
8.1 Dynamic Fraud Detection Solution Based on Graph Transformer
8.2 Related Work
8.3 Fraud Detection in Online Lending Services
8.3.1 Preliminary
8.3.2 Graph Transformer
8.3.3 Evolving Graph Transformer
8.4 Experimental Evaluation
8.4.1 Datasets and Metrics
8.4.2 Baseline Methods
8.4.3 Implementation Details
8.4.4 Results for Node Classification
8.4.5 Results for Edge Classification
8.4.6 Ablation Study
8.4.7 Parameter Sensitivity
8.5 Conclusion
References

Chapter 1
Overview of Digital Finance Anti-fraud

1.1 Situation of Anti-fraud Engineering

The combination of digital technology and finance has given rise to new forms of
business. Under the trend of "Internet plus", financial technology start-ups,
innovative business models, and new solutions are emerging in many fields, e.g.,
third-party payment [11], online insurance [1], online lending [2], and innovative
services of traditional banks [3]. On the one hand, emerging digital financial
institutions are steadily penetrating traditional financial businesses. On the other
hand, traditional institutions are also entering digital finance in many ways.
With the support of digital technology, the development potential of financial
markets is gradually expanding, and the fraud risks in digital finance are escalating
with it [4–6]. Hidden risks keep growing as new frauds emerge in an endless stream.
From the perspective of platform fraud, platforms that default fraudulently account
for a huge proportion. From the perspective of personal fraud, digital financial
fraud, led by the Internet black market, is rampant and has penetrated every link of
the business [7], e.g., digital financial marketing, registration, lending, and
payment. Digital financial platforms that rely on data analysis to conduct financial
business are among the main targets of black market attacks. The high incidence of
fraud reduces consumers’ trust in digital financial services, so risk control in
digital finance is under growing pressure.

1.2 Challenge of Anti-fraud Engineering

Financial businesses have developed rapidly by taking root in digital technology.
Traditional financial businesses are steadily moving online, and financial fraud is
constantly evolving and becoming more complicated. Accordingly, fraud methods are
characterized by specialization, industrialization, concealment, and scenario-based
operation [8–11].

© Tongji University Press 2023
C. Wang, Anti-Fraud Engineering for Digital Finance,
https://doi.org/10.1007/978-981-99-5257-1_1

Fig. 1.1 Digital financial frauds in different fields

Specialization. In the context of digital finance, fraud has evolved from simple
account theft and card swiping into more complex and diverse forms that exploit big
data and other cutting-edge technologies. Criminals have moved from indiscriminate
fraud to precisely targeted fraud, and they layer complex and diverse tricks on top
of one another (as shown in Fig. 1.1), e.g., pyramid selling, part-time money making,
online purchase refund, financial management, and virtual currency. The variety of
fraud forms, together with the injection of new technologies such as digital finance
and blockchain, makes digital financial fraud more confusing and more difficult to
identify, so victims are often unable to guard against it.
Industrialization. Compared with traditional fraud, digital financial fraud is often
organized and large-scale. Criminals divide labor clearly and cooperate closely,
forming a complete criminal industry chain. This chain mainly includes four links:
development and production, wholesale and retail, fraud implementation, and money
laundering. These links can be further subdivided into 15 specific divisions (as
shown in Fig. 1.2): software development, hardware production, network hackers,
phishing retail, domain name dealers, personal letter wholesale, bank card dealers,
phone card dealers, ID card dealers, phone fraud, mass SMS sending, online promotion,
cash withdrawal, e-commerce platform shopping, and pornography, gambling, and drug
websites.
Concealment. The virtual nature of the Internet and digital technologies makes fraud
more covert, which is mainly reflected in three aspects. First, criminals tend to
commit crimes across different places, so financial fraud is becoming increasingly
mobile. Digital financial scams are not limited by space, and the members of a single
fraud gang may come from all over the country. Second, criminals tend to use multiple
small transfers to carry out fraud. Because digital finance is widely accessible and
its services reach down to lower-tier customers, most losses caused by a single fraud
are less than 1,500 dollars. Third, it is difficult to obtain evidence with
traditional solutions alone, because digital financial fraud often involves account
theft and identity fraud.

Fig. 1.2 Basic chain of digital financial fraud

Scenario. Most digital financial services are built on specific scenarios, and the
corresponding financial fraud is scenario-based as well. Taking online shopping as an
example, digital financial institutions rely on it to offer various financial
services such as consumer finance, supply chain finance, and return insurance. If
buyers and sellers collude and fabricate transactions, several kinds of fraud may
occur at once. The sellers inflate their trading volume and obtain a higher credit
limit for supply chain finance. The buyers may use consumer finance to cash out
through fake purchases. In addition, the two parties can stage returns to
fraudulently collect freight insurance payouts.
Traditional anti-fraud solutions face many challenges in the new situation, such
as single dimension, low efficiency, and limited scope [12–14].
Single dimension. Traditional anti-fraud methods are based on a single dimension of
data, so it is difficult for them to form a multi-dimensional user portrait. This
makes it hard to analyze customers’ behavioral preferences, solvency, payment
ability, and fraud tendency through user portraits. Take the credit investigation of
the People’s Bank of China as an example: limited by its single data source, there
are still a large number of "white" credit households in China (those without credit
card or other borrowing records). It is necessary to build a multi-dimensional credit
investigation system to reduce fraud risk.
Low efficiency. Traditional anti-fraud techniques require a lot of manual operation
and have high application costs. As the customer base of fintech business extends to
lower-tier customers, transactions become frequent, real-time, and large in volume.
Traditional anti-fraud methods are not effective at identifying small-amount,
high-frequency fraud, which makes it difficult for them to serve this expanding
customer base.

Limited scope. With the in-depth development of digital technology, financial fraud
is increasingly intertwined with other scenarios. Non-financial scenarios such as
online shopping and online games also carry financial fraud risks, which are
difficult to identify with traditional anti-fraud techniques.
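
To make the "small amount and high-frequency" pattern concrete, the following is a minimal sketch of a sliding-window velocity check. It is not a method taken from this book. The `Transfer` record, the function name `flag_bursts`, the one-hour window, and the count of five are all hypothetical. The 1,500-dollar cutoff simply reuses the single-fraud loss figure quoted above as an illustrative bound for a "small" transfer.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Transfer:
    account: str
    timestamp: float  # seconds since an arbitrary epoch
    amount: float     # dollars


def flag_bursts(transfers, max_amount=1500.0, window_s=3600.0, min_count=5):
    """Return accounts that made at least `min_count` small transfers
    within some window of `window_s` seconds (illustrative thresholds)."""
    recent = {}       # account -> timestamps of recent small transfers
    flagged = set()
    for t in sorted(transfers, key=lambda t: t.timestamp):
        if t.amount > max_amount:
            continue  # only small transfers count toward a burst
        q = recent.setdefault(t.account, deque())
        q.append(t.timestamp)
        # Drop timestamps that fell out of the window ending at this transfer.
        while t.timestamp - q[0] > window_s:
            q.popleft()
        if len(q) >= min_count:
            flagged.add(t.account)
    return flagged
```

A check like this inspects each transfer once, so it could in principle keep up with real-time traffic, but on its own it remains a single-dimension rule of the kind criticized above.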

1.3 Strategies of Anti-fraud Engineering

Nowadays, the fraud carried out by criminals is usually gang-based, industrialized,
and large-scale. Big data, artificial intelligence, and other cutting-edge
technologies are widely used to enhance fraud capability. The detection ability of
anti-fraud technology directly determines how effective anti-fraud efforts in digital
finance are. From the perspective of application and technology, digital financial
anti-fraud technology can be divided into data acquisition, data analysis,
decision-making engines, and other types.
Data acquisition obtains customer-related data from clients or networks. The use of
data acquisition technology should strictly follow laws, regulations, and regulatory
requirements, and user data should be collected only with user authorization. Data
acquisition technologies include device fingerprinting, web crawlers, biometrics,
location identification, liveness detection, and so on.
Data analysis refers to discovering knowledge from data. Machine learning is a data
analysis technology that achieves anti-fraud through model prediction: it trains
appropriate models on data and then uses the models for prediction. It includes
supervised, unsupervised, and semi-supervised learning.
Decision-making engine is the core of the digital anti-fraud system. A powerful
decision engine can effectively integrate various anti-fraud methods such as
reputation lists, expert rules, and anti-fraud models. It can also provide anti-fraud
personnel with an efficient and functional human-computer interaction interface,
greatly reducing anti-fraud operating costs and response time. A decision engine can
be judged along multiple dimensions, such as processing capacity, response speed, and
user interface.
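
As a rough illustration of how a decision engine might combine reputation lists, expert rules, and a model score, consider the sketch below. It is only one possible arrangement under stated assumptions. The function `decide`, the transaction fields `account` and `amount`, the evaluation order, and the 0.8 threshold are hypothetical and are not taken from this book.

```python
def decide(tx, blocklist, allowlist, rules, model, threshold=0.8):
    """Return a (decision, reason) pair for one transaction.

    tx        -- dict describing the transaction (hypothetical schema)
    blocklist -- accounts to reject outright (reputation list)
    allowlist -- accounts to accept outright (reputation list)
    rules     -- list of (name, predicate) expert rules
    model     -- callable returning a fraud score in [0, 1]
    """
    # 1. Reputation lists are the cheapest check, so they run first.
    if tx["account"] in blocklist:
        return ("reject", "blocklist")
    if tx["account"] in allowlist:
        return ("accept", "allowlist")
    # 2. Expert rules: the first matching rule sends the case to review.
    for name, predicate in rules:
        if predicate(tx):
            return ("review", "rule:" + name)
    # 3. The model score is consulted only if nothing above fired.
    score = model(tx)
    if score >= threshold:
        return ("review", "model:%.2f" % score)
    return ("accept", "default")
```

Returning the reason together with the decision is one simple way to give anti-fraud personnel something to act on in the interaction interface.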

1.4 Typical Application in Financial Scenarios

We mainly introduce two typical fraud scenarios and their security solutions on the
basis of summarizing the manifestations of digital financial fraud, i.e., association
mining solutions in online payment services and association enhancement solutions in
online lending services. For the fraud in each scenario, we introduce the anti-fraud
technology and its application cases, and analyze potentially available technologies.
Fig. 1.3 The fraud process of online payment using stolen accounts (figure labels:
Fraudster, Spread Trojan, Trading, Money transfer, Monetization, Black market,
Online mall)

Online payment. In the payment stage, black industry groups often use social
engineering to steal and exploit personal names, mobile phone numbers, ID card
numbers, bank card numbers, and other factors directly related to account security.
They mainly use fake Wi-Fi hotspots, malicious quick response (QR) codes, pirated
apps, and Trojan links to steal users’ private information. The collected key
information is then classified and stored in a database. Account information (such as
game accounts and financial accounts) is used for financial crimes and monetized
through the black industry chain. The user’s real identity information is resold or
used for fraudulent purchases.
For instance, a college student found that 50,000 dollars on his bank card had gone
“missing”. After repeated inquiries, he was informed that a new account had been
registered in his name on an e-commerce platform and had purchased up to 49,966
dollars of goods, none of which he had bought himself. Online payment fraud using
stolen accounts usually involves four steps (as shown in Fig. 1.3). Step 1: Spread
Trojan. The gang sent fake short messages with Trojan links through fake base
stations around the university town. After the student clicked the link, his username
and password were revealed. Step 2: Trading. Because it is difficult and risky to
steal from bank cards directly, fraudsters who have obtained all kinds of information
turn to cashing out by shopping in an online mall. Step 3: Money transfer. After
registering an account and binding the bank card, fraudsters buy high-value items
such as gold and mobile phones through the online mall, and they receive the goods by
intercepting incoming calls or setting up call forwarding. Step 4: Monetization. They
monetize the purchased goods through the stolen-goods sales network of the
underground black industry chain.
In this case, we can use behavior sequence, biological probe, and relationship
mapping technologies to predict risk in the early, middle, and late stages of the
payment process. Behavior sequence technology can find abnormal purchase records by
comparing them with the user’s previously recorded everyday shopping habits.
Biological probe technology can profile the user’s usage habits from indicators such
as pressing force, finger contact area, and screen sliding speed, and detect abnormal
use in online shopping. In addition, relationship mapping technology can estimate the
user’s credit through the user’s social relationships and evaluate the user’s demand
for the purchased goods.
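
The idea of comparing a new purchase with previously recorded shopping habits can be sketched very simply. The example below is a deliberately minimal stand-in, assuming that a habit is summarized only by past purchase amounts and that deviation is measured with a z-score. The function names and the threshold of 3 are hypothetical, and the behavior sequence models developed later in this book are considerably richer.

```python
from statistics import mean, pstdev


def amount_zscore(history, amount):
    """Number of standard deviations `amount` lies from the historical mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # All past purchases were identical: any different amount is extreme.
        return 0.0 if amount == mu else float("inf")
    return (amount - mu) / sigma


def is_abnormal(history, amount, z_threshold=3.0):
    """Flag a purchase whose amount deviates strongly from past habits."""
    return abs(amount_zscore(history, amount)) > z_threshold
```

With a history of small everyday purchases, a sudden purchase of tens of thousands of dollars, as in the student's case above, would fall far outside the threshold, while a purchase close to the usual amounts would not.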
Online lending. The main forms of fraud in online lending services are agency fraud,
gang crime, machine behavior, account theft, identity fraud, and serial transactions.
Among them, identity fraud is relatively common: the loan applicant forges personal
identity documents, property certificates, and other application materials.
Fraudsters may even obtain other people’s information by illegal means such as
deception and then impersonate them to cheat. Figure 1.4 shows the process of
identity fraud.

Fig. 1.4 The fraud process of online payment using stolen accounts (figure labels:
Fraudster, Get personal information, Apply for online loan)

In practice, a malicious agency may recruit college students for part-time jobs
through social software. It gives each student a mobile phone card and asks the
student to take the card to a bank and apply for a salary card. Using the bank card
and mobile phone number for registration, the agency can obtain the student’s ID
card, student status, education, and other information, and then apply for multiple
credit products from online loan platforms.
For fraud in online lending, we can adopt face recognition, user portrait, and other
technologies. Face recognition technology can identify whether the loan application
is initiated by the borrower himself or herself. Since some online loan platforms
have no video verification process, applications need to be further verified with
precise user portraits and other technologies. These platforms can depict the
personal characteristics of customers through text semantic analysis, user behavior
analysis, and terminal analysis throughout the whole online loan transaction. For
example, data analysis of behavior trajectories shows that normal customers stay for
a few seconds at each node of the application and take at least 5 min to complete the
entire loan application process, while fraudsters complete all the steps in less than
10 s.
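
The timing contrast described above (at least 5 min for normal customers versus under 10 s for fraudsters) suggests a simple behavior-trajectory check. The sketch below assumes the trajectory is available as a list of `(node_name, timestamp_in_seconds)` pairs. That representation and the function names are hypothetical. The 10 s default comes from the figure quoted in the text.

```python
def total_duration_s(events):
    """Elapsed seconds between the earliest and latest recorded node visits.

    events -- list of (node_name, timestamp_s) pairs (hypothetical format)
    """
    timestamps = [ts for _, ts in events]
    return max(timestamps) - min(timestamps)


def looks_scripted(events, min_total_s=10.0):
    """Flag an application whose whole flow finished implausibly fast."""
    # A single event carries no timing information, so it is never flagged.
    return len(events) >= 2 and total_duration_s(events) < min_total_s
```

For instance, a flow that passes through every node in 4 s would be flagged, whereas one that takes around 5 min would not. A real system would presumably combine this signal with the per-node dwell times and the other portrait features mentioned above.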

1.5 Outline of This Book

In this book, we will introduce some key technologies of anti-fraud engineering in
two representative application scenarios. The content structure of this book is shown
in Fig. 1.5. To begin with, we introduce behavior modeling solutions based on
different perspectives of associations.

• Latent interaction modeling via Vertical Associations (Chap. 2).
• Deep relation modeling via Horizontal Associations (Chap. 3).

Then, we will introduce model-level integration technologies built on the two
association modeling solutions to further improve detection performance.
[Figure 1.5: Behavior Association Modeling (Chapters 2, 3), comprising Vertical
Association Modeling and Horizontal Association Modeling (historical transaction
sequence, fine-grained co-occurrences, learning automatic windows), forms the base.
The remaining blocks cover explicable integration techniques, multidimensional
behavioral modeling, knowledge-oriented strategies, enhancing association utility,
and association dynamic evolution (Chapters 4 to 8).]

Fig. 1.5 Architecture of this book

• We design a novel three-way taxonomy of function division and integration techniques
to cope with complex and varied frauds (Chap. 4).
• We propose a joint (instead of fused) model to capture both online and offline
features of a user’s composite behavior (Chap. 5).

Meanwhile, we also explore some advanced technologies to customize and
improve the performance of our behavior association modeling in more
cases.

• We propose a dedicated graph-oriented framework to solve the scarcity of data
labels (Chap. 6).
• We introduce a knowledge graph to address the low-quality data problem by
enhancing the utility of associations (i.e., recovering missing associations and
mining underlying associations) (Chap. 7).
• We propose a technical framework for dynamic heterogeneous graphs, which can
realize effective fraud detection in the face of evolving behavior patterns (Chap. 8).

The key technologies and methods of anti-fraud engineering for digital finance
include the following aspects:

• Fine-grained co-occurrence modeling. The effectiveness of behavior-based methods
often depends heavily on the sufficiency of user behavioral data, so it is a big
challenge to build high-resolution behavioral models by using low-quality behavioral
data. We mainly address this problem through data enhancement for behavioral
modeling. We extract fine-grained co-occurrence relationships of transactional
attributes by using a knowledge graph. Furthermore, we adopt heterogeneous
network embedding to learn and better represent comprehensive relationships.
More details will be provided in Chap. 2.
• Historical transaction sequence modeling. Account theft is indeed predictable
based on users’ high-risk behaviors, without relying on the behaviors of thieves.
Accordingly, we propose an account risk prediction scheme to realize ex-ante
fraud detection. It takes in an account’s historical transaction sequence and outputs
its risk score. The risk score is then used as early evidence of whether a
new transaction is fraudulent, before the new transaction occurs.
More details will be provided in Chap. 3.
• Learning automatic windows technology. Since the most significant features of
fraudulent online payment transactions are exhibited in a sequential form, the
sliding time window is a widely recognized and effective tool for this problem.
However, the adaptive setting of the sliding time window is a big challenge, since
the transaction patterns in real-life application scenarios are often too elusive to
be captured. We pursue an adaptive learning approach to detect fraudulent online
payment transactions with automatic sliding time windows. We design an intelligent
window, called the learning automatic window. It utilizes learning automata to
learn the proper parameters of time windows and adjust them dynamically and regularly
according to the variation and oscillation of fraudulent transaction patterns.
More details will be provided in Chap. 3.
• Integration scheme based on three-way taxonomy of function division. The integration
of proper function modules is an effective way to further improve detection
performance by overcoming the inability of single-function methods to cope with
complex and varied frauds. However, a qualified integration is hard to achieve
under multiple demanding requirements, i.e., improving detection performance,
ensuring decision explainability, and limiting processing latency and computing
consumption. We propose a qualified integration system that can simultaneously
meet all of the above requirements. In particular, it can adaptively assign the most
effective decision strategy to each transaction through a devised
stacking-based multi-classification. More details will be provided in Chap. 4.
• Multidimensional behavioral modeling through joint probabilistic generative modeling.
We concentrate on how to bridge from coarse behavioral data to an
effective, quick-response, and robust behavioral model in online social networks
(OSNs), where users usually have composite behavioral records consisting of
multi-dimensional low-quality data, e.g., offline check-ins and online user-generated
content (UGC). As an insightful result, we validate that there is a complementary
effect among different dimensions of records for modeling users’ behavioral
patterns. To deeply exploit such a complementary effect, we propose a joint
(instead of fused) model to capture both online and offline features of a user’s
composite behavior. More details will be provided in Chap. 5.
• Knowledge graph-oriented framework. Most ongoing transactions have no
labels, since platforms cannot determine whether they are frauds until a certain
number of transactions have occurred. Traditional machine learning methods are not
good at dealing with this problem, though they have achieved qualified anti-fraud
performance in other Internet financial scenarios. To address this issue, we propose a
Snorkel-based Semi-Supervised GNN. We specially design an upgraded version
of the rule engines, called Graph-Oriented Snorkel, a graph-specific extension of
Snorkel (a widely used weakly supervised learning framework), which lets subject
matter experts design rules and resolves conflicts among them. More details will be
provided in Chap. 6.
• Enhancing association utility technology. Predicting online lending gang fraud is
challenging because it requires detecting evolving and increasingly impalpable fraud
patterns based on low-quality data, i.e., very preliminary and coarse applicant
information. The technical difficulty mainly stems from two factors: the extreme
deficiency of information associations and the weakness of data labels. In this work,
we mainly address the challenges by enhancing the utility of associations (i.e.,
recovering missing associations and mining underlying associations) on a knowledge
graph. Moreover, we propose an integrated framework that consists of
four steps, Recovering, Mining, Clustering, and Predicting, for efficiently predicting
gang fraud. More details will be provided in Chap. 7.
• Dynamic heterogeneous graph technology. We propose a technical framework for
dynamic heterogeneous graphs, which can realize effective fraud detection in the
face of evolving behavior patterns. More details will be provided in Chap. 8.

References

1. M.E. Haque, M.E. Tozal, IEEE Trans. Serv. Comput. 15(4), 2356 (2022)
2. W. Min, Z. Tang, M. Zhu, Y. Dai, Y. Wei, R. Zhang, in Proceedings of Workshop on
Misinformation and Misbehavior Mining on the Web, Marina Del Rey, CA (2018)
3. M.A. Ali, B. Arief, M. Emms, A.P.A. van Moorsel, IEEE Secur. Privacy 15(2), 78 (2017)
4. E. Bursztein, B. Benko, D. Margolis, T. Pietraszek, A. Archer, A. Aquino, A. Pitsillidis, S.
Savage, Proc. ACM IMC 2014, 347–358 (2014)
5. T.C. Pratt, K. Holtfreter, M.D. Reisig, J. Res. Crime Delinq. 47(3), 267 (2010)
6. Z. Li, J. Song, S. Hu, S. Ruan, L. Zhang, Z. Hu, J. Gao, in Proceedings IEEE ICDE 2019,
Macao, China (8–11 Apr 2019), pp. 1898–1903
7. Y. Zhang, Y. Fan, Y. Ye, L. Zhao, C. Shi, in Proceedings of the 28th ACM International
Conference on Information and Knowledge Management, CIKM 2019, Beijing, China (3–7
Nov 2019) ed. by W. Zhu, D. Tao, X. Cheng, P. Cui, E.A. Rundensteiner, D. Carmel, Q. He,
J.X. Yu (ACM, 2019), pp. 549–558. https://doi.org/10.1145/3357384.3357876
8. A.D. Pozzolo, G. Boracchi, O. Caelen, C. Alippi, G. Bontempi, IEEE Trans. Neural Netw.
Learn. Syst. 29(8), 3784 (2018)
9. B. Cao, M. Mao, S. Viidu, P.S. Yu, in Proceedings IEEE ICDM 2017, New Orleans, LA, USA
(18–21 Nov 2017), pp. 769–774
10. C. Wang, C. Wang, H. Zhu, J. Cui, IEEE Trans. Dependable Secur. Comput. 18(5), 2122 (2021)
11. C. Wang, H. Zhu, Representing fine-grained co-occurrences for behavior-based fraud detection
in online payment services. IEEE Trans. Dependable Secur. Comput. 19(1), 301–315 (2022)
12. C. Wang, H. Zhu, IEEE Trans. Inf. Forensics Secur. 17, 2703 (2022).
https://doi.org/10.1109/TIFS.2022.3191493
13. C. Wang, H. Zhu, R. Hu, R. Li, C. Jiang, IEEE Trans. Big Data 1–1 (2022).
https://doi.org/10.1109/TBDATA.2022.3172060
14. C. Wang, H. Zhu, B. Yang, IEEE Trans. Comput. Soc. Syst. 9(2), 428 (2022).
https://doi.org/10.1109/TCSS.2021.3092007
Chapter 2
Vertical Association Modeling: Latent
Interaction Modeling

2.1 Introduction to Vertical Association Modeling in Online Services

Online payment services have penetrated into people’s lives. The increased
convenience, though, comes with inherent security risks [1]. Cybercrime involving
online payment services is often diversified, specialized, industrialized, concealed,
scenario-specific, and cross-regional, which makes the security prevention and control
of online payment extremely challenging [2]. There is an urgent need for effective
and comprehensive online payment fraud detection.
The behavior-based method is recognized as an effective paradigm for online
payment fraud detection [3]. Generally, its advantages can be summarized as follows.
Firstly, behavior-based methods adopt a non-intrusive detection scheme, which
preserves the user experience because no user operation is required during detection.
Secondly, they change fraud detection from a one-time check to a continuous one and
can verify each transaction. Thirdly, even if the fraudster imitates the daily
operation habits of the victim, the fraudster must deviate from the user’s behavior to
gain the victim’s benefit, and this deviation can be detected by behavior-based methods.
Finally, behavior-based methods can be used cooperatively as a second line of
security, rather than replacing other types of detection methods.
The effectiveness of behavior-based methods often depends heavily on the sufficiency
of user behavioral data [4]. In practice, user behavioral data that
can be used for online payment fraud detection are often low-quality or restricted
due to the difficulty of data collection and user privacy requirements [5]. In short,
the main challenge here is to build a high-performance behavioral model by using
low-quality behavioral data. This challenging problem can naturally be approached
in two ways: data enhancement and model enhancement.
For behavioral model enhancement, a widely recognized way is to build models
from different aspects and integrate them appropriately. Models can be classified
by behavioral agent, since the agent is a critical factor of behavioral models.
According to the granularity of agents, behavioral models can be further divided
into individual-level models [6–9] and population-level models [10–13].

© Tongji University Press 2023
C. Wang, Anti-Fraud Engineering for Digital Finance,
https://doi.org/10.1007/978-981-99-5257-1_2

In this work, we focus on the other way, i.e., behavioral data enhancement. A
basic principle here is to deeply explore the relationships underlying the transaction
data. More fine-grained correlations can provide richer semantic
information for generating high-performance behavioral models. Existing studies on
data enhancement for behavioral modeling mainly focus on mining and modeling
the correlations (including co-occurrences) between behavioral features and labels
[14]. To further improve data enhancement, a natural idea is to investigate and utilize
more fine-grained correlations in behavioral data, e.g., those among behavioral
attributes.
As the main contribution of our work, we aim to effectively model the co-occurrences
among transactional attributes for high-performance behavioral models.
For this purpose, we propose to adopt the heterogeneous relation network, a special
form of the knowledge graph [15], to represent the co-occurrences effectively. Here,
a network node (i.e., an entity) corresponds to an attribute value in transactions,
and an edge corresponds to a heterogeneous association between different attribute
values. Although the relation network can express the data more appropriately, it
cannot by itself solve the data imperfection problem for behavioral modeling; that is,
it has no effect on enhancing the original low-quality data.
An effective data representation that preserves these comprehensive relationships can
act as an important means of relational data enhancement. To this end, we introduce
network representation learning (NRL), which effectively captures deep relationships
[16]. Deep relationships compensate for low-quality data in fraud detection and improve
the performance of fraud detection models. By calculating the similarity between
embedding vectors, more potential relationships can be inferred, which partly solves
the data imperfection problem. In addition to data enhancement, NRL shifts
traditional network analysis from manually defined features to automatically
learned features, which extract deep relationships from numerous transactions.
The final performance of behavioral modeling for online fraud detection directly
depends on the harmonious cooperation of data enhancement and model enhance-
ment. Different types of behavioral models need matching network embedding
schemes to achieve excellent performance. This is one of the significant techni-
cal problems in our work. We aim to investigate the appropriate network embedding
schemes for population-level models, individual-level models, and models with dif-
ferent generalized behavioral agents. More specifically, for population-level models,
we design a label-free heterogeneous network to reconstruct online transactions and
then feed the features generated in embedding space into the state-of-the-art clas-
sifiers based on machine learning to predict fraud risks; while, for individual-level
models, we turn to a label-aware heterogeneous network that distinguishes the relations
between attributes of fraudulent transactions, and further design multiple naive
individual-level models that match the representations generated from the label-
aware network. Furthermore, we combine the population-level and individual-level
models to realize the complementary effects by overcoming each other’s weaknesses.
The main contributions can be summarized as follows:

• We propose a novel and effective data enhancement scheme for behavioral modeling
by representing and mining more fine-grained attribute-level co-occurrences.
We adopt heterogeneous relation networks to represent the attribute-level
co-occurrences, and extract those relationships in depth by heterogeneous network
embedding algorithms.
• We devise a unified interface between network embedding algorithms and behav-
ioral models by customizing the preserved relationship networks according to the
classification of behavioral models.
• We implement the proposed methods on a real-world online banking payment
service scenario. It is validated that our methods significantly outperform the state-
of-the-art classifiers in terms of a set of representative metrics in online fraud
detection.

2.2 Related Work

With the rapid development of online payment services, fraud in online transactions
keeps emerging. Detecting fraud with behavioral models has become
a widely studied area and has attracted many researchers’ attention.

2.2.1 Composite Behavioral Modeling

In this part, we briefly review different behavior-based fraud detection methods
according to the types of behavioral agents [5, 17, 18].
Individual-Level Model. Many researchers concentrated on individual-level behav-
ioral models to detect abnormal behavior which is quite different from individual
historical behavior. These works paid attention to user behavior which was almost
impossible to forge at the terminal, or focused on user online business behavior which
had some different behavioral patterns from normal ones.
Vedran et al. [19] explored the complex interaction between social and geospa-
tial behavior and demonstrated that social behavior could be predicted with high
precision. Yin et al. [4] proposed a probabilistic generative model that combines
spatiotemporal data and semantic information to predict user behavior. Naini et al.
[7] studied the task of identifying the users by matching the histograms of their data in
the anonymous dataset with the histograms from the original dataset. Egele et al. [8]
proposed a behavior-based method to identify compromises of high-profile accounts.
Ruan et al. [3] conducted a study on online user behavior by collecting and analyzing
user clickstreams of a well-known OSN. Rzecki et al. [20] designed a data acquisition
system to analyze the execution of single-finger gestures on a mobile device screen
and indicated the best classification method for person recognition based on the
proposed surveys. Alzubaidi et al. [9] investigated the representative methods for user
authentication on smartphone devices, including seven
types of behavioral biometrics: handwaving, gait, touchscreen, keystroke,
voice, signature, and general profiling.
Population-Level Model. These works mainly detected anomalous behaviors at the
population-level that are strongly different from other behaviors, while they did not
consider that the individual-level coherence of user behavioral patterns can be uti-
lized to detect online identity thieves. Mazzawi et al. [10] presented a novel approach
for detecting malicious user activity in databases by checking user’s self-consistency
and global-consistency. Lee and Kim [21] proposed a suspicious URL detection sys-
tem to recognize user anomalous behaviors on Twitter. Cao et al. [11] designed
and implemented a malicious account detection system for detecting both fake and
compromised real user accounts. Zhou et al. [12] proposed an FRUI algorithm to
match users among multiple OSNs. Stringhini et al. [22] designed a system named
EVILCOHORT, which can detect malicious accounts on any online service with the
mapping between an online account and an IP address. Meng et al. [23] presented
a static sentence-level attention model for text-based speaker change detection by
formulating it as a matching problem of utterances before and after a certain decision
point. Rawat et al. [24] proposed three methodologies to cope with suspicious
and anomalous activities, such as continuous creation of fake user accounts, hacking
of accounts, and other illegitimate acts in social networks. VanDam et al. [25]
focused on studying compromised accounts on Twitter to understand who the hackers
were, what type of content they tweeted, and what features could help distinguish
between compromised tweets and normal tweets. They also showed that extra
meta-information could help improve the detection of compromised accounts.

2.2.2 Customized Data Enhancement

To enhance the representation of data in behavioral models, researchers have
focused on the deep relationships underlying the data. In the following, we summarize
the related literature.
Zhao et al. [26] proposed a semi-supervised network embedding model that adopts a
graph convolutional network and is capable of capturing both the local and global
structure of a protein-protein interaction network even when no information is
associated with each vertex. Li et al. [27] incorporated word semantic relations into
latent topic learning through word embedding, addressing the problem that the Dirichlet
Multinomial Mixture model does not have access to background knowledge when
modeling short texts. Baqueri et al. [28] presented a framework to model residents’
travel and activities outside the study area as part of the complete activity-travel
schedule by introducing external travel to address distorted travel patterns.
Chen et al. [29] proposed a collaborative and adversarial network (CAN), which
explicitly models the common features between two sentences for enhancing sentence
similarity modeling. Catolino et al. [30] devised and evaluated the performance
of a new change prediction model that further exploits developer-related factors (e.g.,
number of developers working on a class) as predictors of change-proneness of
classes. Liu et al. [31] proposed a novel method for disaggregating the coarse-scale
values of the group-level features in the nested data to overcome the limitation in
terms of their predictive performance, especially the difficulty in identifying poten-
tial cross-scale interactions between the local and group-level features when applied
to datasets with limited training examples.

2.3 Fine-Grained Co-occurrences for Behavior-Based Fraud Detection

2.3.1 Fraud Detection System Based in Online Payment Services

We focus on the fraud detection issue in a typical pattern of online payment services,
i.e., online B2C (Business-to-Customer) payment transactions. Here, to acquire the
victim’s money, fraudulent behavior usually differs from the victim’s daily behavior.
This is the fundamental assumption underlying the feasibility of behavior-based fraud
detection. Based on this assumption, the research community is committed to designing
behavioral models that effectively distinguish differences in behavioral patterns. The
main challenge of this problem is to build a high-quality behavioral model by using
low-quality behavioral data. Naturally, there are two corresponding
ways to solve this problem: data enhancement and model enhancement.
In this work, we aim at devising the corresponding data enhancement schemes
for the state-of-the-art behavior models that act as the well-recognized approaches
of model enhancement [14]. More specifically, to realize data enhancement for
behavioral modeling effectively, we adopt the relation graph and heterogeneous network
embedding techniques to represent and mine more fine-grained co-occurrences
among transactional attributes. Then, based on the enhanced data, the corresponding
behavioral models (or enhanced behavioral models) can be adopted to realize fraud
detection. Accordingly, as illustrated in Fig. 2.1, the whole flow of the data-driven
fraud detection system consists of three main parts: data representation, data
enhancement, and model enhancement.
Before describing the detailed methods, we summarize the relevant conceptions
and notations in Table 2.1 as preparations.
[Figure 2.1: three stages, Data Representation, Data Enhancement, and Model
Enhancement. History records of users’ B2C and C2C transactions form the native
network; heterogeneous network representation learning maps the derivative network
into a vector space; composite behavioral models (a population-level model and
individual-level models built from single individual models) label online B2C
transactions, after feature transformation, as fraudulent or normal.]

Fig. 2.1 Workflow of the fraud detection system

Table 2.1 Notations of parameters

Variable              Description
$T$                   The set of transaction history records
$T_1$                 The set of fraudulent transaction history records
$T_0$                 The set of normal transaction history records
$T^B$                 The set of B2C transaction history records
$T^C$                 The set of C2C transaction history records
$t_i$                 A transaction with unique identifier $i$
$attr_i^j$            The $j$-th attribute of the transaction with unique identifier $i$
$\phi^P(\cdot)$       The representation mapping function about the label-free network
$\phi^I(\cdot)$       The representation mapping function about the label-aware network
$\mathrm{sim}(X, Y)$  The similarity between vectors $X$ and $Y$
$I_u^a$               The set of all identifiers involving the agent $g_u^a$
$P_{g_u^a}$           The behavioral model with the agent $g_u^a$
$r^a(i)$              The judgment of the $a$-type agent model on the transaction with
                      unique identifier $i$

2.3.1.1 Data Representation

Online payment transaction records are usually relational data that consist of multiple
entities representing the attributes in transactions. We employ a relation graph,
which expresses the data in online payment services more appropriately, to losslessly
reconstruct the transaction record data, including B2C and C2C transactions.
Lossless Native Graph. Every attribute of a transaction is regarded as an entity. For
each transaction, we establish the relationships between each entity and its identifier,
e.g., the transaction number. Furthermore, we attach a label to each identifier to denote
whether this transaction is fraudulent or normal. According to this property, the set
of transactions, denoted by $T$, can be divided into two disjoint
subsets, i.e., the normal and fraudulent transaction sets, denoted by $T_0$ and $T_1$,
respectively. Since an entity may appear in different transactions, we use the
co-occurrence relationship to further connect the graphs formed by different
transactions. Naturally, we call this graph formed by relational data a native graph,
as illustrated in the left part of Fig. 2.2.

Fig. 2.2 An exemplary procedure from the native graph (left) to the derivative network
(right), where a B2C transaction contains 8 attributes and a C2C transaction contains
3 attributes

Note that the data reconstruction by relation graph merely acts as the initial-
ization of our data enhancement scheme, while it has no real effect on solving the
insufficiency of behavioral data. The so-called data insufficiency for behavioral mod-
eling means that, for a given behavioral agent, the existing data are not sufficient to
reflect the behavior pattern of this agent. For example, when some accounts with
low-frequency behavioral records are regarded as the behavioral agents, their exist-
ing behavioral data are possibly too sparse to effectively serve as a data basis for
behavioral modeling.
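As an illustration of the native graph described in this subsection, the following minimal sketch links every attribute value (entity) of a transaction to that transaction’s identifier and stores the fraud label on the identifier. The record layout with the keys "id", "label", and "attrs", as well as the attribute names, are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def build_native_graph(transactions):
    """Build the native graph as an adjacency map plus identifier labels.

    Each transaction is a dict with hypothetical keys:
      "id"    : unique transaction identifier
      "label" : 1 for fraudulent, 0 for normal
      "attrs" : mapping from attribute name to attribute value
    """
    adjacency = defaultdict(set)   # node -> set of neighbouring nodes
    labels = {}                    # identifier node -> 0/1 label
    for txn in transactions:
        id_node = ("txn_id", txn["id"])
        labels[id_node] = txn["label"]
        for name, value in txn["attrs"].items():
            entity = (name, value)          # one entity per attribute value
            adjacency[id_node].add(entity)
            adjacency[entity].add(id_node)
    return adjacency, labels


transactions = [
    {"id": "t1", "label": 0,
     "attrs": {"account": "A", "merchant": "M1", "ip": "1.1.1.1"}},
    {"id": "t2", "label": 1,
     "attrs": {"account": "A", "merchant": "M2", "ip": "2.2.2.2"}},
]
adjacency, labels = build_native_graph(transactions)
# The shared entity ("account", "A") connects the two transaction identifiers.
shared = adjacency[("account", "A")]
```

Because an entity is keyed by its attribute name and value, an entity that co-occurs in several transactions automatically connects their subgraphs, which is the co-occurrence link the text relies on.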

2.3.1.2 Data Enhancement

In this work, we utilize network embedding techniques [16] to realize the data
enhancement for behavioral modeling. Network embedding is outstanding in solving
graph related problems and effectively mines deep relationships. Then, the network
structure to be preserved should be determined before a network embedding operation
is launched. The network embedding that preserves the network structure of native
graph cannot directly help behavioral modeling for online payment fraud detection.
The reasons can be summarized as follows:
(1) Under the real-time requirement of online payment fraud detection, it is intolerable
to perform a network embedding operation for every new transaction due to the
response latency caused by the large computing overhead. Thus, the uniqueness of the
transaction number (i.e., identifier) directly rules out adopting network
embedding online.
(2) There is no need to embed the identifier, say the transaction number, into the
vector space, since it’s not a valid feature to represent user behavioral patterns. We
are interested in the co-occurrence relationships among different behavioral entities
rather than the relationship between a unique identifier and its entities.
Therefore, we need to generate a new derivative network of transaction attributes
based on the native graph, in preparation for the network embedding.
Customized Derivative Networks. In the data we collected, there are both B2C and
C2C transactions. The proportion of frauds in C2C transactions is negligible compared
with that in B2C transactions [32]. Moreover, the mechanism of C2C fraudulent
transactions is essentially different from that of B2C ones [33]. Thus, we limit the
scope of this work to online B2C fraudulent transaction detection. We utilize C2C
transactions as supplementary (not necessary) information for extracting the
relationships among behavioral agents of B2C transactions, i.e., account numbers, from
the native graph. Then, we adopt different methods to handle B2C and C2C transactions
in the native graph:
(1) For B2C transactions, we define two different vertices, say $u$ and $v$, that
originally connect to the same unique identifier as a vertex pair, and view it as an
edge $e = (u, v)$. For example, a B2C transaction with $m$ attributes has $m + 1$
vertices and $m$ edges in the native graph, while it correspondingly has $m$ vertices
and $m(m - 1)/2$ edges in the derivative network.


(2) For C2C transactions, we only choose a special attribute pair that has at least
one attribute appearing in B2C transaction records as vertices, e.g., the pair of account
number and account number, and use other attributes of their transactions to weight
the edges between the special attribute pair. We will analyze the impact of C2C
transactions on the model and show the gains from C2C transactions in Sect. 2.3.2.3.
We refer to such a denser network generated from the native graph as a derivative
network. An exemplary illustration is provided in Fig. 2.2.
The specific structure of derivative networks depends on the data requirements of
specific behavioral models. We have also tried different derivative network
structures. Moving from the complete graph toward the minimum connected graph, we tried
considering only the node pairs associated with account_number as edges. The results
turn out to be much poorer than those of the complete graph structure, which suggests
that a special attribute such as account_number does not necessarily play a decisive
role. So we adopt a complete graph structure including arbitrary node pairs in the
derivative network, and compute the similarity between all node pairs.
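The B2C rule above (drop the identifier and connect every pair of attribute values that shared it) can be sketched as follows. This is a minimal illustration, not the authors’ implementation: the record layout and attribute names are hypothetical, edges are weighted by a plain co-occurrence count as an assumption, and the C2C weighting scheme is left out because the text does not specify it.

```python
from collections import Counter
from itertools import combinations


def derivative_network(b2c_attr_records):
    """Return edge weights of the complete-graph derivative network.

    Each record maps attribute name -> value for one B2C transaction. The
    transaction identifier is dropped; every pair of attribute values that
    shared an identifier becomes an edge, weighted by co-occurrence count.
    """
    weights = Counter()
    for attrs in b2c_attr_records:
        nodes = sorted(attrs.items())            # stable order by attribute name
        for u, v in combinations(nodes, 2):      # m(m - 1)/2 pairs per transaction
            weights[(u, v)] += 1
    return weights


records = [
    {"account": "A", "merchant": "M1", "ip": "1.1.1.1", "hour": 9},
    {"account": "A", "merchant": "M1", "ip": "2.2.2.2", "hour": 23},
]
weights = derivative_network(records)
```

With $m = 4$ attributes, a single transaction contributes $4 \cdot 3 / 2 = 6$ edges, matching the $m(m - 1)/2$ count given in the text; pairs that recur across transactions accumulate weight instead of creating new edges.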
Heterogeneous Network Embedding. The specific vector spaces corresponding to
the derivative networks are learned by heterogeneous network embedding algorithms
[34–38]. For the behavioral models, we obtain the mapping functions from vertices
to vectors in specific vector spaces, denoted by $\phi(\cdot)$. To infer more potential
relationships, we calculate the metric $\mathrm{sim}(X, Y)$ as features for each
transaction, where the vectors $X$ and $Y$ stem from $\phi(\cdot)$.
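To make the feature construction concrete, the sketch below computes one similarity value per attribute pair of a transaction. Two points are assumptions rather than statements from the text: cosine similarity is used as $\mathrm{sim}(X, Y)$, and the mapping $\phi(\cdot)$ is represented by a plain dictionary `phi` from attribute values to already-learned vectors. Attribute values without an embedding contribute a similarity of 0.0, which is also a choice made only for this example.

```python
import math
from itertools import combinations


def cosine(x, y):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


def similarity_features(attrs, phi):
    """One sim(X, Y) value per attribute pair of a transaction.

    attrs : mapping attribute name -> value for the transaction
    phi   : hypothetical lookup from (name, value) to its embedding vector
    """
    nodes = sorted(attrs.items())                # fixed pair order across transactions
    features = []
    for u, v in combinations(nodes, 2):
        if u in phi and v in phi:
            features.append(cosine(phi[u], phi[v]))
        else:
            features.append(0.0)                 # assumption: unseen values score 0
    return features


phi = {
    ("account", "A"): [1.0, 0.0],
    ("merchant", "M1"): [1.0, 0.0],
    ("ip", "1.1.1.1"): [0.0, 1.0],
}
attrs = {"account": "A", "merchant": "M1", "ip": "1.1.1.1"}
features = similarity_features(attrs, phi)
```

Sorting the attributes by name keeps the feature positions aligned across transactions, so the resulting vectors can be passed to a downstream classifier.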

2.3.1.3 Model Enhancement

In this work, we classify user behavioral models into two kinds according to the
granularity of behavioral agents, i.e., the population-level model [13] and the
individual-level model [6]. Accordingly, we establish the population-level model and
the individual-level model based on the customized derivative network, respectively:
Population-Level Models. The population-level models identify fraud by detecting
population-level behavioral anomalies, e.g., behavioral outlier detection [39]
and misuse detection [40]. Classifiers based on behavioral data can act as this
type of model. To enhance data for them, we only need to refactor the data for
classifiers by preserving the co-occurrence frequency of behavioral attributes. To this
end, we generate a derivative network where the vertices are transaction attributes and
the weighted edges represent the co-occurrence frequency, taking no account of
transaction labels. We say such a derivative network is label-free. Transaction labels
only come into play in the training process of models. By embedding the label-free
network, we get the mapping relationship $\phi^P(\cdot)$. Then, we feed the features
based on $\phi^P(\cdot)$ into machine-learning-based classifiers [41].

Individual-Level Models. The individual-level models identify fraud by detecting
the behavioral anomalies of individuals. They are regarded as a promising
paradigm of fraud detection. Their efficacy heavily depends on the sufficiency of
behavioral data. To build the individual-level regular/normal behavioral models, we
need to represent the regularity and normality of transaction behavioral data. Thus,
we should take the labels into account when generating the derivative network. We
extract positive relationships generated from $T_0$ and negative relationships generated
from $T_1$. A positive relationship enhances the correlation between the agents
involved, while a negative relationship weakens the correlation. We say such a
derivative network is label-aware. By applying the network embedding method to
the label-aware network, we get the mapping relationship $\phi_I(\cdot)$. Further, we establish
the individual-level probabilistic models in view of $\phi_I(\cdot)$ [42].
Composite Behavioral Models. Learning from models of different aspects can lead to
more reliable results. We adopt a union approach to reconcile the judgments from
different individual models to improve reliability [43]. Between the population level
and the individual level, we utilize the intersection to integrate judgments. That is,
a transaction is determined to be fraudulent only if the judgments of both models are
fraudulent. Our fraud detection model thus consists of models at two levels that play
complementary roles.
After employing the composite behavioral models, an incoming B2C transaction can
be transformed into high-quality features based on the learned vectors, and further
be predicted as either fraudulent or normal.

2.3.1.4 Graph Representation of Transaction Records

First of all, our method needs to represent transactional data in the form of a
heterogeneous information network, and apply the attribute vectors to subsequent
tasks. These attribute vectors are obtained from heterogeneous network embedding
of the transactions. Next, we present the process of generating the heterogeneous
native graph.

Denote a set of transaction history records

$$T = T^B \cup T^C,$$

where $T^B$ and $T^C$ represent the sets of B2C and C2C transaction history records,
respectively. Let $t_i \in T$ denote a transaction, where $i$ is the unique identifier of $t_i$.
Transactions are characterized by a sequence of attributes. We denote the $j$th attribute
of a transaction $t_i$ as $attr_i^j$. Some attributes usually have continuous values, so we
need to discretize these values and then naturally build a native graph based on the
unique identifiers of transactions. We choose the values of attributes and the unique
identifiers as vertices of the native graph. The pair $(i, attr_i^j)$ appearing in a
transaction is defined as an edge in the native graph.
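As a minimal sketch of the construction above, the following Python builds a native graph from a few already discretized transactions. The record layout (a dict from identifier to attribute dict) and the name `build_native_graph` are assumptions made for illustration only.

```python
def build_native_graph(transactions):
    """Return (vertices, edges) of the native graph.

    `transactions` maps a unique identifier i to a dict
    {attribute_type: discretized_value}; this layout is hypothetical.
    """
    vertices = set()
    edges = set()
    for tid, attrs in transactions.items():
        id_vertex = ("id", tid)
        vertices.add(id_vertex)
        for attr_type, value in attrs.items():
            if value is None:
                # A missing value contributes no vertex and no edge.
                continue
            attr_vertex = (attr_type, value)
            # A value shared by several transactions is stored only once.
            vertices.add(attr_vertex)
            # Edge (i, attr_i^j) between the identifier and the attribute value.
            edges.add((id_vertex, attr_vertex))
    return vertices, edges


toy = {
    1: {"account_number": "A1", "merchant_number": "M7", "type": "pay"},
    2: {"account_number": "A1", "merchant_number": "M9", "type": "pay"},
}
vertices, edges = build_native_graph(toy)
```

In this toy input the two transactions share the account and type values, so the graph keeps a single vertex for each shared value while every identifier keeps its own edges.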
In this work, we execute our method on an online banking payment dataset where
a B2C transaction contains 8 attributes and a C2C transaction contains 3 attributes.
To reconstruct the data losslessly, we build a native graph as illustrated in the left of
Fig. 2.2. Here, an attribute value that appears in multiple transactions corresponds to
only one vertex in the native graph. Recall that we attach to each identifier a label (0 or
1) to divide all transaction history records into two disjoint subsets, i.e., the normal
and fraudulent transaction sets, denoted by $T_0$ and $T_1$, respectively.

2.3.1.5 Network Embedding

Derivative Network. A heterogeneous information network that reflects the impact
of transaction labels is what our model needs. We focus on treating the relationships
generated from normal transactions and fraudulent transactions unequally in
the population-level and individual-level models. For the population-level model, which
learns the difference between normal and fraudulent transactions from all transactions,
we only represent the transactions and leave the task of identifying labels to
the model. For the individual-level model, which establishes user behavioral patterns
from normal transactions, it is necessary to embody the label of a transaction in the
derivative network. To this end, we set two hyperparameters, $\beta$ and $\gamma$, for fraudulent
transactions to distinguish them from other transactions, and formulate the weight
$w_e$ of an edge $e$ as:

$$w_e(\beta, \gamma) = \sum_{e \to T_0} \varepsilon_e + (-\gamma)^{\beta} \cdot \sum_{e \to T_1} \varepsilon_e, \tag{2.1}$$

where the operator $\to$ means that a given edge in a derivative network corresponds
to the relation between two attributes of the transactions in a specific transaction set,
and $\varepsilon_e > 0$ is the primary weight of edge $e$ depending on its type. When $e \to T_0$,
the contribution to the weight of $e$ is equal to $\varepsilon_e$; when $e \to T_1$, the contribution
is equal to $(-\gamma)^{\beta} \cdot \varepsilon_e$, with $\gamma \ge 0$ and $\beta = 0$ or $1$ cooperatively acting as the
adjustment coefficient.
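To make Eq. (2.1) concrete, here is a small Python sketch that evaluates the weight of one edge from its occurrence counts in the normal and fraudulent sets. It assumes every occurrence of the edge carries the same primary weight; the function name and argument names are hypothetical.

```python
def edge_weight(n_normal, n_fraud, eps=1.0, beta=0, gamma=1.0):
    """Weight of one edge following Eq. (2.1).

    n_normal: occurrences of the relation in T_0
    n_fraud:  occurrences of the relation in T_1
    eps:      primary weight of the edge (assumed constant per occurrence)
    """
    if beta not in (0, 1):
        raise ValueError("beta must be 0 or 1")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    # (-gamma) ** beta: 1 when beta = 0, -gamma when beta = 1.
    factor = 1.0 if beta == 0 else -gamma
    return n_normal * eps + factor * n_fraud * eps


# beta = 0 treats both sets equally, beta = 1 subtracts fraudulent co-occurrences.
equal_treatment = edge_weight(5, 2, beta=0, gamma=1.0)
label_sensitive = edge_weight(5, 2, beta=1, gamma=1.0)
```

With $\beta = 1$ an edge that is seen more often in fraudulent than in normal transactions ends up with a negative weight.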
A larger weight of an edge indicates that its two vertices (corresponding to two
transaction attributes) are more closely related. In this work, we simply divide the
edges into two kinds according to whether or not the edges are directly relevant
to account numbers. We set the primary weights of the latter kind of edges as a
proportion of those of the former kind. For example, we set this proportion to
0.55; its adjustment procedure will be introduced later in Sect. 2.3.2.2.
We follow two principles in the process of constructing derivative networks. The
principle of relationship extraction, as in Eq. (2.1), is that the more co-occurrences in
$T_0$, the greater the weights of edges. The other principle is to remove the vertices
that correspond to transactional unique identifiers in the native graph. For the
transactions in $T^C$, we retain in the derivative network the account number attributes
that also appear in $T^B$. For $e \to T^C$, $e$ merely contains one type, (account number,
account number), and the other attributes are defined as the influence factors of the
edge's weight. In the B2C scenario, we retain all attributes except the unique identifier,
define two different vertices that connect to the same unique identifier in the
native graph as a vertex pair, and view the pair as an edge in the derivative network.
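The B2C part of this construction can be sketched as follows: drop the identifier vertex and count, separately for normal and fraudulent transactions, how often each pair of attribute values shares a transaction. The input layout and the function name are assumptions for illustration; C2C edges and primary weights are left out.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(transactions):
    """Count attribute-value pairs per label.

    `transactions` is an iterable of (label, attrs), with label 0 (normal)
    or 1 (fraudulent) and attrs a dict {attribute_type: value}.
    """
    counts = {0: Counter(), 1: Counter()}
    for label, attrs in transactions:
        # Sort by attribute type so each pair has one canonical orientation.
        present = sorted((k, v) for k, v in attrs.items() if v is not None)
        # Every two attribute values of one transaction form a vertex pair.
        for u, v in combinations(present, 2):
            counts[label][(u, v)] += 1
    return counts


toy = [
    (0, {"account_number": "A1", "merchant_number": "M7", "type": "pay"}),
    (0, {"account_number": "A1", "merchant_number": "M7", "type": "refund"}),
    (1, {"account_number": "A1", "merchant_number": "M9", "type": "pay"}),
]
counts = cooccurrence_counts(toy)
```

The two counters correspond to the sums over $T_0$ and $T_1$ in Eq. (2.1) before primary weights and the adjustment coefficient are applied.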
In the above description, we find that the weights of edges become markedly
disproportionate due to the summation over large datasets. For instance, the weight
of an edge between an account number and a transactional time is small when there
are very few transactions related to the account number in the dataset. But the weight
of an edge between the transactional type and the transactional time is tremendous
because it can appear in transactions with various account numbers. This huge gap is
not conducive to reflecting the real relationships between different vertices. We
introduce a mapping function to smooth the gap in the weights of edges, which maps
the weight $w_e$ to the interval $[0, 1]$, that is,

$$S(w_e) = \frac{1}{1 + \exp(-\ln(\alpha \times w_e) + \theta)}, \tag{2.2}$$

where the parameters $\alpha$ and $\theta$ adjust the weight and control how
fast the gap reduces. The parameter $\alpha$ controls the changing degree of weights; the
parameter $\theta$ also controls the changing degree of weights, but it plays an important
role when $w_e$ is relatively large. We set $\alpha$ to a low value to ensure that the
ratio of two edges' weights becomes smaller, and set $\theta$ to a high value to ensure
that the ratio of two edges' weights stays as constant as possible when $w_e$ is relatively
small.
In the dataset adopted in our work, we set $\alpha$ to 1.8 and $\theta$ to 5; their adjustment
procedures will be introduced later in Sect. 2.3.2.2. This strategy encourages the gap
to reduce moderately when the weight $w_e$ is tremendous and to change as little
as possible when it is small.
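The behavior of Eq. (2.2) can be checked numerically with a short Python sketch. The defaults below are the values reported in the text (1.8 and 5); the function name is hypothetical.

```python
import math


def smooth_weight(w, alpha=1.8, theta=5.0):
    """Smoothing function S(w) of Eq. (2.2)."""
    if w <= 0:
        # Eq. (2.2) takes ln(alpha * w), so the weight must be positive.
        raise ValueError("weight must be positive")
    return 1.0 / (1.0 + math.exp(-math.log(alpha * w) + theta))


# Ratio between a huge and a moderate weight, before and after smoothing.
raw_ratio = 1e6 / 10
smoothed_ratio = smooth_weight(1e6) / smooth_weight(10)

# Ratio between two small weights, which should stay close to the raw ratio of 2.
small_ratio = smooth_weight(2) / smooth_weight(1)
```

Algebraically, $S(w_e) = \alpha w_e / (\alpha w_e + e^{\theta})$, which is nearly linear in $w_e$ while $\alpha w_e \ll e^{\theta}$ and saturates toward 1 for very large weights.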
Heterogeneous Network Embedding. Heterogeneous network embedding is a specific
kind of network embedding. To transform networks from network structure to
vector space, the commonly used models mainly include random walk [34], matrix
factorization [16], and deep neural networks [37].
We use a well-recognized heterogeneous network embedding algorithm called
HIN2Vec [35] to represent the derivative networks. Compared with other similar
algorithms, HIN2Vec distinguishes the different relationships among vertices, and
treats them differently by learning the relationship vectors together. Besides, it does
not rely on artificially defined meta-paths. The parameter settings in HIN2Vec affect
the representation learning and application performance. We explain some main
parameters in Table 2.2, which shows the parameters of our experiments for reference.
Note that the settings are related to the size of the input network. A small
dimensionality is not sufficient to capture the information embedded in relationships
among nodes, but a large value may introduce noise and cause overfitting. A larger
network might need a larger dimensionality to capture the information embedded.
The number and length of random walks determine the amount of sample data, that
is, the greater the values, the more the sample data. Generally, the performance
continues to improve and then converges when the values are large enough. Though a
large length of meta-paths may not affect the performance significantly, it is still
helpful in capturing high-hop relationships. The negative sampling rate determines
the proportion of negative samples in representation learning.

Table 2.2 Main parameters

Attribute                 Value   Explanation
Dimensionality            128     Dimensionality of node vectors
Number of random walks    160     Number of random walks starting from each node
Length of random walks    10      Max length of each random walk
Length of meta-paths      5       Max window length of context
Negative sampling rate    5       Number of examples for negative sampling
Initial learning rate     0.025   Initial learning rate in stochastic gradient descent

2.3.1.6 Fraud Detection Models

Fraud Detection in Population-Level Model. A heterogeneous information network
that fully reflects all transactions contains all the edges and vertices that have
appeared in the native graph. We treat the co-occurrence relationships generated from
$T_0$ or $T_1$ equally by setting $\gamma = 1$ and $\beta = 0$, that is, independently of the
transactional label. In the population-level model, we need to draw lessons from
fraudulent transactions, and learn the manifestations of fraudulent and normal
transactions by advanced classifiers. So we select the label-free network as the input
of the heterogeneous network embedding method. Then we get a mapping function
$\phi_P(\cdot)$, which gives the vector representation of attributes in transactions. For an
attribute $attr_i^j$, $\phi_P(attr_i^j)$ is its representation in the vector space learned from
the label-free network.
In the simplest case, we replace each attribute in a transaction with its vector
representation. Then a transaction with $m$ attributes is represented as a matrix of size
$d \times m$, where $d$ is the dimension of the vector representation. But we observe
that this solution does not work well and takes up plenty of computing and storage
resources. What we need are features that can summarize a group of transactions,
so the features should be shared among similar transactions. To this end, we
choose to calculate the similarity of any two vector representations as new features
based on the above matrix. Specifically, we get $m(m-1)/2$ similarities
to represent a transaction record. The procedure of computing the similarity is
formalized as follows: Given a transaction with $m$ attributes and unique identifier $i$,
its attributes are represented by $\phi_P(attr_i^1), \phi_P(attr_i^2), \cdots, \phi_P(attr_i^m)$, where
$\phi_P(\cdot)$ is a $d$-dimensional vector. For $\phi_P(attr_i^j), \phi_P(attr_i^k)$ we have

$$\mathrm{sim}(X, Y) = \frac{\sum_{s=1}^{m} (x_s \times y_s)}{\sqrt{\sum_{s=1}^{m} x_s^2} \times \sqrt{\sum_{s=1}^{m} y_s^2}} \tag{2.3}$$

by using the cosine similarity, where $X, Y$ respectively represent $\phi_P(attr_i^j), \phi_P(attr_i^k)$
and $x_s, y_s$ respectively represent the values on the $s$-th dimension of the
vectors $X, Y$. The cosine similarity pays more attention to the difference between
two vectors in direction and is not sensitive to their numerical values. The
population-level model fits well with the cosine similarity since it focuses on the
tendency of most individuals. To better represent a transaction, we also calculate the
average and variance of its similarities. We denote by $\mathrm{sim\_avg}(i)$ and $\mathrm{sim\_var}(i)$
the average and variance of the similarities of a transaction with unique identifier $i$.
For a transaction without missing values, they are calculated as follows:

$$\mathrm{sim\_avg}(i) = \frac{2}{m(m-1)} \sum_{j=1}^{m-1} \sum_{k=i+1}^{m} \mathrm{sim}(\phi_P(attr_i^j), \phi_P(attr_i^k)),$$

$$\mathrm{sim\_var}(i) = \frac{2}{m(m-1)} \sum_{j=1}^{m-1} \sum_{k=i+1}^{m} v(i, j, k), \tag{2.4}$$

where

$$v(i, j, k) = \left( \mathrm{sim}(\phi_P(attr_i^j), \phi_P(attr_i^k)) - \mathrm{sim\_avg}(i) \right)^2.$$

In reality, a transaction may have some missing values; we then treat the similarities
that involve those attributes as missing values as well. When calculating the average
and variance, we do not consider the items corresponding to those missing values. In
this work, we use the cosine similarities between vectors, together with their average
and variance, as new features. All the new features can be quickly calculated, thus
ensuring that our model can easily complete feature transformation based on network
embedding.
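The feature transformation described above can be sketched in plain Python as follows. The sketch assumes the attribute vectors have already been learned and are non-zero, marks a missing attribute with `None`, and uses hypothetical function names.

```python
import math
from itertools import combinations


def cosine(x, y):
    """Cosine similarity of two equal-length, non-zero vectors (Eq. 2.3)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm


def similarity_features(vectors):
    """Pairwise similarities plus their average and variance.

    `vectors` holds one embedding per attribute, or None when the
    attribute value is missing; pairs touching None stay None and are
    skipped in the statistics.
    """
    sims = [
        None if x is None or y is None else cosine(x, y)
        for x, y in combinations(vectors, 2)
    ]
    present = [s for s in sims if s is not None]
    if not present:
        raise ValueError("at least two attributes must be present")
    avg = sum(present) / len(present)
    var = sum((s - avg) ** 2 for s in present) / len(present)
    return sims, avg, var


sims, avg, var = similarity_features([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sims_m, avg_m, var_m = similarity_features([[1.0, 0.0], None, [1.0, 1.0]])
```

A transaction with $m$ attributes therefore yields $m(m-1)/2 + 2$ features, regardless of the embedding dimension.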
In the real online payment scenario, we divide training samples and testing samples
in time order to avoid time-crossing problems [44]. Time-crossing means using
information that is not yet available at the time a transaction is tested. We use all
the data from the training samples to build a label-free network, and get the mapping
function $\phi_P(\cdot)$ by heterogeneous network embedding. Then we complete feature
transformation on all data, i.e., training and testing samples, based on the mapping
function $\phi_P(\cdot)$. We get the population-level model by fitting existing classifiers,
e.g., XGBoost, on the training samples. For an incoming transaction or a testing
sample, we input it into the population-level model after feature engineering, and
make a discriminant prediction to obtain the probability of fraud in the transaction.
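A time-ordered split of this kind can be sketched as below. The record layout with a `time` field and the cutoff date are assumptions for illustration; the point is only that no record at or after the cutoff reaches the training side.

```python
from datetime import datetime


def split_by_time(records, cutoff):
    """Split records into (train, test) strictly by timestamp.

    Records before `cutoff` go to training, the rest to testing, so the
    embedding and the classifier never see information from the test period.
    """
    train = [r for r in records if r["time"] < cutoff]
    test = [r for r in records if r["time"] >= cutoff]
    return train, test


records = [
    {"id": 1, "time": datetime(2017, 4, 12, 9, 30)},
    {"id": 2, "time": datetime(2017, 5, 31, 23, 59)},
    {"id": 3, "time": datetime(2017, 6, 1, 0, 0)},
    {"id": 4, "time": datetime(2017, 6, 20, 14, 5)},
]
train, test = split_by_time(records, cutoff=datetime(2017, 6, 1))
```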
Fraud Detection in Individual-Level Models. In the individual-level model, the
derivative network needs to reflect the behavioral distribution of all normal
transactions without wasting the information brought by fraudulent transactions. Our
idea is that the information on normal transactions enhances the association of
attribute vertices in the derivative network. On the contrary, the information brought
by fraudulent transactions weakens their connection. Therefore, we stipulate that an
edge has a positive weight when it is generated from $T_0$, and a negative weight
when the relationship occurs in $T_1$, by setting $\gamma = 1$ and $\beta = 1$. This strategy
effectively utilizes label information, which is also the biggest difference from the
label-free network for the population-level models. In some cases, the number of
occurrences of a particular relationship in $T_1$ is much bigger than that in $T_0$, which
causes the weights of some edges to become negative or zero. Our solution is to remove
these edges from the derivative networks. One reason is that the relationships reflected
by edges with negative or zero weights are negligible in the behavioral distribution of
all normal transactions that we want to get. The other reason is that negative weights
cannot be applied to the random walk process of the network embedding method we adopt.
Similar to the population-level model, we get a mapping function $\phi_I(\cdot)$, which gives
the vector representation of attributes in $T$.
Next, we discuss how to build behavioral models based on network embedding.
We take the agent as the basic unit in the models; that is, an agent is an individual,
and all transactions sharing a common agent value reflect the agent's stable pattern.
Taking our online transaction records as an example, the attribute account_number
is a common choice of agent. Under this agent, transactions are divided into
different parts, so that all transactions in each part have the same account number.
We can detect anomalies by comparing transactions with the behavioral models when we
assume that an agent's behavioral pattern is stable. We discuss behavioral models from
the perspectives of single-agent and multi-agent, respectively.
Single-Agent Behavioral Model. Similar to feature transformation in the
population-level model, we choose to calculate the similarity of any two vector
representations based on a matrix of size $d \times m$, which represents a transaction
with $m$ attributes. Here, $d$ is the dimension of the vector $\phi_I(\cdot)$. One difference
is that the similarity is calculated differently. Given vectors $X$ and $Y$, $x_s, y_s$
respectively represent the values on the $s$-th dimension of the vectors $X$ and $Y$.
We have:

$$\mathrm{sim}'(X, Y) = \sqrt{\sum_{s=1}^{m} (x_s - y_s)^2} \tag{2.5}$$

by using the Euclidean distance, which emphasizes the difference in numerical value
and is therefore appropriate for characterizing each individual. For the vectors
$\phi_I(attr_i^j)$ and $\phi_I(attr_i^k)$, we calculate the similarity
$\mathrm{sim}'(\phi_I(attr_i^j), \phi_I(attr_i^k))$ according to Eq. (2.5). We introduce cohesivity
to express the importance of a transaction in the behavioral model and denote by
$C(i)$ the cohesivity of a transaction with unique identifier $i$. The cohesivity $C(i)$
can be computed in the following way:

$$C(i) = \frac{1}{cm_0 + \sum_{j=1}^{m-1} \sum_{k=i+1}^{m} cm_l \times v'(i, j, k)}, \tag{2.6}$$

where $v'(i, j, k) = \mathrm{sim}'(\phi_I(attr_i^j), \phi_I(attr_i^k))$ and $l = m(j-1) + k - 1$. The value
$cm_l$ represents the $l$-th value in the coefficient matrix

$$\left[ cm_0, cm_1, cm_2, \cdots, cm_{m(m-1)/2} \right],$$

where $cm_l$, for $l = 0, 1, 2, \cdots, m(m-1)/2$, can be determined by adopting the
method of linear regression. For all samples without missing values, we calculate the
similarities as new features according to Eq. (2.5), and then fit a linear regression
on them to get the coefficient matrix. We can get the regression coefficients and the
corresponding offset, which correspond to $[cm_1, cm_2, \cdots, cm_{m(m-1)/2}]$ and $cm_0$,
respectively.
Denote an agent as $g_u^a$, where $a$ is the attribute type corresponding to the agent, and
$u$ represents the value of attribute $a$ of the agent. Accordingly, we denote the set of
agents referring to $a$ as $G^a$. Let $I_u^a$ denote the set of all transactional identifiers
involved with $g_u^a$. Furthermore, we define $I^a := \bigcup_u I_u^a$. At this point, we formally
define the behavioral model as follows. For a given agent $g_u^a \in G^a$, its behavioral
model is defined as $P_{g_u^a}$, which is a discrete probability distribution function
reflecting the normal transactional patterns. For every possible transaction identifier
$i$, we have its corresponding probability $p_{g_u^a}(i)$ of occurrence in $P_{g_u^a}$.
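The cohesivity of Eq. (2.6) can be sketched in Python as below. The sketch assumes the offset and coefficients have already been fitted by linear regression and are supplied in the same order in which the attribute pairs are enumerated; that ordering, like the function names, is an assumption of this example rather than something fixed by the text.

```python
import math
from itertools import combinations


def euclidean(x, y):
    """Euclidean distance of Eq. (2.5)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cohesivity(vectors, offset, coefs):
    """Cohesivity C(i) of one transaction.

    vectors: embeddings of the m attributes of the transaction
    offset:  fitted intercept (cm_0)
    coefs:   fitted coefficients, one per attribute pair, in enumeration order
    """
    dists = [euclidean(x, y) for x, y in combinations(vectors, 2)]
    if len(dists) != len(coefs):
        raise ValueError("need one coefficient per attribute pair")
    denom = offset + sum(c * d for c, d in zip(coefs, dists))
    return 1.0 / denom


c = cohesivity([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]], offset=1.0, coefs=[0.2, 0.5, 0.4])
```

Here the three pairwise distances are 5, 0 and 5, so the denominator is 1 + 0.2·5 + 0.5·0 + 0.4·5 = 4.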
The procedure of computing the corresponding probability $p_{g_u^a}(i)$ is formalized
as follows:

$$p_{g_u^a}(i) = \frac{\sigma(C(i))}{\sum_{i' \in I_u^a} \sigma(C(i'))}, \tag{2.7}$$

where $\sigma(z) = \frac{1}{1+\exp(z)}$ is the sigmoid function. In practice, the size of $I_u^a$,
denoted by $|I_u^a|$, depends on the product of the numbers of available values of all
attribute types other than the agent's attribute type $a$. So our behavioral model is a
special case, a discrete probability distribution, obtained by calculating the
probability of each transaction in fraud detection. We adopt the same method as the
population-level model to divide the training samples and testing samples, and only
use the training samples to build the model.
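The normalization in Eq. (2.7) can be sketched as follows. The function $\sigma$ is implemented exactly as it is written in the text; the dictionary layout and function names are assumptions for illustration.

```python
import math


def sigma(z):
    """Sigmoid as written in the text: 1 / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(z))


def behavior_distribution(cohesivities):
    """Turn cohesivities {i: C(i)} of one agent into probabilities p(i)."""
    scores = {i: sigma(c) for i, c in cohesivities.items()}
    total = sum(scores.values())
    # Normalizing over the agent's identifier set gives a discrete distribution.
    return {i: s / total for i, s in scores.items()}


dist = behavior_distribution({101: 0.2, 102: 0.2, 103: 1.5})
```

Transactions with equal cohesivity receive equal probability, and the probabilities of one agent always sum to 1.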
For some $u$, $|I_u^a|$ is often large and the computational overhead of the probability
distribution becomes unbearable. We use a clustering algorithm to overcome
this problem. For vectors referring to the same attribute type in the vector space, the
vectors of the same cluster are represented by the cluster vector, that is, similar vectors
are treated as one vector, which can quickly reduce the value of $|I_u^a|$. In this work,
we choose the account number as the agent's type. In other words, we establish
behavioral models for all account numbers that appear in the label-aware network.
We observe that single-agent models based on the account number or other attributes
often can hardly achieve excellent performance due to the absence of agents. An
effective way to solve the problem is modeling with multiple agents.

Algorithm 2.1: Building multi-agent behavioral models

Input: The set of attribute types $A$
Output: The set of multi-agent behavioral models $F$
Initialize $F$;
foreach $a \in A$ do
    foreach $g_u^a \in G^a$ do
        Initialize $P_{g_u^a}$;
        foreach $i \in I_u^a$ do
            Compute $C(i)$ using Eq. (2.6);
            Compute $p_{g_u^a}(i)$ using Eq. (2.7);
            Add $(i, p_{g_u^a}(i))$ into $P_{g_u^a}$;
        end
        Add $P_{g_u^a}$ into $F$;
    end
end
Return the set of multi-agent behavioral models $F$
Multi-Agent Behavioral Model. To cope with insufficient or missing historical
transactions of a single agent, we resort to the models under different agents rather
than acquiring more complete and adequate historical transactions. This part describes
how we build the behavioral models under multiple agents to better detect a transaction
in the case of insufficient transactions. Similar to the commonly used agent, i.e.,
account number, some other attributes, e.g., merchant number and location number,
can also act as agents to build behavioral models. Note that the value space of
attribute types that can act as agents should not be too small; otherwise, the
individual-level behavioral model loses its advantage. Let $A$ denote the set of
attribute types that can act as agents. For each attribute type in $A$, we repeatedly
build the single-agent behavioral models and then add those models to the final set $F$.
We can detect the fraud probability of a transaction under different agents with $F$.
The procedure of building multi-agent behavioral models is described in Algorithm 2.1.
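The loop structure of Algorithm 2.1 can be sketched in Python as follows. As a simplification, the sketch groups only observed transactions by agent value instead of enumerating every possible attribute combination, and it takes the cohesivity as a caller-supplied function; both choices, and all names, are assumptions of this example.

```python
import math
from collections import defaultdict


def build_multi_agent_models(transactions, agent_types, cohesivity):
    """Build one probability table per agent (attribute type, value).

    transactions: dict {i: {attribute_type: value}}
    agent_types:  attribute types allowed to act as agents
    cohesivity:   callable i -> C(i), assumed to implement Eq. (2.6)
    """
    models = {}
    for a in agent_types:
        # Group transaction identifiers by the value u of attribute a.
        groups = defaultdict(list)
        for i, attrs in transactions.items():
            u = attrs.get(a)
            if u is not None:
                groups[u].append(i)
        for u, ids in groups.items():
            # sigma(z) = 1 / (1 + exp(z)) as written in the text, then Eq. (2.7).
            scores = {i: 1.0 / (1.0 + math.exp(cohesivity(i))) for i in ids}
            total = sum(scores.values())
            models[(a, u)] = {i: s / total for i, s in scores.items()}
    return models


toy = {
    1: {"account_number": "A1", "merchant_number": "M7"},
    2: {"account_number": "A1", "merchant_number": "M9"},
    3: {"account_number": "A2", "merchant_number": "M7"},
}
models = build_multi_agent_models(
    toy, ["account_number", "merchant_number"], cohesivity=lambda i: 0.0
)
```

Each transaction then appears in one table per agent type, which is what allows it to be judged under several agents.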
We define the fraud detection problem in individual-level behavioral models as
follows: Given a transaction, its fraud score, rated by its corresponding probability
in the single-agent behavioral model, determines whether the transaction is fraudulent
or not. This may include the following scenarios: (1) the transaction provides complete
information; (2) the transaction misses values in some attributes. For the former, we
can directly get its probability in behavioral models.
Algorithm 2.2: The process of fraud detection

Input: The set of attribute types $A$, the set of transactional identifiers $I$
Output: The set of judgment results $R$
Initialize $R = \emptyset$;
foreach $i \in I$ do
    $r(i) := 0$;
    foreach $a \in A$ do
        get $r^a(i)$ using Eq. (2.10);
        $r(i) := r(i) \vee r^a(i)$;
    end
    get $r^a(i)'$ in Section 2.3.1.6;
    $r(i) := r(i) \wedge r^a(i)'$;
    Add $r(i)$ into $R$;
end
Return judgment result $R$

Since all attributes are required to calculate the fraud score of a transaction in
the behavioral model, it is difficult to judge a transaction with missing values. So in
our model, we compute the average probability of all transactions that are related
to the existing attributes of the transaction with identifier $i$ as the probability
$p_{g_u^a}(i)$, and define the set of these transaction identifiers as $I_i'$. Then we get the
behavioral model $P_{g_u^a}$ corresponding to the agent $g_u^a$, and denote the domain of
$P_{g_u^a}$ as $\mathcal{P}_{g_u^a}$. For a transaction identifier $i$, we get a new distribution $P'_{g_u^a}$
by removing $I_i'$ from the domain of $P_{g_u^a}$, and denote the domain of $P'_{g_u^a}$ as
$\mathcal{P}'_{g_u^a}$. Next, we calculate its score $score_{g_u^a}(i)$ as described in Eq. (2.8):

$$score_{g_u^a}(i) = \frac{p_{g_u^a}(i) \times \exp(-H_{g_u^a})}{N_0 + \frac{1}{|\mathcal{P}'_{g_u^a}|} \times \sum_{i' \in \mathcal{P}'_{g_u^a}} p_{g_u^a}(i')}, \tag{2.8}$$

where

$$H_{g_u^a} = -\sum_{i \in \mathcal{P}_{g_u^a}} p_{g_u^a}(i) \times \log_2 p_{g_u^a}(i), \tag{2.9}$$

$|\mathcal{P}'_{g_u^a}|$ is the cardinality of the domain $\mathcal{P}'_{g_u^a}$, and $N_0$ is responsible for
adjusting the degree to which transactions other than the transaction $t_i$ in the
behavioral model influence the score. The larger $N_0$ is, the lower the influence of
other transactions on the score. In our work, we set $N_0 = 0$.
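Eqs. (2.8) and (2.9) can be sketched together in Python as below. The sketch assumes the behavioral model of one agent is given as a dict of probabilities and that the caller supplies the identifiers to remove when forming the reduced domain; these inputs and the function name are hypothetical.

```python
import math


def fraud_score(dist, i, removed=(), n0=0.0):
    """Score of transaction i under one agent's behavioral model.

    dist:    {transaction_id: probability} for the agent
    removed: identifiers excluded from the domain to form the reduced domain
    n0:      the constant N_0 of Eq. (2.8); the text sets it to 0
    """
    # Entropy of the full distribution, Eq. (2.9), with a base-2 logarithm.
    entropy = -sum(p * math.log2(p) for p in dist.values() if p > 0)
    rest = [p for j, p in dist.items() if j not in removed]
    if not rest:
        raise ValueError("the reduced domain must not be empty")
    mean_rest = sum(rest) / len(rest)
    return dist[i] * math.exp(-entropy) / (n0 + mean_rest)


dist = {1: 0.5, 2: 0.25, 3: 0.25}
score = fraud_score(dist, 1, removed={1})
```

For this toy distribution the entropy is 1.5 bits and the mean probability over the reduced domain is 0.25, so the score equals $2 e^{-1.5}$.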
We observe that there is a clear distinction between the scores of fraudulent and
normal transactions. For an attribute type $a \in A$, we set an interval $\Omega_a$ and give
the judgment result according to Eq. (2.10):

$$r^a(i) = \begin{cases} 1, & score_{g_u^a}(i) \in \Omega_a \\ 0, & score_{g_u^a}(i) \notin \Omega_a \end{cases} \tag{2.10}$$

We denote fraudulent transactions by label 1 and normal transactions by label 0. The
upper and lower limits of the interval $\Omega_a$ depend on the score distribution of the
training samples.
Fraud Detection in Composite Models. A single-agent behavioral model can only
give a certain judgment of fraud; its judgment of normality may not be reliable,
since transactions that it cannot check are released. In this work, we imitate the
one-veto mechanism to synthesize the final results returned by multi-agent models.
That is, as long as one agent behavioral model returns a judgment marked as fraud, the
final result is marked as fraud. This strategy ensures that the multi-agent model is
complementary enough to capture as many fraudulent transactions as possible.
So far, we have two ways, at different levels, to identify whether a transaction is
fraudulent or not. These two methods identify fraudulent transactions from different
perspectives. Population-level models compare the similarity between a transaction
and the learned transactional patterns. Individual-level models distinguish a
transaction by contrasting the difference between its current and past patterns. We
compose these two models to further improve the performance of our methods. A
transaction is detected as fraudulent if and only if the results from both models are
judged as fraudulent. The consistency of judgment on fraudulent transactions
reduces the probability of misjudging normal transactions, and ensures
better performance than a single model, i.e., the population model or the individual
model. For different performance objectives, other combinations can also be
tried, which is reserved for future research. The process of building a fraud
detection model is described in Algorithm 2.2.
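The combination logic of Algorithm 2.2 can be sketched in Python as follows. The two judgment functions are placeholders supplied by the caller, and treating the interval of Eq. (2.10) as closed is an assumption of this example.

```python
def judge_by_interval(score, interval):
    """Eq. (2.10): 1 if the score falls inside the interval, else 0."""
    low, high = interval
    return 1 if low <= score <= high else 0


def detect_fraud(ids, agent_types, agent_judgment, population_judgment):
    """Combine judgments as in Algorithm 2.2.

    agent_judgment(a, i) and population_judgment(i) must return 0 or 1.
    """
    results = {}
    for i in ids:
        r = 0
        for a in agent_types:
            # One-veto across agents: any agent flagging fraud sets r to 1.
            r = r | agent_judgment(a, i)
        # Intersection with the population-level judgment.
        r = r & population_judgment(i)
        results[i] = r
    return results


agent_votes = {
    ("account_number", 1): 1, ("merchant_number", 1): 0,
    ("account_number", 2): 0, ("merchant_number", 2): 0,
    ("account_number", 3): 0, ("merchant_number", 3): 1,
}
population_votes = {1: 1, 2: 1, 3: 0}
results = detect_fraud(
    [1, 2, 3],
    ["account_number", "merchant_number"],
    agent_judgment=lambda a, i: agent_votes[(a, i)],
    population_judgment=lambda i: population_votes[i],
)
```

Only transaction 1 is flagged here: transaction 2 is cleared by every agent, and transaction 3 is flagged by one agent but not by the population-level model.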

2.3.2 Experimental Evaluation

To evaluate the performance of the proposed models based on co-occurrence
relationships in transactions, we build heterogeneous information networks to represent
these relationships, and apply the vectors obtained by heterogeneous network
embedding to generate behavioral models. Through the empirical evaluation on
real-world transactions, we mainly aim to answer the following three research questions:
RQ1: How do the key parameters affect the performance of our models?
RQ2: How much gain does the data enhancement scheme based on network
embedding bring to the population-level and individual-level models?
RQ3: How does the design of the enhancement scheme affect the performance of our
models?
In what follows, we first introduce the experimental settings, and then answer the
above research questions in turn.

2.3.2.1 Experiment Settings

Datasets. To validate the performance of the proposed models, the evaluation is
implemented on a real-world online banking payment transaction dataset from one of the
biggest commercial banks in China, which contains three consecutive months of B2C
and C2C transaction records. The main statistics of the transactions are summarized
in Table 2.3. We use the data of April and May 2017 as the training samples, and
the data of June 2017 as the testing samples. We also utilize the C2C transactions of
April and May 2017 when we build heterogeneous information networks. All B2C
transactions are labelled either positive (fraudulent) or negative (normal).
The training samples contain 2,393,817 normal transactions and 40,393
fraudulent transactions, and the testing samples contain 1,003,539 normal
transactions and 24,898 fraudulent transactions. In the original set of transactions,
each transaction is characterized by 64 attributes. However, most of them have sparse
valid values (about 10% to 30% on average). We finally choose 8 attributes in all to
build our models, which are shown in Table 2.4. The attributes time and amount
have continuous values, so these attributes need further discretization.
All C2C transactions are represented by 3 attributes, which are shown in
Table 2.5. Note that the attribute amount in C2C transactions does not appear in the
derivative network; it only has an impact on the weights of incident edges.

Table 2.3 The transaction information

Month            2017.04     2017.05     2017.06     Total
B2C Normal       1,217,101   1,176,680   1,003,461   3,397,242
B2C Fraudulent   13,271      27,122      24,898      65,291
C2C              166,356     205,614     \           371,970

Table 2.4 The selected attributes in B2C transactions

Attribute         Value        Description
account_number    Discrete     Each account_number represents a user's account
merchant_number   Discrete     Each merchant_number represents a merchant in a B2C transaction
place_number      Discrete     Each place_number represents an issuing area of banking cards used for transactions
time              Continuous   The exact time when the transaction occurred
amount            Continuous   The amount of money transferred to the merchant in a B2C transaction
ip                Discrete     Whether a commonly used ip or not in a transaction
last_result       Discrete     Judgment of the last transaction in the relevant account_number
type              Discrete     Each type represents a transaction of different type

Table 2.5 The selected attributes in C2C transactions

Attribute          Value        Description
account1_number    Discrete     The account1_number is the initiator of the C2C transaction
account2_number    Discrete     The account2_number is the recipient of the C2C transaction
amount             Continuous   The amount of money transferred to the recipient in a C2C transaction

Table 2.6 Attribute details

Attribute         Label-aware network   Label-free network
account_number    221,040               190,268
merchant_number   2,419                 2,406
place_number      327                   327
time              8                     8
amount            10                    10
ip                2                     2
last_result       2                     2
type              11                    11

We discretize the attribute time inspired by [45]. The time of day can be divided
into four time intervals. We set four intervals of the hour: [0, 3), [6, 11), [15, 24),
and [3, 6) ∪ [11, 15), according to the time distribution in transactions. We further
divide the attribute time into 8 unique values by distinguishing whether it is a
weekday. We take different approaches to discretizing the amount attribute in B2C
and C2C transactions because the attribute plays a different role in each. For B2C
transactions, we discretize the amount into four values according to the following
intervals: [0, 60), [60, 300), [300, 3600), and [3600, +∞). For C2C transactions,
we assign the values (1, 1.5, 2, 2.5, 3) according to the following intervals:
[0, 100), [100, 1000), [1000, 5000), [5000, 50000), and [50000, +∞).
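
The discretization rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the integer codes for the eight time values are hypothetical, and "weekday" is assumed to mean Monday to Friday with no holiday calendar.

```python
import bisect
from datetime import datetime

# Interval boundaries taken from the text; the bucket codes are illustrative.
B2C_EDGES = [60, 300, 3600]            # [0,60), [60,300), [300,3600), [3600,+inf)
C2C_EDGES = [100, 1000, 5000, 50000]   # five intervals for C2C amounts
C2C_VALUES = [1.0, 1.5, 2.0, 2.5, 3.0]  # values assigned to the five intervals


def time_bucket(ts: datetime) -> int:
    """Map a timestamp to one of 8 values: 4 hour groups x weekday/weekend."""
    hour = ts.hour
    if 0 <= hour < 3:
        group = 0
    elif 6 <= hour < 11:
        group = 1
    elif 15 <= hour < 24:
        group = 2
    else:                              # [3, 6) or [11, 15)
        group = 3
    is_weekday = ts.weekday() < 5      # assumption: Monday..Friday, no holidays
    return group * 2 + (0 if is_weekday else 1)


def b2c_amount_bucket(amount: float) -> int:
    """Index (0..3) of the half-open B2C interval containing the amount."""
    return bisect.bisect_right(B2C_EDGES, amount)


def c2c_amount_value(amount: float) -> float:
    """Value from (1, 1.5, 2, 2.5, 3) for the C2C interval containing the amount."""
    return C2C_VALUES[bisect.bisect_right(C2C_EDGES, amount)]
```

`bisect_right` places a value that equals a boundary in the upper interval, which matches the half-open intervals in the text.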

We also count the number of nodes with different attributes in the label-aware
and label-free networks, as detailed in Table 2.6. Note that the difference in the
number of nodes with attributes account_number and merchant_number between the
two networks is caused by the removal of edges and nodes from label-free networks.
In addition, we observe that the size of an individual model is
2,406 × 327 × 8 × 10 × 2 × 2 × 11 when we choose the attribute account_number as
the agent, which is generally too large to compute. We therefore cluster the 2,406
agents with attribute merchant_number and the 327 agents with attribute
place_number into 11 and 5 categories, respectively. Similarly, we cluster the
190,268 agents with attribute account_number into 5 categories when we choose the
attribute merchant_number or place_number as agents.
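
To make the size argument concrete, the product can be evaluated before and after clustering. The snippet below is only an arithmetic check using the counts quoted in the text and Table 2.6; the dictionary keys are illustrative labels.

```python
from math import prod

# Number of distinct values per attribute when account_number is the agent
# (label-free network counts from Table 2.6).
sizes = {
    "merchant_number": 2406,
    "place_number": 327,
    "time": 8,
    "amount": 10,
    "ip": 2,
    "last_result": 2,
    "type": 11,
}
raw_size = prod(sizes.values())

# After clustering merchants into 11 categories and places into 5.
clustered = dict(sizes, merchant_number=11, place_number=5)
clustered_size = prod(clustered.values())

print(raw_size)        # 2769402240
print(clustered_size)  # 193600
```

Clustering shrinks the state space by a factor of more than ten thousand, from about 2.8 billion combinations to 193,600.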
2.3 Fine-Grained Co-occurrences for Behavior-Based Fraud Detection 31

Metrics. To evaluate the performance of our methods, we choose five representative
and well-performing techniques as benchmarks: logistic regression (LR), random
forest (RF), naive Bayes (NB), XGBoost (XGB), and convolutional neural networks
(CNN). According to industry requirements, 1% is normally the tolerable upper
limit for the FPR (False Positive Rate), so a TPR (True Positive Rate) achieved at
an FPR higher than 1% is not meaningful in this work. We therefore focus only on
the meaningful part of the ROC curve rather than the whole AUC (Area Under the ROC
Curve). In this part, we use Precision, Recall (TPR), Disturbance (FPR), and F1-score
to comprehensively evaluate our methods.
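
One way to read these metrics is as values taken at a score threshold chosen so that the Disturbance does not exceed a fixed budget. The following pure-Python sketch illustrates that reading. The function name and the thresholding rule are assumptions made for illustration, not the evaluation code used in the experiments.

```python
def metrics_at_fpr(y_true, scores, max_fpr=0.01):
    """Precision, Recall, Disturbance and F1 at the loosest threshold whose
    false positive rate does not exceed max_fpr (illustrative sketch)."""
    labels = [bool(y) for y in y_true]
    negatives = sorted((s for y, s in zip(labels, scores) if not y), reverse=True)
    allowed_fp = int(max_fpr * len(negatives))   # false positives we may tolerate
    # Flag only scores strictly above the (allowed_fp + 1)-th highest negative score.
    threshold = negatives[allowed_fp] if allowed_fp < len(negatives) else float("-inf")
    flagged = [s > threshold for s in scores]

    tp = sum(1 for f, y in zip(flagged, labels) if f and y)
    fp = sum(1 for f, y in zip(flagged, labels) if f and not y)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    disturbance = fp / len(negatives) if negatives else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall,
            "disturbance": disturbance, "f1": f1}
```

Because the comparison is strict, tied scores never push the number of false positives above the budget.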

2.3.2.2 Parameter Sensitivity

In this set of experiments, we systematically evaluate the parameter sensitivity of
our method. Instead of k-fold cross validation, we select the last 1/3 of the
training samples in time order as the validation samples and use the other 2/3 to
train the model during parameter tuning. Splitting off the validation set in time
order avoids the time-crossing problem and is closer to the real application scenario
than selecting the validation set at random.
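
A time-ordered split of this kind might look as follows. This is a minimal sketch in which the function name and the `time_key` accessor are hypothetical.

```python
def time_ordered_split(samples, time_key, val_fraction=1 / 3):
    """Sort samples by time and hold out the most recent fraction for validation."""
    ordered = sorted(samples, key=time_key)
    n_val = round(len(ordered) * val_fraction)
    cut = len(ordered) - n_val
    return ordered[:cut], ordered[cut:]


# Usage: the two latest of six samples become the validation set.
samples = [{"t": t} for t in (5, 1, 4, 2, 3, 6)]
train, val = time_ordered_split(samples, time_key=lambda s: s["t"])
```

Every validation sample is at least as recent as every training sample, so the model is never tuned on data that precede what it was trained on.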
Network Parameters. Parameter settings in Eq. (2.2) have a significant impact on
the weight of edges in derivative networks. In our work, most of the edge weights are
less than 1,000, and the larger weights are only about 10,000. We therefore want
the transformation of weights to satisfy a set of ratios, where the ratio is calculated
as S(w_e)/S(1). The ratios satisfy the following rules: when the weight is less
than 25, the ratio is close to w_e; when the weight is about 100, the ratio is close
to 50; when the weight is very large, beyond 1,000, the ratio is close to 100. To
determine the parameter settings, we examine how the weights change under them. We
vary the parameters α and θ to determine their impact on the weight changes. Except
for the parameter being tested, all other parameters take their default values. We
first examine different choices of the parameter α, with values from 1 to 3. The
weight changes under different α are shown in Fig. 2.3a, which shows that different
α slightly change the weight, but the overall trend remains similar. Next, we
examine different choices of the parameter θ, with values from 3 to 7. The weight
changes under different θ are shown in Fig. 2.3a, which shows that different θ
dramatically change the weight, especially for large weights. Figure 2.3 also shows
that the parameter α is positively correlated with S(w_e) and the parameter θ is
negatively correlated with S(w_e). Finally, we observe that setting α and θ to 1.8
and 5, respectively, is an appropriate choice to reduce the huge gap between
different weight values.
From Table 2.6, we observe that nodes with the attribute account_number far
outnumber nodes with other attributes. This imbalance leads to an imbalanced
network structure. We therefore study the average degree of these agents
and find that the average degree of nodes with the attribute merchant_number is
similar to that of nodes with most other attributes, but is about 90 times that of
nodes with the attribute account_number. Therefore, we introduce a scheme to balance
the network structure.

Fig. 2.3 The TPR and F1-score under different FPR of integrations

For nodes with a particular attribute whose average degree is q times the minimum
average degree, we set the weight of edges associated with that attribute to q*
times the weight corresponding to the minimum average degree, where
q* = 1 − q/2 × 0.01. In this work, we set the weights of edges associated with the
other attributes to 0.55 (= 1 − 90/2 × 0.01) times that of edges associated with
the attribute account_number.
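
The balancing factor is a one-line formula and can be checked directly. The helper below is a hypothetical illustration of the arithmetic only.

```python
def balance_factor(q: float) -> float:
    """Scaling factor q* = 1 - (q / 2) * 0.01 for a degree ratio q."""
    return 1 - q / 2 * 0.01


# With the degree ratio of about 90 reported in the text:
factor = balance_factor(90)   # approximately 0.55
```

As written, the factor decreases linearly in q and reaches zero at q = 200, so the formula only yields a positive weight for degree ratios below 200.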

Embedding Parameters. Parameter settings in network embedding methods usually
make a difference to the performance of node representation for an application. To
tune the appropriate settings, we vary the values of important parameters to observe
how the performance changes under population-level models.
Dimensionality of Vector Space. First of all, Fig. 2.4a shows the impact of setting
different numbers of dimensions d. Generally, a small d is not sufficient to capture
the information embedded in relationships among nodes, but a large d may introduce
noise and cause overfitting. In our work, the best performance is achieved when d
is 128. Generally, a larger network might need a larger d to capture the information
embedded in relationships between nodes.
Length of Random Walks. A longer random walk can generate more sample data.
Figure 2.4b shows that the performance continues to improve as the length
of random walks l increases (resulting in more sample data), and converges
when l is large enough. Meanwhile, more sample data means more training time.
When l is set to a large value, it brings slight performance growth but a dramatic
increase in time. In our network, we set l to 160, since it achieves a balance
between training time and performance.
Length of Meta-paths. Figure 2.4c shows that the maximum length of meta-paths ω
has a significant impact on the performance. Capturing meta-paths with larger ω is
crucial because some long meta-paths have an important semantic meaning. Note,
however, that a large ω will bring in useless semantic information that degrades
performance. Setting ω to 4 or 5 is a good option in this work.
Fig. 2.4 Parameter tuning in network embedding with different parameter pairs. Figures a, b, and
c respectively show the model performance under different values of d, l, and ω; while one
parameter is tested, the others are set to d = 128, l = 160, and ω = 5. Figure d shows the
influence of different γ and β on the population-level model

We verify the performance of different network embedding schemes on the
population-level model. Figure 2.4d shows the results of our experiment on the
XGBoost classifier, and explains why we propose a customized network to deal with
labeled transactions. By setting the hyperparameters β and γ, we adjust the ratio of
the edge weights of fraudulent transactions to normal transactions in the network to
1, 0, −1, and −2, respectively. The ratio ‘1’ means that fraudulent and normal
transactions are treated in the same way, which is equivalent to label-free networks.
The ratio ‘0’ means that we only use normal transactions to build the network,
without fraudulent transactions. The ratios ‘−1’ and ‘−2’ mean that the weights of
edges generated by fraudulent transactions are −1 or −2 times that of normal
transactions, respectively. From Fig. 2.4d, we observe that the label-free network
outperforms the label-aware network in the population-level models. We also observe
that the ratios ‘−1’ and ‘−2’ have similar performance, which shows that small
changes in the ratio have little impact on the model when the ratio is negative. Our
naive individual-level model learns the normal behavioral pattern from the user's
historical pattern, and cannot exploit fraudulent transactions in the process of
building models. The label-aware network integrates the information of fraudulent
transactions into the network structure, which makes it more suitable for
individual-level models than label-free networks.
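
The effect of the ratio can be illustrated with a toy edge-weight builder. This sketch assumes that an edge weight is a simple count of transactions in which each fraudulent transaction contributes the chosen ratio instead of 1. The text does not spell out how β and γ realize the ratio, so the function and its signature are hypothetical.

```python
from collections import defaultdict


def build_edge_weights(transactions, fraud_ratio):
    """Accumulate (account, merchant) edge weights from labeled transactions.

    Each normal transaction adds 1; each fraudulent one adds fraud_ratio.
    A ratio of 0 skips fraudulent transactions entirely (assumed behaviour).
    """
    weights = defaultdict(float)
    for account, merchant, is_fraud in transactions:
        if is_fraud and fraud_ratio == 0:
            continue
        weights[(account, merchant)] += fraud_ratio if is_fraud else 1.0
    return dict(weights)


transactions = [
    ("a1", "m1", False),
    ("a1", "m1", False),
    ("a1", "m1", True),
    ("a2", "m1", True),
]
label_free = build_edge_weights(transactions, fraud_ratio=1)
normal_only = build_edge_weights(transactions, fraud_ratio=0)
label_aware = build_edge_weights(transactions, fraud_ratio=-1)
```

Under this assumption, a ratio of 1 reproduces plain co-occurrence counts, a ratio of 0 removes edges that arise only from fraud, and a negative ratio lets fraudulent transactions reduce the weight of an edge.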

2.3.2.3 The Gain of Network Embedding

Performance Gain for Population-Level Models. We compare the performance of
five representative classification models described in Sect. 2.3.2.1 with those
assisted by customized network embedding (NE) schemes. We set the parameters
of network embedding as in Sect. 2.3.2.2. The ROC curves of different classifiers are
depicted in Fig. 2.5a. We observe that the models cooperating with network
embedding, i.e., RF+NE, XGB+NE, LR+NE, NB+NE, and CNN+NE, all outperform their
counterparts without network embedding. XGBoost gives the best results at different
FPRs, followed by RF, CNN, LR, and NB with network embedding. When the
FPR is 0.001, XGBoost with network embedding obtains a recall of 93.9%, which
means that it can prevent about 94% of fraudulent transactions when the fraudster
begins to act, while interfering with only 0.1% of legitimate transactions. Random
forest performs second best, only slightly worse than XGBoost when the FPR is
smaller than 0.002. When we decrease the FPR to 0.0005, the performance of most
methods does not change substantially despite a partial drop in the TPR. It is worth
noting that the performance of all models drops dramatically as the FPR decreases to
0.0001. The TPR of XGBoost is slightly lower than 50%. Except for the poor
performance of NB and CNN, the others have similar recalls when FPR = 0.0001.
Now we have a basic understanding of the approximate performance of
all candidate classifiers. XGBoost stands out among all candidate machine learning
models when we set the same FPR.

Fig. 2.5 The ROC curves of population-level models. Figure a shows the performance of different
population-level models with or without NE. Figure b shows the impacts of different features on
population-level models

To explore how much gain C2C transactions can bring to our model, we design
the following four groups of experiments: (1) ‘B+C’, using both B2C and C2C
transactions in our model; (2) ‘B2C’, using only B2C transactions in our model;
(3) ‘Ori’, applying original transactions directly to the population-level model; (4)
‘Vec’, using B2C and C2C transactions to build the label-free network and adopting
the HIN2Vec method to get the embedding vectors, but detecting fraud by feeding a
vector matrix, which consists of the representations corresponding to the attributes
in a transaction, into the population-level model. From Fig. 2.5b, we find that our
model outperforms the other variants when the FPR is less than 0.15%. When the FPR
is greater than 0.15%, the gain of our model decreases, and its performance
gradually becomes consistent with that of the other variants. Note that the poor
performance of ‘Vec’ explains why we do not use the representation directly but
introduce sim(X, Y) for the subsequent tasks. We also observe that the C2C
transactions are effectively utilized by our model. When the FPR is 0.75%, the gain
in TPR reaches 2.5%.
Performance Gain of Individual-Level Models. In this part, we evaluate the
performance of the individual-level models in fraud detection with customized network
embedding. We present the performance of single-agent behavioral models and
discuss the improvements of the multi-agent model over the single-agent
models. The improvements depend on the following two principles.
The first is the completeness principle of multi-agent models. If a transaction has
no historical data under a specific agent, then the single-agent model for that agent
cannot check the transaction. To make this concrete, we define a measure
called check rate, which stands for the proportion of transactions that can be checked
with fraud detection techniques under a given agent. The union of the subsets of
transactions that can be checked by single-agent models should be the complete set of
transactions. That is, the check rate of the final multi-agent model should be 1.
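
The check rate and the completeness principle can be expressed in a few lines. The sketch below assumes that a transaction is checkable under an agent when the transaction's value for that agent appears in the historical data; the data layout and function names are hypothetical.

```python
def check_rate(transactions, history, agent):
    """Fraction of transactions whose value for `agent` has historical data."""
    checkable = sum(1 for t in transactions if t[agent] in history[agent])
    return checkable / len(transactions)


def multi_agent_check_rate(transactions, history, agents):
    """Fraction of transactions checkable by at least one single-agent model."""
    covered = sum(
        1 for t in transactions
        if any(t[a] in history[a] for a in agents)
    )
    return covered / len(transactions)


# Toy data: values seen in history, per agent.
history = {
    "account_number": {"a1", "a2"},
    "merchant_number": {"m1"},
    "place_number": {"p1", "p2"},
}
transactions = [
    {"account_number": "a1", "merchant_number": "m1", "place_number": "p1"},
    {"account_number": "a2", "merchant_number": "m9", "place_number": "p1"},
    {"account_number": "a9", "merchant_number": "m9", "place_number": "p2"},
    {"account_number": "a9", "merchant_number": "m1", "place_number": "p9"},
]
rates = {a: check_rate(transactions, history, a) for a in history}
combined = multi_agent_check_rate(transactions, history, list(history))
```

In this toy data no single agent covers every transaction, yet the union does, which is the situation the completeness principle requires.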
The second is the preferential principle of single-agent models under the
completeness principle. Before integrating different single-agent models into the
multi-agent model, we need to evaluate the performance of every single-agent model.
If the performance of a single-agent model is too poor, it will harm the performance
of the final multi-agent model.
In the implementation of our proposed models, the multi-agent model can apply to
all transactions, and the check rates under different single-agent models are shown in
Table 2.7. By calculating the Precision and Recall at different fixed Disturbances,
we investigate the performance of the proposed single-agent models on the checkable
part of the dataset, as presented in Table 2.7. The Disturbances are fixed at 0.0010,
0.0015, 0.0020, 0.0050, 0.0075, and 0.0100, respectively. It is evident from
Table 2.7 that these single-agent models perform stably and well on the partial data
that can be checked.
When we compare the performance of multiple single-agent models and the multi-
agent model, we experiment with all test transactions for all behavioral models. We
focus on the performance at different Disturbances between 0.001 and 0.0022. From
Fig. 2.6, we can make three observations:

Fig. 2.6 The performances of Precision (a) and Recall (b) under different fixed Disturbances in
behavioral models

Table 2.7 Performance of single-agent models


Attribute/check_rate    Account_number/0.92633
Disturbance    0.0010     0.0015     0.0020     0.0050     0.0075     0.0100
Precision      0.81375    0.79173    0.74798    0.54692    0.44728    0.37581
Recall         0.68418    0.91701    0.92648    0.93479    0.93677    0.93825

Attribute/check_rate    Merchant_number/0.53844
Disturbance    0.0010     0.0015     0.0020     0.0050     0.0075     0.0100
Precision      0.96880    0.95994    0.95147    0.90104    0.85672    0.81889
Recall         0.66302    0.76501    0.84403    0.95988    0.96295    0.96485

Attribute/check_rate    Place_number/0.99997
Disturbance    0.0010     0.0015     0.0020     0.0050     0.0075     0.0100
Precision      0.93515    0.93433    0.91924    0.82686    0.76180    0.69708
Recall         0.58559    0.86690    0.91517    0.96405    0.96582    0.96658

First, single-agent models perform well on the partial data they can check, but do
not perform stably on the whole data. Good performance on the partial data is a
necessary but not sufficient condition for good performance on the whole data. It is
therefore worth adopting a multi-agent model that combines multiple complementary
single-agent models.
Second, the check rate of the single-agent model of place_number is very close
to 1, yet this model underperforms the multi-agent model in terms of Precision and
Recall. The multi-agent model is superior to the single-agent model of place_number
because it combines the advantages of different single-agent models and makes a more
complete judgment on the transactions being checked.

Fig. 2.7 The performance of data enhancement and model enhancement in our model. Figure a
shows the performance of different network embedding methods as data enhancement in population-
level model. Figure b shows the performance of model enhancement by combining the individual-
level and population-level models

Third, we find that the merchant_number curve provides the most stable precision
and recall, regardless of the disturbance rate. From Table 2.7, we observe that
the single-agent model of merchant_number has a check rate of only about 50%, while
the other two single-agent models have check rates of over 90%. In a
real scenario, the single-agent model of merchant_number cannot be used alone to
implement anti-fraud tasks because of its low check rate. In our work, a single-agent
model can only flag fraud; it cannot guarantee that a transaction it does not flag is
normal. The high performance of the merchant_number model comes from its letting
nearly half of the transactions pass unchecked, so its performance is neither
representative nor credible. In online payment services, judgment results with high
performance but low credibility are not acceptable. By combining the judgments of
multiple single-agent models, we can make more accurate judgments with the same
credibility.

2.3.2.4 Performance of Enhancement Scheme

Performance of Data Enhancement. The framework of the proposed data enhancement
scheme is compatible with most network embedding methods. We compare the
effects of state-of-the-art network embedding methods in the population-level model.
Besides HIN2Vec, we also investigate the performance of node2vec [46], transE [47]
and metapath2vec [48]. For parameters shared with HIN2Vec, we use the same values
as for HIN2Vec, and we use the default values for the others. Figure 2.7a shows the
ROC curves of different network embedding methods in the population-level model. We
observe that all models cooperating with different network embedding methods have
similar performance. HIN2Vec and metapath2vec perform better than node2vec
and transE. The lower performance of node2vec mainly stems from its inability to
distinguish the types of nodes. In transE, the method focuses on resolving relationships
VII
SECOND-HAND BOOKS

The love of books is a love which requires neither
justification, apology, nor defence.—Langford.
I HAVE confessed that I am of the company of book-lovers who
delight in dipping into the ‘lucky-tubs’ to be found outside
booksellers’ windows. I know of no pleasanter way of spending a
spare half-hour. Give me a few ‘loose’ coppers, place my feet upon a
likely road, and I am content. I am now, let me say, of the happy
company of book-fishermen. And this, mark you, is fishing in real
earnest, this effort to ‘hook’ good food for the mind, to place in one’s
basket a ‘book that delighteth and giveth perennial satisfaction.’
Ah! it is a good road I am on—one of London’s happiest
thoroughfares—a road rich in book-shops. Here for a humble penny
one may dip into tub or barrel and perchance pick out a volume
worth its weight in gold! We hear so frequently of marvellous
‘catches.’ You know how this, that, or the other fine sportsman
boasts of landing fish of amazing weight—well, it is so with your
book-fisherman. Has he not told you of first editions procured for a
single copper? And who shall say what fine day may not find us
among Fortune’s favoured ones?
And so now to our fishing! Here is a copy of Milton’s Paradise Lost,
‘hooked’ in the deep waters of a ‘penny tub.’ It is calf-bound, mark
you, and in fairish condition, though much stained with the passing of
years. My heart leaps; it is very old—a first edition possibly! But no; it
is anything but that, and alas! like the egg that has grown into a
proverb, it is only good in parts. Many of the pages are entirely
missing, and others partially so. Judged by the books that surround
me, it is dear at a penny ... Paradise Lost!
Yes, I confess that this fishing has its distressing side. One is
frequently disappointed. And how heart-rending it is to find great
works in a soiled and tattered condition, to discover, on drawing
one’s hand from some ‘lucky-tub,’ that one holds the remains, a few
pages, it may be, or the cover only, of a book that has played a part
in the making of this world’s history! And how touching to find a
winsome companion like the gentle Elia soiled, torn, bereft of
covering, showing yellow gum and coarse stitching! I confess that
such a sight almost moves me to tears. Fair wear and tear would
never have reduced the gentle Elia to so pitiable a state. I suspect
hands as callous as those of the butcher in the slaughterhouse
across the way. Alas! that there should be men to whom books are
merely so much paper and cloth. ‘A book,’ you tell them, ‘is the
precious life-blood of a master-spirit, embalmed and treasured upon
purpose, to a life beyond.’ And their answer is a smile. But this is no
time for repining. The great army of book-lovers swells with each
passing year. From all sides come recruits, often from the most
unexpected quarters, from mill and factory, mean street and slum.
Yes; ’tis a great day for books, and soon Everyman will have his
library, in fact as well as in name. And who dare say, who can guess,
what treasures his library will hold?
Now back to our fishing. Here is a tub that promises well; the price
per volume, as aforetime, is only one penny. See! Here is a dainty
volume, slim and shapely of form, and clothed in a delicate green. A
minor poet, you guess. Yes; the work of a minor poet, published, no
doubt, at the author’s own expense. But do not turn aside. Do not
say that such books are of no value. I confess that I am for lingering
over this slender booklet. Its cover is very pleasing; the type is large
and clear; the paper is of good texture. And what anxiety, what
patient care, probably went to the making of its contents! Brave
minor poet! You have withstood many rebuffs. The road you travel
holds, I doubt not, many pure delights: you walk, it may be, beneath
a star-strewn sky. But star-gazing has proved in your case a
dangerous occupation. ‘He who raises his eyes to the heavens
forgets the stones and puddles at his feet.’ Alas! you have had many
falls. And when perchance you have come to the ground, it has often
been to the accompaniment of heartless laughter. ‘Here,’ cry the
critics, ‘is another minor poet on all fours.’ And with ill-timed jests
they proceed to point out your weaknesses; how that you have not
the feet to walk aright, much less run; and as for wings, there is not,
’tis frequently said, so much as a sign of their sprouting. But for all
that you have scrambled to your feet, and marching bravely forward,
continued to give generously of your gentle fancy. Long may you
live! In you we have (and here is my strongest point in your favour)
many a great and worthy poet in the bud.
And so I confess gladly, and, indeed, with a proud heart, that in my
bookshelves you hold a warm, well-sheltered corner. I love to handle
your slender volumes, to pore over your early fancies, ill-expressed
at times, it may be, but with a sincerity that is refreshing, and a
simplicity that is delightful. And if your work is poor from cover to
cover—which is rarely, if ever the case—well, you have given us a
book.
Yes, I am of the company of book-lovers who revere anything in the
form of a book. Lovers are made that way; and it is futile to inquire
how I can bring myself to love books of ‘all sorts and conditions.’ As
well might you ask the nature-lover why he speaks so tenderly of,
say, the worm that peeps through the tender green of some sun-lit
lawn. ’Tis simply love—love for the humblest children of dear Mother
Earth. And so it is with the true book-lover; for the humblest volume
he has a tender thought.
But what of our fishing? This is, I take it, a fitting place to record how
on such and such a day I had the good fortune to ‘hook’ a copy of
this or that desirable work for a few humble pence—a ‘mere song’!
Well, so it has been, ‘day in and day out.’ But those books, I would
remind you, are now my companions, my friends, and I can no more
associate money with their value than I can judge a friend in the
flesh by the contents of his purse. To me they are priceless.
VIII
‘THE CULT OF THE BOOKPLATE’

YOU have often heard the cry, and know full well its meaning, ‘My
books are priceless.’ What wonder, then, if you and I—lovers of
books—take lively interest in what an ingenuous man of business
has called ‘The Cult of the Bookplate.’ ‘The mission of the
bookplate,’ he advises us, ‘has always been, and must always be,
primarily to indicate ownership of the books in which they are placed.
They may be ornate or simple, as the taste or means of the owner
may indicate; they may incorporate crests, arms, motto, or other
family attribute; or, again, they may reflect the personal interests or
occupations of the owner; but the real aim of the bookplate remains
ever the same—a reminder to those who borrow.’
Pretty ground this for contemplation—for doubts, counsels, hopes,
fears, regrets; aye, and for rejoicing! How my mind leaps, first this
way, then that, when I meditate upon that rich circle of friendship in
which I may borrow from a fellow book-lover’s treasured volumes,
and, of course, lend of my own! Yet by what unspeakable regrets am
I possessed when I think of certain treasured volumes lent in wildly
generous moments to good but ‘short-minded’ friends! I have in mind
a little volume of essays—a first and only edition—by an unknown
but charming writer, which is now in the possession of that restless
fellow K——. May he see these words and repent! And what of that
treasured edition—once mine, but, alas! mine no more—of certain
writings of Dr. Johnson? Oh, that I could send the good doctor in
quest of the volume! What blushes of shame he would bring to the
cheeks of the heartless borrower! ‘Sir!’ he would cry. And what words
would follow! Very speedily should I be in a position to fill the gap in
my shelves.
And there is that dainty little calf-bound volume of Lamb’s essays,
borrowed some months back by J——. Where are you and my little
volume now, good friend? For reasons known to ourselves alone I
address you tenderly. But I would that I could send the gentle Elia to
recover my lost gem. Very gently would he deal with you, with quaint
phrases, puns, and happy jests. Aye, and with little speeches uttered
with that fascinating lisp of his. Indeed, I fear, now that I come to give
the matter careful thought, that he would leave you empty handed. It
would be so like his charming ways to console, comfort, and amuse
you, and leave with you, after all, my volume of his incomparable
essays.
The truth is, this work of restoring borrowed volumes to one’s
shelves calls for a stout heart. I confess that I am wanting in the
necessary qualifications. I have not the courage to speak harshly to
a fellow book-lover. So firm is his hold on my affections that I am as
wax in his hands. Yet book-lovers to a man agree that the borrower
who never repays stands in dire need of correction. I must call
another to the task—one of stronger metal.
Listen! ‘Even the fieldmouse,’ cries my champion, ‘has a russet gown
to match the mould, but the book-lover who has let loose a borrower
in his library is as forlorn as the goat tied up for tiger’s bait. True, that
to spare your Homer you may plead you are re-acquainting yourself
with the Iliad, but that is to save Homer and lose Virgil. You cannot
profess that you study all the classics simultaneously; and who
knows that better than the borrower? Snatch your Browning from his
grip, and his talons sink into Goethe instead. What does it matter to
him? He is out for books, and he will not be placated until he has left
gaping rents in your shelves, like the hull of a bombarded battleship.
These chasms shall burden your soul with the weight of many
unkindly maledictions, but the borrower will return no evil thought, for
the simple and satisfactory reason that he will now think no more
either of you or of your books. Stabled securely upon his shelves,
they will remain on one of those perpetual leases that amount to a
freehold. It is useless to invade his lair with the hope of bringing back
the spoil. Are you not instructed that he has not yet had time to read
them, but that they are yours again whenever you will?
Outgeneralled and outflanked, you retreat empty-handed.
‘Books are gentle, lovable company. Why should the lust of them
corrupt human nature, turning an amiable citizen into that hopeless
irreclaimable, the inveterate book-borrower? Is it that law of
contrasts which associates with the noble steed the ignoble horse-
coper, and with the gentle dove the cropped head and unshaven jowl
of the pigeon-flyer? But truce to theories! It is the hour of action. Will
not a benignly reforming Government insist that lent books shall be
registered like bills of sale, and a list drawn up of notorious
borrowers, with compulsory inspection of their dens, to protect our
defenceless libraries from the ravages of the book-pirate? If it is
hopeless to look for his cure, shall we not at least petition for his
prevention?’
You will allow that all this bears directly upon the subject in mind.
Does not the ingenuous gentleman whom I have quoted at the head
of this chapter aver that the real aim of the bookplate remains ever
the same—‘a reminder to those who borrow.’ Here, then, is one
thread of hope, but only a very thin thread, I fear. Not for one
moment dare I venture to think that it will bear the weight of our
grievances. It is too fine, too delicate, to save us from the hands of
the ruthless borrower. Indeed, I suspect that if it in any wise alters
our position, it is only to draw us into fresh danger. For you know
how many and how varied are the charms of bookplates, both old
and new. Indeed, I have known book-lovers borrow a volume for the
sole purpose of tracing the design upon the fly-leaf. It is a fault of
which the present writer is guilty. With shame he confesses it.
But wait! Why should I speak with blushes of my admiration for the
brave armorial designs which adorn the calf-bound volumes of my
friend H——? Well may he be proud of his family attributes, and well
may I admire the manner in which some skilful designer, long
departed, has incorporated arms and family motto with the familiar
words Ex Libris. I know not, by the way, how any book-lover can
bring himself to ignore information so absolutely clear. The
announcement ‘from my library’ seems in the case of the
particular bookplate in mind to come, nay, does come, from a
trumpet of amazing dimensions. But it is to be feared that the
imaginative designer has been allowed too free a hand. So rich is his
fancy, so skilful his line work, that the force of his call to duty is
dulled by admiration. Perhaps that is why my friend’s volume still
rests on my shelves. And perchance herein may rest an explanation
of the heartless manner in which my friend has held fast to my
treasured volume of Cowper’s poems.
It is, I say, to be feared that designers of bookplates have sacrificed
the primary aim of their calling to the elaboration of playful fancies.
From the very birth of the bookplate the fault seems to have been
present. I am told that the earliest specimens date back to 1516, and
on the Continent, notably in Germany, even earlier than that. Far
back into the ages must we travel to find the first offenders. Let the
interested book-lover examine the ancient examples presented in
1574 by Sir Nicholas Bacon to the University of Cambridge. He will
then see pretty clearly how the war has been waged between the
pictorial and the practical, and how, all along the line, the victory has
been with the former. And what wonder with such mighty craftsmen
as Albrecht Durer, Lucas Cranach, and Hans Holbein to wield the
steel point of the engraver! Can one be surprised if such men defeat
the chief aim of the bookplate, and put to silence with their wonderful
skill the simple cry Ex Libris? Bookplates by Durer, Cranach, or
Holbein must surely give great value to the volumes in which they
rest. Note the danger! True book-lovers will blush to own it, but we
must acknowledge the fact that a bookplate may have greater
attractions than the volume in which it rests!
Wherefore, I say, we book-lovers will be well advised if we see to it
that we do not fall into the error of keeping on our shelves books
which may be coveted for the plates they contain. Bookplates in the
delicate manner of Chippendale, with ‘wreath and ribbon’ and open
shell work, are too alluring. Designs in the manner of Sheraton are
also dangerously attractive. Jacobean plates come nearer the
desired mark. But to my mind the good old English style of plate,
‘simple armorial,’ is best fitted for the purpose.
Always must we remember that the primary object of the bookplate
is a reminder to those who borrow. On this score I am disposed to
favour those inexpensive modern plates in which are interwoven
some dear, familiar scene—a nook or corner of one’s garden, or a
beloved scene in one’s native place. If the ruthless borrower has
aught of good in him, surely he will be affected by such tender
personal associations! But we have seen that the average borrower
of books is a strange fellow. Alas! I know him only too well. Indeed, I
too must confess that ‘out of an intimate knowledge of my own sinful
ways have I spoken.’
IX
BEDSIDE BOOKS

I come to my subject in a sleepy mood. It seems a daring confession
to make. But you will allow that only when one’s mind is bent on
thoughts of sleep can one hope to speak fittingly of bedside books.
’Tis a subject calling for gentle, quiet thoughts. And what better state
of mind? You remember Robert Louis Stevenson’s prayer, ‘Give us
the quiet mind.’ How often has a similar prayer been offered! Too
often are we disturbed in thought—harassed, perplexed, worried. Let
us now turn our attention to books that soothe and lull to rest. Here
they stand, ready to hand. But name them I dare not, save in my
own heart. For your taste in this matter may be totally different from
mine. I dare only say at this point—for here surely I may speak with
confidence—that no bedside shelf is complete without a copy of
Stevenson’s prayers. With gratitude I confess that of the many
volumes which have comforted me during dark hours not one is so
dear, so close to my heart, as the little volume bearing the golden
letters R. L. S.
‘Be with our friends, be with ourselves. Go with each of us to rest. If
any awake, temper to them the dark hours of watching; and when
the day returns, return to us our sun and comforter, and call us up
with morning faces and with morning hearts—eager to labour: eager
to be happy, if happiness be our portion; and if the day be marked for
sorrow, strong to endure it.’ Certainly the prayers of R. L. S. should
have a place on every bedside shelf. That you are familiar with the
foregoing prayer, I cannot doubt. ‘Many are the golden passages the
lover of good books has by heart.’ It may be that you have upon your
own particular bedside shelf many ‘devotional authors’ with whose
every word you are familiar—books, small and great, which are as
jewels in your shelf. And no doubt you have upon the same shelf
many every-day and every-hour books, acting, as it were, as a
setting to your gems. For certainly the bedside shelf, if it is to be
complete, must contain books to suit all moods. One cannot be
certain in what mood the night watches will find one. The over-
excited brain, for instance, needs its own particular medicine, and
sometimes two, three, or more drugs are required, according to the
state and nature of the patient. In the majority of cases it is futile to
attempt a cure with a book less lively than the patient’s own brain.
His abnormal condition must be righted by degrees. One book, or
drug, must follow another, till his mind has been restored to a normal
state. Then may he resort to his accustomed ‘rest books,’ and so fall
asleep.
But I fear that such talk ‘smacks’ of the doctor and his medicine
chest, and I desire to conjure up restful thoughts. Well may the
reader be forgiven if he starts up in protest. Indeed, here is the
difficulty and the danger of seeking to promote a restful condition.
One is so apt to make, with the best intentions possible, a remark
which has the reverse effect. There is, I say, the risk of naming a
book which to the reader might come as a call to action—to daring
deeds and mighty enterprises—a mood as far removed from slumber
as the North Pole from the South.
I may, however, speak freely enough in the company of book-lovers
who wake with the rising sun and take to themselves one of their
beloved books. They will not resent my likes and dislikes—they who
open the day with a ‘jolly good book.’ In their company I may confess
that for the early morning I prefer a book with plenty of ‘go’ in it. Give
me life and spirit and enterprise. Thus may I hope to retain some
measure of the buoyancy of youth. It is good to have been young in
youth, and, as the years go, to grow younger. ‘Many,’ it is written,
‘are already old before they are through their teens; but to travel
deliberately through one’s ages is to get the heart out of a liberal
education. Times change, opinions vary to their opposite, and still
the world appears a brave gymnasium, full of sea-bathing, and horse
exercise, and bracing, manly virtues; and what can be more
encouraging than to find the friend who was welcome at one age
welcome at another?’
Let Westward Ho! stand on your bedside shelf, and many other
books of the same brave and lively order—‘the travel and adventure
books of our spirited youth.’ These, if you meet fresh days with a
book, will brace you for the battle. Stevenson must, of course,
remain one of your companions—your faithful friend both night and
morning. Bravery he will give you, and grace also.

Forth from the casement, on the plain
Where honour has the world to gain,
Pour forth and bravely do your part,
O knights of the unshielded heart!
Forth and for ever forward!—out
From prudent turret and redoubt,
And in the mellay charge amain
To fall, but yet to rise again!
Captive? Ah, still, to honour bright,
A captive soldier of the right!
Or free and fighting, good with ill?
Unconquering but unconquered still!

And mark again with what ‘manly grace’ and beauty of expression
Stevenson turns our thoughts to the ‘Giver of all strength.’
‘Give us grace and strength to forbear and to persevere. Offenders,
give us the grace to accept and to forgive offenders. Forgetful
ourselves, help us to bear cheerfully the forgetfulness of others. Give
us courage and gaiety and the quiet mind. Spare us to our friends,
soften us to our enemies. Bless us, if it may be, in all our innocent
endeavours. If it may not, give us the strength to encounter that
which is to come, that we be brave in peril, constant in tribulation,
temperate in wrath, and in all changes of fortune, and down to the
gates of death, loyal and loving one to another.’
If there is a more helpful bedside author than Stevenson, I should
much like to make his acquaintance. To few is it given to speak ‘the
word that cheers’ with such a fine combination of tenderness and
courage.
‘It is a commonplace,’ he says, ‘that we cannot answer for ourselves
before we have been tried. But it is not so common a reflection, and
surely more consoling, that we usually find ourselves a great deal
braver and better than we thought. I believe this is every one’s
experience; but an apprehension that they may belie themselves in
the future prevents mankind from trumpeting this cheerful sentiment
abroad. I wish sincerely, for it would have saved me much trouble,
there had been some one to put me in a good heart about life when I
was younger; to tell me how dangers are most portentous on a
distant sight; and how the good in a man’s spirit will not suffer itself
to be overlaid, and rarely or never deserts him in the hour of need.’
To the troubled, relaxed mind such words come as a bracing tonic.
Too often have we passed sleepless hours for the want of a word in
season—something to put a little ‘grit’ into us for the duties of the
morrow. Where the average mortal is concerned Stevenson certainly
supplies that need. Should he by any chance fail—well, there is an
essayist of our own day, waiting to minister to the most exacting
needs. I have in mind the many beautiful and tender pages written
by one whom we associate with a certain college window. Certainly
of him it may be said that he seeks to comfort and console, and to
soothe and lull to rest.
X
OLD FRIENDS

Come, and take choice of my library,
And so beguile thy sorrow.
Goldsmith.

NOW let us dwell upon our every-day and every-hour books—our
dear old familiar friends. ‘On a shelf in my bookcase,’ says
Alexander Smith, ‘are collected a number of volumes which look
somewhat the worse for wear. Those of them that originally
possessed gilding have had it fingered off, each of them has leaves
turned down, and they open of themselves in places wherein I have
been happy, and with whose every word I am familiar as with the
furniture of the room in which I nightly slumber; each of them has
remarks relevant and irrelevant scribbled on their margins. Those
favourite volumes cannot be called peculiar glories of literature; but
out of the world of books I have singled them, as I have singled my
intimates out of the world of men.’
Ah! that makes pleasant reading. For do not the sentiments
expressed reflect our own feelings? And do they not place us in
gracious and distinguished company? In his charming way,
Goldsmith whispers, ‘The first time I read an excellent book, it is to
me as if I had gained a new friend. When I read over a book I have
perused before, it resembles the meeting with an old one.’ And to
this Dillon adds, ‘Choose an author as you would choose a friend’;
whilst Langford, touching the same theme, declares that ‘a wise man
will select his books with care, for he will not wish to class them all
under the sacred name of friends.’
And as friendship has its roots deep set in love and sympathy, and is
for ‘serene days and country rambles, and also for rough roads and
hard fare, shipwreck, poverty, and persecution, and, moreover,
keeps company with the sallies of the wit,’ it is easy enough to
understand why such authors as Charles Lamb, Oliver Goldsmith,
William Hazlitt, Leigh Hunt, Richard Jefferies, Thomas De Quincey,
Joseph Addison, and, of later years, Robert Louis Stevenson, have
our affections.
Here they stand—Lamb, Goldsmith, Hazlitt, Hunt, Jefferies—the
whole lovable company. What shall I say concerning these friends of
ours? I am moved by deep and serious feelings. But, according to
his own telling, the gentle Elia, the first in mind, ‘had a general
aversion from being treated like a grave or respectable character,
and kept a wary eye upon the advances of age that should so entitle
him. He herded always, while it was possible, with people younger
than himself. He did not conform to the march of time, but was
dragged along in the procession. His manners lagged behind his
years. He was too much of the boy man. The toga virilis never sate
gracefully on his shoulders. The impressions of infancy had burnt
into him, and he resented the impertinence of manhood.’ And
therein, surely, rests the secret of his charm. In spite of his brave
confessions, how firm to discerning hearts is the bed of the stream
over which his thoughts flow! Who can doubt the source of a stream
that flows so sweetly?
And what of Oliver Goldsmith—poor ‘Goldy,’ as he was called by his
circle of intimates on earth? He, too, was very human, and, indeed,
had many weaknesses. And they tell us—they who write of such
matters with authority—that his days of poverty and wretchedness
were largely, if not entirely, the outcome of his follies. Even in the
sphere in which he shines—a clear, bright, inextinguishable star—it
is said that he had many short-comings. ‘He had neither the gift of
knowledge nor the power of research. As an essayist and poet, he
has neither extended views nor originality; as a critic, upon the few
occasions upon which he embarks on criticism his sympathies are of
the most restricted kind.’ And yet for the warmth and gentleness of
his heart and the purity of his style we love him. ‘His playful and
delicate style transformed everything he touched into something
radiant with warmth and fragrant with a perfume all its own.’
And how fared it with Hazlitt—the keen critic, the impassioned writer
—‘unbending and severe, insurgent in his political views’? Are we
not told that he was really more of an artist and sentimentalist than a
politician? ‘As for his life, it was aesthetic, Bohemian, and irregular in
the extreme. The restraints of domestic life were intolerable; he
wanted to be alone to write; rough accommodation and coarse fare
appeased him best; tinkerdom was the ordinary state of his interior
environment; save for two pictures (which served as a link with past
aspiration and were treasured accordingly), he had no property; a
fugitive amour seemed to furnish the emotional side of him with the
stimulant it most required; he was a night rambler and a reveller in
Rousseau, over whose Héloise and Confessions he expended
literally pints of tears.’ Such was the temperament of the writer, artist,
and sentimentalist who gave us those incomparable essays ‘On
Going a Journey,’ ‘On the Ignorance of the Learned,’ and ‘On
Familiar Style.’
And what of those other old friends, Hunt, Jefferies, De Quincey,
Robert Louis Stevenson? But our inquiries have gone far enough.
What boots it to repeat that our friends were human in life, just as
surely as they are human in their books, but with a humanity that
allures, charms, captivates? They do not preach to us, these old
friends of ours, or make open claims to virtue; and yet we are never
so conscious of goodness as when they are near. Their lightest
raillery scorns a mean act. In their company meanness flees as from
a pestilence.... Our friends!
Wisely is it said that the ‘best way to represent to life the manifold
use of friendship is to cast and see how many things there are which
a wise man cannot do himself; and then it will appear that it was a
sparing speech of the ancients to say, “that a friend is another
himself”: for that a friend is far more than himself.’
And so I thank heaven for my friends, for the wise, the lovely, and
the noble-minded who stand side by side, ever willing, ever ready,
upon my humble shelf.
XI
THROUGH ROSE-COLOURED SPECTACLES

NOW let another occupy the printed page. I have promised to give
the experiences of other book-lovers, to show how books influence
their thoughts and ways; and I am anxious to introduce a short, slim
gentleman of sixty odd summers, with a smiling face and an air of
wellbeing, a retiring, peaceful book-lover, whom you would never
suspect of playing any part in a mystery.
Nevertheless, my friend must plead guilty to practising the ‘art of
make-believe’ to such a degree that one could never be certain how
much was real concerning him and his affairs and how much was
imaginary. Indeed, the only sure and unchanging thing about him
was his spectacles and the manner in which he viewed life through
them—his point of view.
‘My spectacles,’ he told me, over and over again, ‘are rose-coloured.
You understand, rose-coloured. They and myself are inseparable.
Without them I am as bad as stone-blind, and dare not take a step in
any direction.’
Then he would smile in a manner that led one to suspect that he was
merely drawing upon his imagination. But I learnt that my friend’s life
had been lived under such peculiar difficulties, and that he had
passed through so much sorrow and affliction, that without his rose-
coloured spectacles he was, in one sense, stone-blind.
It pleased him to imagine that the lenses in his treasured spectacles,
which were gold-rimmed and old-fashioned in shape, had been cut
from rose-coloured pebbles, with the power of giving a rosy hue to
life, and bringing all things into correct perspective.
‘Correct perspective and the right point of view,’ he remarked on a
certain day, ‘are everything in life. My spectacles give me the correct
vision. They bring men and affairs into proper focus, and, what is
more, they give them a rose tint. Robert Louis Stevenson wore
spectacles something like mine, but his were far and away more
powerful. They enabled him to see farther and more clearly. They
were of a deeper and purer tint.’
He drew from his pocket a small cloth-bound edition of passages
from Stevenson’s works. The little volume did not measure more
than, say, three by five inches, and was considerably soiled and
worn; but he handled it as though it were worth its weight in precious
stones.
It was clear, before he opened the volume, that he knew the greater
part of the contents by heart; for he commenced to quote as he ran
his fingers round the edge of the cover:
‘“When you have read, you carry away with you a memory of the
man himself; it is as though you had touched a loyal hand, looked
into brave eyes, and made a noble friend; there is another bond on
you thenceforward, binding you to life and to the love of virtue.”’
He accompanied the quotation with a pleasing smile, as who should
say, ‘How true that is and how nobly expressed!’ Then he turned the
leaves hastily as though looking for a favourite passage; but he
abandoned the search a moment later, and glanced up.
‘I fancy I can give you the passage correctly. I should like you to hear
it. It will throw light upon what I have said about my rose-coloured
spectacles.’
He looked up, as he spoke, at the trees overhanging the lane
through which we walked.
‘“Nor does the scenery any more affect the thoughts than the
thoughts affect the scenery. We see places through our humours as
through differently-coloured glasses.”’
He paused a moment, then repeated the last line slowly and with
emphasis: ‘We see places through our humours as through
differently-coloured glasses.’
‘“We are ourselves,”’ he continued, ‘“a term in the equation, a note
of the chord, and make discord or harmony almost at will. There is
no fear for the result, if we but surrender ourselves sufficiently to the
country that surrounds and follows us, so that we are ever thinking
suitable thoughts or telling ourselves some suitable sort of story as
we go. We become thus, in some sense, a centre of beauty; we are
provocative of beauty, such as a gentle and sincere character is
provocative of sincerity and gentleness in others....”’
Then he told me ‘some suitable sort of story’ about a certain man
who built a castle upon dry land, a castle of stone, firm as a rock,
and filled it with his heart’s desire. But no sooner had the man taken
up his abode therein than the tide of circumstances turned.
Misfortune followed misfortune; sorrow followed sorrow; first, the loss
of earthly possessions, then the loss of loved ones. All brightness
and hope were taken out of the man’s life, and for many years he
dwelt in darkness.
At this point my friend turned away, and slowly, thoughtfully, polished
his spectacles. One could not help thinking that he was relating in a
parable the story of his own past. This suspicion was strengthened, if
not actually confirmed, when he readjusted his spectacles and
continued:
‘Then this same man built a castle in the air partly out of the
creations of his own mind, partly out of the creations of others, a
castle of thought, a building without visible support. He found,
however, that this castle in the air, built on lines he had been taught
to smile at in his youth, was more enduring than his castle of stone.
Moat and drawbridge were impassable, the gates impregnable.
Changed circumstances could not affect it; misfortune and sorrow
could not shake it; even death left it unmoved.’
‘You see,’ he continued, ‘what I am driving at? Listen to this from my
little volume: “No man can find out the world, says Solomon, from
beginning to end, because the world is in his own heart.” And this:
“An inspiration is a joy for ever, a possession as solid as a landed
estate, a fortune we can never exhaust, and which gives us year by
year a revenue of pleasurable activity. To have many of these is to
be spiritually rich.”’
The next moment he drew from his pocket a worn leather case and
showed me a portrait of Robert Louis Stevenson. He had it wrapped
in two layers of paper, both yellow with age and stained from much
handling. But the likeness was well preserved, as clear, perhaps, as
on the day it was taken.
‘I number this likeness,’ he said, ‘amongst my treasures. They go
everywhere with me—this portrait of Stevenson and this little volume
of extracts from his works.’ He fingered the cover affectionately. ‘The
case,’ he continued, ‘is worn with much handling, but the rose-
coloured lenses have not lost their power. Listen to this: “It is in virtue
of his own desires and curiosities that any man continues to exist
with even patience, that he is charmed by the look of things and
people, and that he awakens every morning with a renewed appetite
for work and pleasure.” And this: “Noble disappointment, noble self-
denial, are not to be admired, not even to be pardoned, if they bring
bitterness. It is one thing to enter the kingdom of heaven maim;
another to maim yourself and stay outside.”’
He glanced up and handed me the volume. ‘Make your own
selection,’ he suggested; ‘read something that condemns me.’
I acted on the suggestion, or, rather, the first part of it; for my
selection, contrary to his request, was in the form of commendation:
‘“His was, indeed, a good influence in life while he was still among
us; he had a fresh laugh; it did you good to see him; and, however
sad he may have been at heart, he always bore a bold and cheerful
countenance, and took fortune’s worst as it were the showers of
spring.”’
I was not aware how entirely this fitted my friend’s case until some
months had passed. Our friendship was only in its infancy at that
time, little more than an acquaintance. We had no formal
introduction. He had asked the time of day, then gone on to talk of
his rose-coloured spectacles. We had much to say concerning his
spectacles in the days that followed—always in a light and pleasant
vein. To be tedious or heavy was, to his mind, a grievous fault,
particularly in books. In life and in letters he would always look for,
and never fail to find, the brightest side, the happiest passages. And
he would apply the one to the other—a passage from Stevenson, or
some other author, to an incident in his own or some other life—in a
manner that was wonderfully illuminating and helpful.
In brief, his was ‘the life that loves, that gives, that loses itself, that
overflows; the warm, hearty, social, helpful life.’ From a sorrowful
chapter in his history he would weave a story for the help of others,
always from a rose-coloured standpoint; from a calamity he would
make a fairy tale, showing that, in spite of adversity, the House
Beautiful was still upon its hill-top.
I remarked, in introducing him, that he was guilty of playing a part in
a mystery. You will have seen through the mystery by now; at least,
as regards his rose-coloured spectacles. But there is more to be said
concerning his life and his love of books.
