Disruptive Technologies For Big Data and Cloud Applications Proceedings of ICBDCC 2021 J Dinesh Peter Steven Lawrence Fernandes Amir H Alavi Editors

Lecture Notes in Electrical Engineering 905

J. Dinesh Peter
Steven Lawrence Fernandes
Amir H. Alavi Editors

Disruptive
Technologies
for Big Data
and Cloud
Applications
Proceedings of ICBDCC 2021
Lecture Notes in Electrical Engineering

Volume 905

Series Editors

Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli
Federico II, Naples, Italy
Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán,
Mexico
Bijaya Ketan Panigrahi, Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, Munich, Germany
Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China
Shanben Chen, Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore,
Singapore, Singapore
Rüdiger Dillmann, Humanoids and Intelligent Systems Laboratory, Karlsruhe Institute for Technology,
Karlsruhe, Germany
Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China
Gianluigi Ferrari, Università di Parma, Parma, Italy
Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid,
Madrid, Spain
Sandra Hirche, Department of Electrical Engineering and Information Science, Technische Universität
München, Munich, Germany
Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA,
USA
Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Alaa Khamis, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt
Torsten Kroeger, Stanford University, Stanford, CA, USA
Yong Li, Hunan University, Changsha, Hunan, China
Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra,
Barcelona, Spain
Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore
Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany
Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA
Sebastian Möller, Quality and Usability Laboratory, TU Berlin, Berlin, Germany
Subhas Mukhopadhyay, School of Engineering & Advanced Technology, Massey University,
Palmerston North, Manawatu-Wanganui, New Zealand
Cun-Zheng Ning, Electrical Engineering, Arizona State University, Tempe, AZ, USA
Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Kyoto, Japan
Luca Oneto, Department of Informatics, Bioengineering, Robotics, University of Genova, Genova, Genova,
Italy
Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi “Roma Tre”, Rome, Italy
Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Gan Woon Seng, School of Electrical & Electronic Engineering, Nanyang Technological University,
Singapore, Singapore
Joachim Speidel, Institute of Telecommunications, Universität Stuttgart, Stuttgart, Germany
Germano Veiga, Campus da FEUP, INESC Porto, Porto, Portugal
Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Beijing, China
Walter Zamboni, DIEM - Università degli studi di Salerno, Fisciano, Salerno, Italy
Junjie James Zhang, Charlotte, NC, USA
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the
latest developments in Electrical Engineering - quickly, informally and in high
quality. While original research reported in proceedings and monographs has
traditionally formed the core of LNEE, we also encourage authors to submit books
devoted to supporting student education and professional training in the various
fields and application areas of electrical engineering. The series covers classical and
emerging topics concerning:
• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS
For general information about this book series, comments or suggestions, please
contact leontina.dicecco@springer.com.
To submit a proposal or request further information, please contact the Publishing
Editor in your country:
China
Jasmine Dou, Editor (jasmine.dou@springer.com)
India, Japan, Rest of Asia
Swati Meherishi, Editorial Director (Swati.Meherishi@springer.com)
Southeast Asia, Australia, New Zealand
Ramesh Nath Premnath, Editor (ramesh.premnath@springernature.com)
USA, Canada:
Michael Luby, Senior Editor (michael.luby@springer.com)
All other Countries:
Leontina Di Cecco, Senior Editor (leontina.dicecco@springer.com)
** This series is indexed by EI Compendex and Scopus databases. **

More information about this series at https://link.springer.com/bookseries/7818


J. Dinesh Peter · Steven Lawrence Fernandes ·
Amir H. Alavi
Editors

Disruptive Technologies
for Big Data and Cloud
Applications
Proceedings of ICBDCC 2021
Editors
J. Dinesh Peter
Department of Computer Science and Engineering
Karunya Institute of Technology and Sciences
Coimbatore, Tamil Nadu, India

Steven Lawrence Fernandes
Department of Computer Science
Creighton University
Omaha, NE, USA

Amir H. Alavi
Civil and Environmental Engineering
University of Pittsburgh
Pittsburgh, PA, USA

ISSN 1876-1100 ISSN 1876-1119 (electronic)


Lecture Notes in Electrical Engineering
ISBN 978-981-19-2176-6 ISBN 978-981-19-2177-3 (eBook)
https://doi.org/10.1007/978-981-19-2177-3

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface

This work comprises the proceedings of the International Conference on Big Data and
Cloud Computing (ICBDCC’21). This conference was organized with the primary
theme of promoting ideas that provide technological solutions to the big data and
cloud computing applications. ICBDCC provided a unique forum for the practitioners,
developers and users to exchange ideas and present their observations,
models, results and experiences with the researchers who are involved in real-time
projects that provide solutions for research problems of recent advancements in big
data and cloud computing technologies. In the last decade, a number of sophisticated
and new computing technologies have been developed. With the introduction of
new computing paradigms such as cloud computing, big data and other innovations,
ICBDCC provided a high-quality dissemination forum for new ideas, technology
focus, research results and discussions on the evolution of computing for the benefit
of both scientific and industrial developments. ICBDCC is supported by a panel of
reputed advisory committee members both from India and from all across the world.
This proceedings includes topics in the fields of big data, data analytics in cloud,
cloud security, cloud computing and big data and cloud computing applications.
The research papers featured in this proceedings provide novel ideas that contribute
to the growth of the society through computing technologies. The contents of this
proceedings will prove to be an invaluable asset to the researchers in the areas of big
data and cloud computing.
We appreciate the extensive time and effort put in by all the members of the
organizing committee for ensuring a high standard for the papers published in this
volume. We would like to express our thanks to the panel of experts who helped us to
review the papers and assisted us in selecting the candidate for the Best Paper Award.


We would like to thank the eminent keynote speakers who have shared their ideas
with the audience and all the researchers and academicians who have contributed
their research works, models and ideas to ICBDCC’21.

Coimbatore, India J. Dinesh Peter


Pittsburgh, PA, USA Amir H. Alavi
Omaha, NE, USA Steven Lawrence Fernandes
Contents

A Statistical Performance Analysis of GPU WAH Range Querying
Mitchell Nelson, Joseph M. Myre, and Jason Sawin

Anonymized Questionnaire Analysis with Differential Privacy for Large-Scale Crowdsourcing
Yuichi Sei and Akihiko Ohsuga

An Optimized K-means Clustering Approach on Top of MapReduce
Omar Abdul Wahab

A Framework to Preserve and Examine Pandemic-Healthcare-Data Using IoMT
Seifedine Kadry and Venkatesan Rajinikanth

AR Cloud-Based Indoor Navigation
V. Kiruthika, M. Jagadeeswari, Sneha Prabha, and Sreejaa

Health Record Maintenance Using Cloud Computing and Multi Authority Attribute-Based Encryption
S. Hamsanandhini, Malathi Eswaran, and V. Varanambika

Ensemble DNN for the Brain Tumor Segmentation—A Hybrid Framework Centric on Layer Level and Decision Level Fusion of Multimodal Medical Images
S. Sandhya, M. Senthil Kumar, and B. Chidhambararajan

Deep Learning-Based BDMSF Resource Sharing—A Systematic Approach for Analysis and Visualization
K. Elaiyaraja, M. Senthil Kumar, and B. Chidhambararajan

Mitigation and Swift Curative Procedure on Alluring Smart City Using Falcon Technology
G. S. Nivethini, R. Yokesh Srenevas, R. Rahfar Nisha, and G. Ignisha Rajathi

A Review of Security Analysis of Wearable Implantable Medical Devices Using Biometric Encryption
G. Arun Jeba Kumar and V. Evelyn Brindha

Blockchain for CCTV Surveillance
D. Dharani, K. Anitha Kumari, and R. Vasanthan

Intelligent Traffic Management System Using YOLO Machine Learning Model
B. Gomathi and G. Ashwin

IoT-Based Security Camera Bot Using Raspberry Pi
M. Rufus, J. John Paul, S. Merlin Gilbert Raj, and G. Shine Let

Performance Analysis of Different Deep Learning Models for Forest Fire Classification
S. Harshaw Kamal, R. K. Ragul Raj, T. Sabari, and R. Karthika

Secure GEDAR Routing Protocol for Underwater Data Collection Using WSN
R. Joshua Samuel Raj, Bhupender Singh, D. R. Ganesh, and N. Muthukumaran

A Comprehensive Review on Automatic Image Captioning Using Deep Learning
P. V. Kavitha and V. Karpagam

A GAN-Based Triplet FaceNet Detection Algorithm Using Deep Face Recognition for Autism Child
R. Joshua Samuel Raj, S. Anantha Babu, A. Jegatheesan, and V. M. Arul Xavier

Rapid Efficient Loss Less Color Image Compression Using RCT Technique and Hierarchical Prediction
R. Joshua Samuel Raj, T. Sudarson Rama Perumal, N. Muthukumaran, and D. R. Ganesh

Design and Development of Web-based Photoplethysmogram Signal Monitoring and Human Vital Parameters Measurement
W. S. Nimi, P. Subha Hency Jose, and R. Jegan

Smart Fetal Health Monitor
J. Anu Shilvya and P. Subha Hency Jose

Event Location Detection from Online Clustering Algorithms Using Geo-Tagged User Data in Social Streams
Bhuvaneswari Anbalagan

Smart Cyberbullying Detection with Machine Learning
Shakambhari, Joshua Samuel Raj, and S. Anantha Babu

Tourist Sentiment Analysis Using Natural Language Processing
T. N. Prabhu, S. Aarthi, and P. Nanthini

Sentiment Analysis of Twitter Data Using Machine Learning
K. Deepa, H. Sangita, and H. Shruthi

Comparative Study on Recognition of Food Item from Images for Analyzing the Nutritional Contents
E. S. Sreetha, G. Naveen Sundar, and D. Narmadha

Shot Boundary Detection and Video Captioning Using Neural Networks
Avantika Balaji, S. Ganesh, T. Abishek Balaji, and K. R. Sarath Chandran

Ensuring the Presence of a Person During Virtual Classes Using Histogram of Oriented Gradients (HOG) Algorithm
S. Nithya, M. Revathi, A. Sathiya Sree, T. Sivapriya, and P. Vaishnavi

Identification of Alzheimer’s Disease Using Principal Component Analysis-Based Data Mining Techniques
T. Jemima Jebaseeli, D. Jasmine David, and R. Emilin Renitta

Dew Computing-Inspired Mental Health Monitoring System Framework Powered by a Lightweight CNN
Tanusree Podder, Diptendu Bhattacharya, and Abhishek Majumdar

Feature Dimensionality Reduction Method on Social Network Dataset
R. Rajkumar

Internet of Things (IoT) for Coronavirus (COVID-19) Pandemic: A Survey on Trailblazing Techniques
Salomi Selvadass, J. John Paul, I. Thusnavis Bella Mary, and A. Diana Andrushia

Comparison of Stock Market Prediction Using Deep Learning Algorithms
S. Revathi, Regina Begam, Radhika, and R. Akila

Automatic Irrigation and Crop Protection System Based on IoT
M. Raja, N. M. Nithish, B. Saravana Shankar, and D. Sadhurwanth

Visual Question Answering System Using Co-attention Model
D. Karthika Renuka, L. Ashok Kumar, R. Geetha Rajakumari, R. Vinitha, R. Meena, and B. Swathi

Internet of Things and Cloud Computing for Smart Vermicomposting by Using Eisenia Fetida and Its Optimization by ANN
Amar Kumar Das, Saroja Kumar Rout, Srikanta Kumar Dash, and Abhijit Mangaraj

Time Series Analysis to Forecast Wind Speed
M. Sai Anand and R. Ramalakshmi

Software-Defined Network-Based Packet Keys to Secure Critical Infrastructures of Internet of Things
Antony Taurshia, Jaspher W. Kathrine, and S. Jebapriya

Automated Face Authentication and Recognition Using Deep Neural Network with SVM Classifier in Cloud Environment
T. Sujatha, N. R. Wilfred Blessing, Sruthi Anand, and Esther Daniel

A Data Sharing Protocol to Minimize Security and Privacy Risks of Cloud Storage in Big Data Era
S. Akila and Dhina Suresh

An IoT-Based Smart Device to Monitor and Analyse the Performance of Athletes
S. Revathi, V. Muthu Priya, C. A. Bhargavan, and Fatah Mohammed

Optical Character Recognition-Based Signboard Detection
N. Dinesh and Senthilkumar Mathi

IPL Win Prediction Using Machine Learning
P. V. Kavitha, R. Sai Pavitra, K. Suwetha, and P. Uvashree

HEIST DETECTOR: A Secured IOT-Based Real-Time System
R. Abinaya, S. Tharani, A. S. Arunachalam, and K. Suthendran

Blockchain-Based Decentralized E-Voting System Using Smart Contract
R. Priscilla, G. Jaspher Willsie Kathrine, M. Rubavathi, and T. Sruthi Krithika

Machine Learning-Based Diagnosis of Diseases Associated with Abnormal and Heavy Menstrual Bleeding: A Literature Review
P. Raji and P. Subha Hency Jose

Analysing the Resting-State Functional Connectivity of Chronic Pain Patients
V. Rejula, J. Anitha, and R. V. Belfin

Detection of Drowsiness Using Artificial Intelligence
J. Sri Nivetha and G. Aswini

Classifying Sleep Stages Automatically in Single-channel Against Multi-channel EEG: A Performance Analysis
B. L. Radhakrishnan, E. Kirubakaran, Immanuel Johnraja Jebadurai, and Kummari Gurudev

Design and Development of a Weather Forecasting Android Application
Nandini S. Hinduja, D. Jayashree, O. Pandithurai, A. R. Monica, and N. V. Keerthana

Enhanced Monotonic Activation Function in Convolutional Neural Network for Multiclass EEG Signal Classification
M. Bhuvaneshwari, E. Grace Mary Kanaga, and J. Anitha

BPSO-PSO-SVM: An Integrated Approach for Cancer Diagnosis
Amrutanshu Panigrahi, Santosini Bhutia, Bibhuprasad Sahu, Mohammad Gouse Galety, and Sachi Nandan Mohanty

Detecting the Lateral Movement in Cyberattack at the Early Stage Using Machine Learning Techniques
Ashwathy Anda Chacko, Bijolin Edwin, and M. Roshni Thanka

Deep Learning-Based Big Data Analytics Model for Activity Monitoring of Elderly People
M. Roshni Thanka, Sujitha Juliet, E. Bijolin Edwin, and R. Raahul John

Customized Internet of Things-Based Bus Tracking and Management System
K. Akilan, Anusha Chandrasekaran, Arunesh Kumar, S. B. Jashwaanth, A. Joshua, S. Rajalakshmi, and S. Angel Deborah

An Emerging Paradigm in IoT-Based Indoor Positioning System
Shilpa Shyam, Sujitha Juliet, and Kirubakarn Ezra

SMART CARAFE: An IoT-Based Real Time System
K. Suthendran and T. Subburaj

Deep Learning-Based Lung Cancer Detection
S. Mahima, S. Kezia, and E. Grace Mary Kanaga

Pneumonia Detection from Chest X-Ray Images Using Deep Learning Methods
C. Lenny, A. Ajitha Margharet, B. Shiny, Sabnam Tigga, and S. Thomas George

A Meta-Analysis on the Algorithms for Virtual Machine Consolidation
Rose Rani John and E. Grace Mary Kanaga

Implementation of Compression Technique for Endoscopy Video Using Intra-Coding HEVC
Suvarna Nandyal and Heena Kouser Gogi

Hybrid Multi-filter and Harmony Search Algorithm-Based Gene Selection Method for Cancer Classification
Bibhuprasad Sahu and Mohammad Gouse

COVID-19 Cases Prediction Using Different LSTM Models and Comparison of Effectiveness of Different Models
Essmily Simon, Swapna Sasi, and Aswathy Wilson

EEG-Based Home Automation System Using Brain Sense Device
Christina Saju, Samson T. Anil, and S. Thomas George

Fertilizer Recommendation System Using Machine Learning
D. Jayashree, O. Pandithurai, L. Paul Jasmin Rani, Praveena K. Menon, Mahek V. Beria, and S. Nithyalakshmi

Land Use and Land Cover Mapping of Landsat Image using Segmentation Techniques
M. Mohith and R. Karthi

Analyzing the Financial Soundness and Resilience of Select Small Finance Banks with RBI’s Big Data
T. Augustus Immanuel Pauldurai, J. Anitha, and M. Vijila

Enhancing Data Security for Sharing Personalized Data in Mobile Cloud Environments
G. Kavitha, P. Latchoumy, and A. Sonya

A Decision Support System for Scheduling Lockdown in COVID-19 Pandemic
Abirami Sreerenganathan and R. Joshua Samuel Raj

Simulation and Analysis of Intrusion Resilient Smart Metering System
Abhishek Patil, Adeesh Acharya, B. Y. Manthan, S. Nagasundari, and Prasad B. Honnavalli

Design and Development of Big Data Framework Using NoSQL–MongoDB and Descriptive Analytics of Indian Green Coffee Export Demand Modeling
Saivijayalakshmi Janakiraman and N. Ayyanathan

Two-Tier Securing Mechanism Against Web Application Attacks
Vikas Matam, H. S. Shankaranarayana Hebbar, Prince Jha, Amruthanshu Bhat, S. Nagasundari, and Prasad B. Honnavalli

IoT Enabled Detection and Notification System for Potholes and Road Cracks
N. Aishwarya, N. G. Praveena, C. Karthikeyan, and S. Priyanka

Constructing Pixel Picture Languages Using Cell-Like SN+ P System
Y. Preethi Ceon, Hepzibah A. Christinal, S. Jebasingh, and D. Abraham Chandy

A Decision Support System for Restricted Movement with Semaphores During Lockdown in COVID-19 Pandemic
Bushra Haqqi and R. Joshua Samuel Raj

Comparative Analysis of Deep Learning Models for Cotton Leaf Disease Detection
X. Anitha Mary, Kumudha Raimond, A. Peniel Winifred Raj, I. Johnson, Vladimir Popov, and S. J. Vijay

Facility Recommendation Based on Trajectory Clustering
S. Sharmila and B. A. Sabarish

COVID-19 Symptom Analysis and Prediction Using Machine Learning Techniques
S. Mahima, T. Mathu, and Kumudha Raimond

Segmentation of Streets and Buildings Using U-Net from Satellite Image
Jayasurya Subramanian, Senthil Kumar Thangavel, and Pasquale Caianiello

Satellite Pose Estimation Using Modified Residual Networks
M. Uma Rani, Senthil Kumar Thangavel, and Ravi Kumar Lagisetty

Opinion on Prediction Algorithms for Identifying Autism Spectrum Disorder
D. Darling Jemima, A. Grace Selvarani, and J. Daphy Louis Lovenia

Classification of Music Genres Based on Mel-Frequency Cepstrum Coefficients Using Deep Learning Models
Manoj Preetham, Jemimah Beulah Panga, J. Andrew, Kumudha Raimond, and Hien Dang
About the Editors

J. Dinesh Peter is currently working as an associate professor in the Department of
Computer Sciences Technology at Karunya University, Coimbatore. Before this,
he was a full-time research scholar at the National Institute of Technology, Calicut,
India, from where he received his Ph.D. in computer science and engineering. His
research focus includes big data, image processing, and computer vision. He has
several widely cited publications in reputed international journals and conference
proceedings. He is a member of IEEE, CSI and IEI and has served as
session chair and delivered plenary speeches at various international conferences
and workshops.

Steven Lawrence Fernandes is an assistant professor in the Department of Computer
Science at Creighton University, Omaha. After earning his Ph.D. in Electronics
and Communication Engineering at Karunya Institute of Technology and Science
and his Masters in Microelectronics at Manipal Institute of Technology, he began his
postdoctoral research at the University of Alabama, Birmingham. There, he worked
on NIH-funded projects. He also conducted postdoctoral research at the University
of Central Florida. This research included working on DARPA, NSF, and RBC-funded
projects. His publications include research articles in highly selective
artificial intelligence venues. Dr. Fernandes’s current area of research is focused on
using artificial intelligence techniques to extract useful patterns from big data. This
includes robust computer vision applications using deep learning and computer-aided
diagnosis using medical image processing.

Dr. Amir H. Alavi is an assistant professor in the Department of Civil and
Environmental Engineering and holds a courtesy appointment in the Department of
Bioengineering at the University of Pittsburgh. Dr. Alavi’s research interests include
structural health monitoring, multifunctional structures, advanced sensors, low-power
energy harvesting, and engineering informatics. His research activities involve the
implementation of self-sustained and multifunctional sensing and structural systems
enhanced by engineering informatics in the fields of civil infrastructure,
construction, aerospace, and biomedical engineering. Dr. Alavi has authored seven books
and over 200 publications in archival journals, chapters, and conference
proceedings. He has received several award certificates for his journal articles. He is among
the Google Scholar 200 Most Cited Authors in Civil Engineering, Web of Science
ESI’s World Top 1% Scientific Minds in 2018, and the Stanford University list of
Top 1% Scientists in the World 2020.
A Statistical Performance Analysis of
GPU WAH Range Querying

Mitchell Nelson, Joseph M. Myre, and Jason Sawin

1 Introduction

As science, industry, and entertainment continue to generate staggering amounts
of data, database managers have been seeking hardware/software hybrid solutions
to assist in processing the deluge of information. Bitmap indices are commonly
employed software solutions for large read-only scientific databases. Such indices
are coarse binary representations of the underlying data. As these representations are
generally sparse matrices, they are highly compressible using hybrid run-length com-
pression schemes. The compressed results of such schemes can be queried directly
using native hardware-enabled bitwise operators, significantly increasing query pro-
cessing efficiency.
Recent work has shown that porting some or all of the bitmap query processing to
graphical processing units (GPUs) can significantly increase efficiency [1, 2, 10, 11,
14, 15]. GPUs are massively parallel computational accelerators initially designed
to assist in the execution of graphic-intensive software packages. However, they are
now commonly used to execute a wide array of applications, including scientific
simulations, neural networks, etc. GPUs are now commonplace in many computing
systems. This ubiquity means that incorporating GPUs into query execution adds
benefits beyond their fast processing rate. The use of GPUs frees the CPU and system
RAM to manage less specialized processes. It also means that a more complete use of
the system resources is realized, thus saving money and reducing the environmental
cost of query processing.
This paper presents a statistical performance analysis of the GPU bitmap query
engine we introduced in Nelson et al. [11]. The engine was specifically designed to
process range queries over bitmaps compressed using Word Aligned Hybrid codes
(WAH) [17]. Here, we present an ANOVA analysis [7] applied to the results of a
rigorous empirical study of our engine. The products of this analysis indicate potential
features of the query framework that can be tuned to increase efficiency. Further, the
results provide guidance for the creation of a more generalized framework.

M. Nelson · J. M. Myre · J. Sawin (B)
Department of Computer and Information Sciences,
University of St. Thomas, Saint Paul, MN, USA
e-mail: jason.sawin@stthomas.edu
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. D. Peter et al. (eds.), Disruptive Technologies for Big Data and Cloud Applications,
Lecture Notes in Electrical Engineering 905,
https://doi.org/10.1007/978-981-19-2177-3_1

Table 1 Left table shows a relation which records the number of different models of cars sold in
a year. The right shows a potential bitmap index for that relation

Cars Sold           Model Bins          Volume Bins
Model     # sold    m0 m1 m2 m3 m4 m5   v0 v1 v2 v3
Tiago     51K       1  0  0  0  0  0    0  0  0  1
Nexon     48K       0  1  0  0  0  0    0  0  1  0
Alitroz   47K       0  0  1  0  0  0    0  0  1  0
Harrier   15K       0  0  0  1  0  0    0  1  0  0
Tigor     10K       0  0  0  0  1  0    0  1  0  0
Zest      220       0  0  0  0  0  1    1  0  0  0

2 Background

Bitmap Indices: Bitmap indices are created by binning the tuples of the relation
being indexed. First, each attribute domain is partitioned into bins, either one bin per
discrete value or one bin per range of values, depending on the domain. Then, each tuple is analyzed,
creating a row in the bitmap. The value of each attribute is inspected, and a 1 is
placed in the bitmap bin that corresponds to that value. The remaining bins in that
set are assigned 0.
A possible example of this binning process is shown in Table 1. The relation on
the left records the yearly sales of Tata car models. Since there are two attributes,
there are two sets of bins in the bitmap shown on the right. As the values of the Model
attribute are discrete, each value is assigned its own bin. These are the bins prefixed
with an m. A 1 in the m0 bin indicates the corresponding tuple had the Tiago
value for the Model attribute, m1 indicates a value of Nexon, and so on. The # sold
attribute holds values that fall in a continuum. Thus, it can be binned using range
bins. In this example, the bin v3 represents values ≥50 k, v2 represents the range
[20 k, 50 k), v1 is [10 k, 20 k), and v0 is <10 k. The first tuple in Cars Sold results
in a bitmap row with a 1 in m0, representing that it records the number of Tiagos sold.
All other m bins are assigned 0. Since there were over 50 k Tiagos, a 1 is placed
in v3, and the remaining v bins are assigned 0. This process is applied to all the
tuples in Cars Sold.
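
The binning walk-through above can be sketched in a few lines of Python. This is only an illustrative sketch: the data and bin boundaries come from the example in the text, while the names `volume_bin` and `build_bitmap` are hypothetical and do not come from the authors' implementation.

```python
# Illustrative sketch of the binning in Table 1 (not the authors' code).
CARS_SOLD = [("Tiago", 51_000), ("Nexon", 48_000), ("Alitroz", 47_000),
             ("Harrier", 15_000), ("Tigor", 10_000), ("Zest", 220)]

# Range bins for "# sold": v0 < 10k, v1 = [10k, 20k), v2 = [20k, 50k), v3 >= 50k.
VOLUME_EDGES = [10_000, 20_000, 50_000]


def volume_bin(sold):
    """Return the index of the range bin that holds `sold`."""
    return sum(sold >= edge for edge in VOLUME_EDGES)


def build_bitmap(rows):
    """Return one bitmap row (model bins followed by volume bins) per tuple."""
    models = [model for model, _ in rows]          # one discrete bin per model
    bitmap = []
    for model, sold in rows:
        m_bits = [int(model == m) for m in models]
        v_bits = [int(volume_bin(sold) == v) for v in range(len(VOLUME_EDGES) + 1)]
        bitmap.append(m_bits + v_bits)
    return bitmap


bitmap = build_bitmap(CARS_SOLD)
for (model, _), row in zip(CARS_SOLD, bitmap):
    print(f"{model:8s}", *row)
```

Each printed row has exactly one 1 among the m bins and one 1 among the v bins, which is why the resulting matrix is sparse.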
A significant benefit of bitmap indices is the ability to query the bitmap directly
using hardware-enabled bitwise operations. Again, consider the example shown in
Table 1. An answer to the query “Which models had a sales volume between 15 k and
55 k cars?” can be derived by bitwise ORing columns of the bitmap. Specifically,
v1 ∨ v2 ∨ v3 = R, where R is the result vector of the operation. Each row in R that
contains a 1 corresponds to a tuple in the database that has a # sold value that needs
to be further inspected to see if it falls into the exact range.
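
As a rough illustration of the query above, the following Python sketch stores each volume column of Table 1 as an integer and ORs the columns together. The helper `candidates` is hypothetical, and treating the range as inclusive at both ends is an assumption, since the text does not say.

```python
# Illustrative sketch: answer a range query by ORing bitmap columns.
# Columns are stored as Python ints; the leftmost bit is the first row of Table 1.
MODELS = ["Tiago", "Nexon", "Alitroz", "Harrier", "Tigor", "Zest"]
SOLD = [51_000, 48_000, 47_000, 15_000, 10_000, 220]

# Volume columns copied from Table 1, top row first.
V1 = 0b000110   # [10k, 20k)
V2 = 0b011000   # [20k, 50k)
V3 = 0b100000   # >= 50k


def candidates(result, n_rows):
    """Return the row numbers whose bit is set in the result vector."""
    return [i for i in range(n_rows) if (result >> (n_rows - 1 - i)) & 1]


R = V1 | V2 | V3                      # one bitwise OR per additional bin
rows = candidates(R, len(MODELS))     # candidate tuples, may include false positives

# Bins v1 and v3 only partly overlap [15k, 55k], so the raw values are re-checked.
answer = [MODELS[i] for i in rows if 15_000 <= SOLD[i] <= 55_000]
print(answer)
```

The re-check step removes Tigor, whose value falls in v1 but below the lower bound of the query.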
The other main benefit of bitmap indices is their compressibility. The matrices
generated by the binning process are generally very sparse. As such, the columns
of the bitmaps are amenable to run-length compression. Numerous compression
techniques have been developed for bitmap indices. One of the most prominent is
Word Aligned Hybrid codes (WAH) [17]. WAH uses a hybrid encoding of literal
values and compressed homogeneous runs tuned to the system’s word length. It
has been shown that WAH can achieve tremendous compression with the added
benefit that the compressed results can be queried directly without requiring explicit
decompression.
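
To make the hybrid encoding concrete, here is a simplified Python sketch of WAH with 32-bit words, assuming the usual layout in which a literal word carries 31 payload bits and a fill word carries a flag bit, a fill value bit, and a 30-bit run count. The function names are hypothetical, and details such as how a single homogeneous group is stored may differ from the scheme in [17].

```python
# Illustrative sketch of 32-bit WAH encoding (simplified; not the authors' code).
GROUP = 31                       # payload bits per 32-bit word
ALL_ONES = (1 << GROUP) - 1
FILL_FLAG = 1 << 31              # most significant bit marks a fill word
FILL_VALUE = 1 << 30             # next bit holds the value of the run
MAX_RUN = (1 << 30) - 1          # run length is counted in 31-bit groups


def wah_encode(groups):
    """Compress a list of 31-bit groups into a list of 32-bit WAH words."""
    words = []
    for g in groups:
        if g in (0, ALL_ONES):
            fill = FILL_FLAG | (FILL_VALUE if g else 0)
            # Extend the previous fill word when it has the same fill value.
            if words and (words[-1] & ~MAX_RUN) == fill and (words[-1] & MAX_RUN) < MAX_RUN:
                words[-1] += 1
            else:
                words.append(fill | 1)
        else:
            words.append(g)      # literal word: flag bit is 0
    return words


def wah_decode(words):
    """Expand WAH words back into the original list of 31-bit groups."""
    groups = []
    for w in words:
        if w & FILL_FLAG:
            value = ALL_ONES if w & FILL_VALUE else 0
            groups.extend([value] * (w & MAX_RUN))
        else:
            groups.append(w)
    return groups


groups = [0] * 1000 + [5] + [ALL_ONES] * 3
words = wah_encode(groups)
print(len(groups), "groups stored in", len(words), "words")
```

A sparse column is dominated by long runs of zero groups, so most of it collapses into a handful of fill words.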
Graphics Processing Units (GPU): NVIDIA’s CUDA provides a programming
platform for NVIDIA GPUs. With CUDA, thousands of threads can be organized
into Cartesian structures (one-, two-, or three-dimensional). These structures intuitively map
to many computational problems. In CUDA, threads are organized hierarchically as
threads, thread blocks, and thread grids. Threads are executed in warps (groups of 32
threads). Thus, thread blocks are most often a multiple of 32 threads. The memory
hierarchy present in NVIDIA GPUs consists of global memory, L2 cache, L1/shared
memory, and registers, in descending order of access time. The shared memory can
act as a fast access memory for data storage on chip.
GPU Range Queries: NVIDIA GPUs have been fundamental in enhancing the per-
formance of WAH queries [1, 2, 10, 11, 14, 15]. All of these approaches transfer
the bins to the GPU in their compressed form. To fully make use of the parallel
processing power of the GPUs, the bins are then decompressed. In this paper, we
will focus on the highest performing method published thus far, the ideal hybrid
approach of Nelson et al. [10]. The hybrid query approach computes a range query
of the form A1 ∨ . . . ∨ An, where Ai is a decompressed bitmap bin, using a parallel
reduction. Parallel reductions exploit the presence of independent operations
to enhance performance. For example, R3 = A5 ∨ A6 and R4 = A7 ∨ A8 could be
solved simultaneously. Given enough parallel threads, this reduces the number of
sequential steps from O(n) to O(log n), where n is the number of bins in the query.
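
The shape of such a reduction can be sketched sequentially in Python. This is a stand-in for illustration only: the function `or_reduce` is hypothetical, it runs on the CPU, and it merely counts the levels that a GPU could execute in parallel.

```python
# Illustrative sketch of a tree-shaped OR reduction (sequential stand-in for the GPU).
def or_reduce(bins):
    """OR together a non-empty list of bit vectors, pairing neighbours level by level."""
    level = list(bins)
    steps = 0
    while len(level) > 1:
        # Every pair in this comprehension is independent, so a GPU could
        # evaluate all of them at the same time.
        nxt = [level[i] | level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an unpaired bin is carried to the next level
            nxt.append(level[-1])
        level = nxt
        steps += 1
    return level[0], steps


bins = [1 << i for i in range(8)]      # eight toy bins, one set bit each
result, steps = or_reduce(bins)
print(bin(result), steps)
```

With eight bins the loop runs three levels instead of seven sequential ORs, which is the logarithmic behavior described above.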

3 Evaluation Methodology

In this section, the testing methodology used to produce our results is described. All
tests were executed on a machine running Ubuntu 16.04.5 LTS equipped with dual
8-core Intel Xeon E5-2609 v4 CPUs (each at 1.70 GHz) and 32 GB of RAM. All tests
were developed using CUDA v9.0.176 and run on two different GPUs: an NVIDIA
GeForce GTX 1080 with 8 GB of memory and an NVIDIA Titan X with 12 GB of
memory.
Parameters related to the GPU implementation and the characteristics of the
queried dataset can affect the performance of WAH range queries on GPUs. Varying
these factors can induce a nonlinear response in performance. This necessitates
analysis to determine the parameters that reliably predict the performance of WAH
range queries on GPUs.
We examine two sets of performance parameters. The first set is intrinsic to the
architectural implementation of the GPU WAH range-query algorithm. These include
the use of shared memory, CUDA streams for decompression, CUDA streams for query
execution, GPU base clock frequency, and global memory bandwidth. The second set
of parameters is intrinsic to the dataset/query. These parameters include the number of
bins being queried, the number of rows in the dataset, and the dataset compressibility.
We use the ideal hybrid GPU-based range-query method described in Nelson et
al. [10] to execute range queries over 4, 8, 16, 32, and 64 random bit vectors.
Each experiment was run six times, and the execution time of each was recorded.
To remove transient program behavior, the first result is discarded and the remaining
execution times are used for analysis.
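
The measurement protocol above (six runs, first one discarded) might look like the following Python sketch. The function `time_query` and the stand-in workload are hypothetical; the actual experiments time GPU queries, which this sketch does not attempt to reproduce.

```python
import time


def time_query(run_query, repeats=6, warmup=1):
    """Time `run_query` `repeats` times and drop the first `warmup` measurements."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return samples[warmup:]          # the discarded run absorbs transient start-up effects


kept = time_query(lambda: sum(range(10_000)))   # stand-in workload, not a real query
print(len(kept), sum(kept) / len(kept))
```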

3.1 Dataset Creation

We use synthetic Zipf datasets for our evaluation. These datasets are created
using a Zipf distribution, which represents a clustered approach to discretization.
This essentially creates a skewed distribution of 1’s in our bitmap, simulating the
type of distribution seen in real data. The Zipf distribution generator assigns each bit a
probability of
\[ p(k, n, skew) = \frac{1/k^{skew}}{\sum_{i=1}^{n} 1/i^{skew}}, \]
where n is the number of
elements determined by cardinality, k is their rank, and the coefficient skew creates
an exponentially skewed distribution. We set k = 10, n = 10 and set skew = 0, 1, 2,
and 3. These different skew values create datasets of varying bit density. Using these
parameters, we generate 16 different datasets containing 100 bins (i.e., ten attributes
discretized into ten bins each) and 8, 16, 32, and 64 million rows.
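
The probability formula can be evaluated directly, as in the Python sketch below. The generator `zipf_attribute` is a hypothetical illustration that assumes the bins of one attribute are ranked 1 through n and that each row selects exactly one bin; the authors' generator may work differently.

```python
import random


def zipf_probability(k, n, skew):
    """p(k, n, skew) = (1 / k**skew) / sum_{i=1..n} (1 / i**skew)."""
    return (1.0 / k ** skew) / sum(1.0 / i ** skew for i in range(1, n + 1))


def zipf_attribute(n_rows, n_bins, skew, seed=0):
    """Pick one bin per row; the bin of rank k is chosen with probability p(k, n_bins, skew)."""
    rng = random.Random(seed)
    weights = [zipf_probability(k, n_bins, skew) for k in range(1, n_bins + 1)]
    return rng.choices(range(n_bins), weights=weights, k=n_rows)


for skew in (0, 1, 2, 3):
    probs = [zipf_probability(k, 10, skew) for k in range(1, 11)]
    print(skew, [round(p, 3) for p in probs])
```

With skew = 0 every bin is equally likely, and larger skew values concentrate the 1's in the lowest-ranked bins.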

3.2 Statistical Performance Analysis

We use a statistical ANOVA [7] approach to analyze our test results. This approach
quantifies the impact of each factor (from the beginning of this section) on perfor-
mance. An ANOVA analysis determines whether a statistically significant difference
exists among the means of each test. This is done by separating the total observed mea-
surement variation into two components: (1) the variation within a system (assumed
to be measurement error) and (2) the variation between systems (assumed to be due
to both actual differences between systems and to measurement error). Statistically
significant differences between systems are determined via an F-test that compares
variances across the systems.
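
As a minimal sketch of the decomposition just described, the following Python code computes the between-system variation (SSA), the within-system variation (SSE), and the F statistic for a one-factor case. The function name and the timing values are hypothetical and are not taken from the experiments.

```python
def one_way_anova(systems):
    """Return (SSA, SSE, F) for equally sized lists of measurements, one list per system."""
    k, n = len(systems), len(systems[0])
    grand = sum(map(sum, systems)) / (k * n)
    means = [sum(s) / n for s in systems]
    ssa = n * sum((m - grand) ** 2 for m in means)                       # between systems
    sse = sum((y - m) ** 2 for s, m in zip(systems, means) for y in s)   # within systems
    f_stat = (ssa / (k - 1)) / (sse / (k * (n - 1)))
    return ssa, sse, f_stat


# Hypothetical execution times (ms) for two configurations, three runs each.
ssa, sse, f_stat = one_way_anova([[10.0, 11.0, 12.0], [20.0, 21.0, 22.0]])
print(ssa, sse, f_stat)
```

The computed F value is then compared against a critical value of the F distribution with (k - 1, k(n - 1)) degrees of freedom to decide significance.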

An m-factor ANOVA analysis is required as more than two factors are present in
the sets of dataset and architectural factors outlined at the beginning of this section.
When two or more factors are present in an ANOVA analysis, the interactions between
factors can be considered. Interactions are important to consider as the combination
of factors can be more impactful than simply summing the impact of each factor
independently (the whole can be greater than the sum of the parts). The impact of
each unique factor and interaction of factors are called effects. From the results of
the ANOVA analysis, it is possible to compute the percent impact of each effect by
forming the ratio of total variation in measurement due to each effect to the sum of
total variation of all effects and measurement errors.
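
The percent impact computation can be illustrated for the two-factor case with replicated measurements. The Python sketch below is an assumption-laden example: the function `percent_impact` is hypothetical, and the timing values are invented to mimic a shared-memory factor and a row-count factor, not taken from the study.

```python
def percent_impact(y):
    """y[i][j] is the list of replicated measurements for level i of A and level j of B."""
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n for j in range(b)] for i in range(a)]
    row = [sum(cell[i]) / b for i in range(a)]
    col = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(row) / a
    ss = {
        "A": b * n * sum((r - grand) ** 2 for r in row),
        "B": a * n * sum((c - grand) ** 2 for c in col),
        "AB": n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                      for i in range(a) for j in range(b)),
        "error": sum((v - cell[i][j]) ** 2
                     for i in range(a) for j in range(b) for v in y[i][j]),
    }
    total = sum(ss.values())
    return {name: 100.0 * value / total for name, value in ss.items()}


# Hypothetical times: A = shared memory (off, on), B = rows (few, many), two runs each.
times = [[[4.0, 4.2], [9.0, 9.2]],
         [[1.0, 1.2], [2.0, 2.2]]]
impact = percent_impact(times)
for name, pct in impact.items():
    print(f"{name:5s} {pct:6.2f}%")
```

The "AB" entry is the interaction effect: it is nonzero here because switching on shared memory helps more when there are many rows than when there are few.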

4 Results

Here, we present the results of the experiments and ANOVA analyses described
above. We first present the results of the ANOVA analysis of all factors, then analyses
of factors linked strictly to architectural details and dataset details. All results are
reported as a rank ordering of percent impact of each effect.
Rankings of effects for all factors are shown in Fig. 1. The two most significant
factors contributing to variations in performance are the use of shared memory on the
GPU and the number of rows in the dataset, accounting for 40.26% and 27.32% of the
overall variation in performance, respectively, and the interaction thereof accounting
for 20.66% of the overall variation in performance. The sum of the stand-alone and
interaction effects of these two factors accounts for 88.24% of the overall variation
in performance. All remaining effects each account for less than 3.5% of the overall
variation in performance.
Rankings of effects for architecturally linked factors are shown in Fig. 2. The two
most significant architectural factors are the use of shared memory on the GPU and the
base clock rate of the GPU, accounting for 90.79% and 5.04% of the overall variation
in performance, respectively, and the interaction thereof accounting for 3.2% of the
overall variation in performance. The sum of the stand-alone and interaction effects
of these two factors accounts for 99.03% of the overall variation in performance.
The sum of all remaining effects accounts for less than 1% of the overall variation
in performance.
Rankings of effects for dataset linked factors are shown in Fig. 3. The two most
significant factors associated with the dataset are the number of rows in the dataset
and the number of columns in the query, accounting for 63.06% and 23.68% of the
overall variation in performance, respectively, and the interaction thereof accounting
for 12.17% of the overall variation in performance. The sum of the stand-alone and
interaction effects of these two factors accounts for 98.81% of the overall variation
in performance. The sum of all remaining effects accounts for just over one percent
of the overall impact on performance.

Fig. 1 Five most influential factors to variations in performance ranked by percent of overall
variation

Fig. 2 Four most influential architectural factors to variations in performance ranked by percent
of overall variation

Fig. 3 Four most influential dataset and query factors to variations in performance ranked by
percent of overall variation

5 Discussion of Results

As seen in Fig. 1, the two most significant factors are the use of shared memory and
the number of rows (88.24% of the total variation in performance). Interestingly,
bit density has no significant effect on performance for the GPU query method
tested here. Figure 4 presents the approximate profiles of GPU execution time when
varying the two most significant factors, which demonstrate these effects in practice.
As shown, (A) does not use shared memory and has a large number of rows, (B) does
not use shared memory and has a small number of rows, (C) uses shared memory
and has a large number of rows, and (D) uses shared memory and has a small number
of rows.
Query performance derived from the primary factors (use of shared memory and
the number of rows) can also be visualized in Fig. 5. Following the arrows in this
figure results in performance enhancement, for example, by using shared memory,
decreasing the total number of rows, or both. We can see the importance of examining
interactions of factors in this figure, as exploiting both factors to gain performance
is more beneficial than only using one.

Fig. 4 Shown are profiles of four query executions, labeled (A) to (D), when the two most
significant factors to performance (the use of shared memory and the number of rows in the
database) are varied

Fig. 5 Effects of the three most significant factors (the use of shared memory on the GPU, the
number of rows in the dataset, and the interaction thereof) on execution time

6 Related Work

GPUs and CUDA have enabled the acceleration of many general-purpose computing
problems. Many times, these come from a focus on core mathematical routines [6, 12,
13] or parallel programming primitives [5, 8]. With these, researchers have been able
to accelerate numerous other applications including computational fluid dynamics
models [3], finite element methods [16], and traditional relational databases [4].
Among the many tools that are useful for performance analysis, ANOVA has been used
to identify critical factors to focus on for performance enhancements [7, 9].

7 Conclusion and Future Work

In this paper, we present a statistical performance analysis of WAH range queries
using GPUs and the method of Nelson et al. [10]. We use an ANOVA analysis to
quantify the effects of architectural and dataset factors on the variation of overall
performance. Focusing only on architectural factors, our analysis finds that the use
of shared memory and the rate of the base clock are responsible for 99.03% of the
total variation in performance. Focusing only on dataset factors, our analysis finds
that the number of rows and the number of columns are responsible for 98.81% of the
total variation in performance. When considering all factors, our analysis finds that
the use of shared memory and the total number of rows account for 88.24% of the
total variation in performance.
These results suggest that for GPU WAH range queries, the use of shared memory
is critical for performance. Secondary to shared memory is the rate of the base clock.
Intuitively, our results also show that datasets with more rows will be slower to query
than datasets with fewer rows, all else being equal.
We plan to continue investigating the performance space of the hybrid method.
This includes the effect of additional data characteristics, data layout in shared mem-
ory, and the distribution of data/queries across multiple GPUs.

References

1. W. Andrzejewski, R. Wrembel, GPU-WAH: applying GPUs to compressing bitmap indexes
with word aligned hybrid, in International Conference on Database and Expert Systems Applications
(Springer, Berlin, Heidelberg, 2010), pp. 315–329
2. W. Andrzejewski, R. Wrembel, GPU-PLWAH: GPU-based implementation of the PLWAH
algorithm for compressing bitmaps. Control Cybern. 40, 627–650 (2011)
3. P. Bailey, J. Myre, S.D. Walsh, D.J. Lilja, M.O. Saar, Accelerating lattice Boltzmann fluid flow
simulations using graphics processors, in 2009 International Conference on Parallel Processing
(IEEE, 2009), pp. 550–557
4. P. Bakkum, K. Skadron, Accelerating SQL database operations on a GPU with CUDA, in Proceedings
of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units
(ACM, New York, NY, 2010), GPGPU-3, pp. 94–103. http://doi.acm.org/10.1145/1735688.1735706
5. N. Bell, J. Hoberock, Thrust: a productivity-oriented library for CUDA, in GPU Computing
Gems Jade Edition (Elsevier, 2012), pp. 359–371
6. J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, I. Yamazaki, Accelerating
numerical dense linear algebra calculations with GPUs, in Numerical Computations with GPUs
(2014), pp. 1–26
7. D.J. Lilja, Measuring Computer Performance: A Practitioner’s Guide (Cambridge University
Press, 2005)
8. D. Merrill, CUB: CUDA unbound (2016). http://nvlabs.github.io/cub
9. J. Myre, S.D. Walsh, D. Lilja, M.O. Saar, Performance analysis of single-phase, multiphase, and
multicomponent lattice-Boltzmann fluid flow simulations on GPU clusters. Concurr. Comput.
Pract. Exp. 23(4), 332–350 (2011)
10. M. Nelson, Z. Sorenson, J.M. Myre, J. Sawin, D. Chiu, GPU acceleration of range queries over
large data sets, in 2019 IEEE/ACM International Conference on Big Data Computing, Applications,
and Technologies (BDCAT19) (IEEE/ACM, 2019). https://doi.org/10.1145/3365109.3368789
11. M. Nelson, Z. Sorenson, J.M. Myre, J. Sawin, D. Chiu, Parallel acceleration of CPU and GPU
range queries over large data sets. J. Cloud Comput. 9(1), 1–21 (2020)
12. S. Tomov, J. Dongarra, M. Baboulin, Towards dense linear algebra for hybrid GPU accelerated
manycore systems. Parallel Comput. 36(5–6), 232–240 (2010a)
13. S. Tomov, R. Nath, H. Ltaief, J. Dongarra, Dense linear algebra solvers for multicore with GPU
accelerators, in 2010 IEEE International Symposium on Parallel & Distributed Processing,
Workshops and Phd Forum (IPDPSW) (IEEE, 2010b), pp. 1–8
14. B. Tran, B. Schaffner, J. Sawin, J.M. Myre, D. Chiu, Increasing the efficiency of GPU bitmap
index query processing, in International Conference on Database Systems for Advanced Appli-
cations (Springer, 2020), pp. 339–355
15. B. Tran, B. Schaffner, J.M. Myre, J. Sawin, D. Chiu, Exploring means to enhance the efficiency
of GPU bitmap index query processing. Data Sci. Eng. 6(2), 209–228 (2021)
16. S.D. Walsh, M.O. Saar, P. Bailey, D.J. Lilja, Accelerating geoscience and engineering system
simulations on graphics hardware. Comput. Geosci. 35(12), 2353–2364 (2009)
17. K. Wu, E.J. Otoo, A. Shoshani, Compressing bitmap indexes for faster search operations, in
Proceedings 14th International Conference on Scientific and Statistical Database Management
(IEEE, 2002), pp. 99–108
Anonymized Questionnaire Analysis
with Differential Privacy for Large-Scale
Crowdsourcing

Yuichi Sei and Akihiko Ohsuga

1 Introduction

Several studies for crowdsourcing have been proposed recently [3, 4, 18]. In crowd-
sourcing, a requester posts tasks, such as questionnaires, programming, and proof-
reading, and a worker selects and conducts a task while considering the complexity
and fee of the task. Crowdsourcing systems receive money from requesters in advance
and pay fees to workers for completed tasks. Therefore, personally identifiable infor-
mation (PII) of each requester and worker is registered to crowdsourcing systems.
In this article, we focus on questionnaires as tasks. A requester posts a question-
naire, which a worker can answer if he/she wishes to. Because the crowdsourcing
system has the PII of workers, it is preferable that workers send the answers of the
questionnaire to the requester directly when questionnaires need sensitive informa-
tion of workers, such as salary and religion. However, it is possible that requesters
can identify workers. Most questionnaires contain questions about the answerers’
attributes, such as age and sex, because the requester analyzes the results of the ques-
tionnaire in terms of the attributes of the answerers in many cases. The requester may
identify the worker from these basic items of questionnaires, such as age and sex.
Sweeney [17] found that 87% of the US population is uniquely identified by {date of
birth, sex, 5-digit ZIP}. Of course, other attributes can also be used for identification.
Rocher et al. [14] reported that 99.98% of Americans can be identified using 15
attributes. This problem can cause workers to avoid answering questionnaires.

Y. Sei (B) · A. Ohsuga
The University of Electro-Communications, Tokyo, Japan
e-mail: seiuny@uec.ac.jp
A. Ohsuga
e-mail: ohsuga@uec.ac.jp
Y. Sei
JST, PRESTO, Kawaguchi, Saitama, Japan

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. D. Peter et al. (eds.), Disruptive Technologies for Big Data and Cloud Applications,
Lecture Notes in Electrical Engineering 905,
https://doi.org/10.1007/978-981-19-2177-3_2

Moreover, requesters can obtain other information about workers from crowdsourcing
systems. The requester can see the IDs of workers in crowdsourcing systems
because the requester must check the work result of each worker and tell the
crowdsourcing system that the requester agrees to pay the agreed fee for the work if the
requester is satisfied with it. Based on the IDs, requesters can check several attribute
values of workers registered in the crowdsourcing system, such as state/province of
address, career, and skills. This information can also be used to identify workers. In
this study, we assume that a worker agent exists in each worker’s personal computer
or smartphone. The worker agent anonymizes the worker’s attributes.
Our previous work can collect information from workers while protecting their
privacy. However, it assumes that the number of answers to be collected is only one;
moreover, it assumes that the privacy-protection level among workers should be the
same. In this study, we propose a technique that can collect multiple answers under
the assumption that each worker can set a different privacy-protection level for each
answer.
The rest of this article is organized as follows. In Sect. 2, we introduce our appli-
cation and attack model. In Sect. 3, we define the privacy used in this work. In Sect. 4,
we describe the related methods. We present the design of the proposed algorithm in
Sect. 5, and we present the simulation results in Sect. 6. In Sect. 7, we conclude the
paper.

2 Assumptions

We assume that requesters want to analyze the results of questionnaires containing
several pieces of sensitive information, such as salary and religion. For instance, a requester
may want to know the relationship among educational background, age, sex, and
salary. The requester can post a questionnaire that contains questions about these
attributes. If we can ensure that the requester cannot identify workers from educational
background, age, sex, and other public attribute values registered in the
crowdsourcing system, workers can report their salary without worry. However,
if we cannot ensure that, workers will hesitate to answer this questionnaire.
In addition, we assume that all sensitive attribute values are categorical. In
many cases, analyzers (i.e., requesters in this study) create a histogram or a cross
tabulation of the data for their analysis. Therefore, analyzers do not need detailed
numerical values. If analyzers want to analyze numerical values, such as salary, they
categorize them in advance. For example, category 1 covers values from 0 to 100, and
category 2 covers values from 100 to 200.
We assume that requesters generate cross tabulations for a basic analysis. In practice,
the applications of cross tabulations involve fewer than four variables at a time
[8, 19] even if the requesters collect information about many attributes. For instance,
even if a questionnaire has questions about age, sex, education, hobby, salary, and
disease, it is difficult to analyze a cross tabulation created from all attributes at
once. Usually, analyzers create several cross tabulations from selected attributes for
each purpose.
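
A small Python sketch may help picture the categorization and cross tabulation assumed above. The answers are invented, the helper `salary_category` is hypothetical, and treating each category as a half-open interval is an assumption, since the text leaves the boundary value 100 ambiguous.

```python
from collections import Counter

# Hypothetical answers: (age group, sex, salary) per worker.
answers = [("20s", "F", 150), ("20s", "M", 90), ("30s", "F", 180),
           ("30s", "F", 40), ("20s", "M", 120)]


def salary_category(salary, width=100):
    """Map a salary to a 1-based category: 1 covers [0, 100), 2 covers [100, 200), ..."""
    return salary // width + 1


# Cross tabulation over two selected attributes: age group x salary category.
crosstab = Counter((age, salary_category(salary)) for age, _, salary in answers)
for (age, category), count in sorted(crosstab.items()):
    print(age, category, count)
```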
Further, we assume that the crowdsourcing system has the PII of workers. Therefore,
workers do not send their answers to the questionnaire to the crowdsourcing system
but send them to the requester after they are anonymized. We assume that requesters
are semi-honest entities. That is, the requesters follow the proposed protocol but try
to infer individual information from each piece of disguised data.

3 Privacy Metric

There are many important privacy metrics, such as k-anonymity [2, 10] and l-diversity
[11, 13]. In this study, we use differential privacy [6]; in particular, we focus on local
differential privacy [5], which has been widely studied recently.
Definition 1 [ε-local differential privacy] Let X be a set of sensitive values. A
randomized mechanism A satisfies ε-local differential privacy if for any x, x′ ∈ X
and y ∈ Y ⊂ Range(A),

\[ P(A(x) = y) \le e^{\varepsilon} \, P(A(x') = y). \qquad (1) \]
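
To see what inequality (1) demands of a mechanism, consider k-ary randomized response, a standard textbook mechanism that is used here purely as an illustration; it is not the S2Mb method of this paper. The function names in the Python sketch below are hypothetical.

```python
import math
import random


def response_probabilities(epsilon, n_options):
    """Return (p_keep, p_other): chance of reporting the true option vs. each other option."""
    denom = math.exp(epsilon) + n_options - 1
    return math.exp(epsilon) / denom, 1.0 / denom


def randomize(true_option, options, epsilon, rng=random):
    """Report the true option with probability p_keep, otherwise a uniformly chosen other one."""
    p_keep, _ = response_probabilities(epsilon, len(options))
    if rng.random() < p_keep:
        return true_option
    return rng.choice([o for o in options if o != true_option])


p_keep, p_other = response_probabilities(1.0, 4)
# The largest ratio between output probabilities for two inputs is p_keep / p_other.
print(p_keep / p_other, math.exp(1.0))
print(randomize("B", ["A", "B", "C", "D"], epsilon=1.0))
```

Because the worst-case ratio p_keep / p_other equals e^ε by construction, any reported option is at most e^ε times more likely under one true answer than under another, which is the condition in Definition 1.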

We consider that each question is a database with only one record having only one
column. Each worker can set a privacy level for each question. Moreover, a requester
can set different fees for different questions and privacy levels. For instance, the
fees of questions about gender with high- and low-privacy levels are 2 and 3 cents,
respectively, and the fees of questions about disease names with high- and low-privacy
levels are 5 and 10 cents, respectively.
Let q be the number of questions in a questionnaire. When the databases (i.e.,
the answers) of a worker satisfy ε1-, …, εq-differential privacy, respectively, the set of
answers satisfies (ε1 + · · · + εq)-differential privacy [9]. Workers can consider not only
the privacy level of each answer but also the privacy level of the set of all answers if
needed.
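
The composition property amounts to adding up the per-question levels, as the short Python sketch below shows. The question names and ε values are hypothetical and chosen only for illustration.

```python
def total_epsilon(levels):
    """Sequential composition: the answers together satisfy sum(levels)-differential privacy."""
    return sum(levels)


# Hypothetical per-question privacy levels chosen by one worker.
chosen = {"gender": 1.0, "salary": 0.1, "disease": 0.1}
budget = total_epsilon(chosen.values())
print(f"overall privacy level: epsilon = {budget:.2f}")
```

A worker agent could compare this total against a ceiling the worker is comfortable with before submitting the answers.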

4 Related Work

Many privacy-preserving data collection methods using differential privacy have
been proposed. We previously proposed single to randomized multiple dummies with Bayes
(S2Mb) [16]. In S2Mb, we assumed that there was only one value to be
collected from participants, and the privacy-protection level was common among all
participants. In this study, we use S2Mb as a basic approach for data collection.
There are several other methods for privacy-preserving data collection, such as [1,
7, 12]. Erlingsson et al. [7] proposed a randomized aggregatable privacy-preserving
ordinal response that provides local differential privacy, and it was implemented in
Google Chrome. Murakami and Kawamoto [12] assumed that there were sensitive
and nonsensitive data and proposed a local differential privacy mechanism that could
enhance data utility by protecting only sensitive data.

5 Proposed Method

5.1 System Overview

An overview of the proposed system is shown in Fig. 1. Each worker can determine
his/her privacy-protection level (i.e., the value of ε). Because it is difficult for a
layperson to understand the meaning of ε, we assume that the requester can determine
several values of ε in advance. For example, the requester can prepare four levels:
nonanonymization (ε = ∞), high-anonymization (ε = 0.1), middle-anonymization
(ε = 1), and low-anonymization (ε = 10). How to determine the privacy-protection
level is outside the scope of this study.

5.2 The Worker Protocol

The worker protocol is based on S2Mb [16], although S2Mb does not assume multiple
data collection and different privacy levels. Let Fi be the number of options of
question i of a questionnaire, Si be the set of options of question i, and ti,j be the
selected option of the true answer of worker wj for question i.
When worker wj decides to answer a questionnaire of the crowdsourcing,
he/she first specifies privacy levels ε1,j, . . . , εq,j for each of the q questions of the
questionnaire. Then, he/she calculates a set of parameters si,j and pi,j for all i = 1, . . . , q
on the basis of εi,j and Fi, using the following equation:

Fig. 1 Overview of the proposed system


Another random document with
no related content on Scribd:
Monazite sands exist on the Brazilian coast, probably in larger
quantities than in all the rest of the world. In 1910 Germany imported
$1,000,000 worth. The thorium in the sands, used in the
manufacture of gas mantles, is extracted in Brazilian factories before
exportation. Two per cent of thorium is in the sand, sometimes nearly
6 per cent. It is found on the coast north of Rio and on some river
banks in Rio, Espirito Santo, Bahia, and Minas.
Graphite exists in several States, especially Minas and Bahia in
rather inaccessible locations, but one deposit in Rio is worked, for a
pencil factory in the city of Rio; others in a small way for local use.
Other Minerals. Platinum is found in gold bearing quartz and in
river alluvium in Pernambuco, Minas, and Parahyba; nickel in Minas,
Santa Catharina, and Rio Grande do Sul; salt in Rio Grande do
Norte, Rio, and Minas, worked in the last two; much is imported.
Other minerals found in various localities are asbestos, antimony
and tin, bismuth, barium, cinnabar, emery, kaolin; marble, white,
rose, onyx, and green; mica, molybdenite, saltpetre, silver and lead,
soapstone and talc, and wolfram. Among the stones garnets, opals,
pearls, rubies, sapphires, emeralds, topaz, and tourmalines are
found in more or less profusion as well as rock crystal, useful to
opticians. Minas contains almost every variety of ore and gem, which
with its good climate and fertile soil have made it the best populated
State, though without a large city.
Petroleum has been discovered in a number of States, among
them São Paulo, Minas, Alagôas, Pernambuco, Bahia, and Sergipe;
some of excellent quality in Bahia; but whether in quantities for large
exploitation is uncertain until further investigation and work are
carried on. Some geologists believe that prospects are highly
favorable. Oil of fine quality is recently reported at Piracicaba, São
Paulo, but as the petroleum is generally in schist rock its extraction
would be expensive. Recent advices state that Brazil has 35 oil fields
in four States with an area of 10,000 square miles; in the entire
country 75,000 square miles with an estimated producing capacity
within ten years of 500 to 600 million barrels.

Investments
In view of the varied resources of Brazil, to enumerate the
possibilities for investors would be difficult. There is hardly a line of
industry which cannot there be carried on successfully. That of coffee
growing is so well developed as to be somewhat overcrowded, but in
almost any other line there is a field for the investor. Whether it be
mining of gold or diamonds, of coal, iron, or manganese, be it
agriculture, stock raising, the lumber industry, or manufacturing, the
harnessing of the waterfalls to produce hydro-electric power, the
construction of public works, the field for the capitalist, large or small,
is of infinite variety and excellent promise. The present Government
is planning a broad and active development of the electric power
available from its great and numerous water-falls.
CHAPTER LI
SOUTH AMERICAN TRADE

As to many it may seem presumptuous that one with no practical
experience should venture to discuss foreign trade, I beg with an
apology for my temerity to make a slight explanation.
On my six trips to South America (1903-1916) I saw and heard so
much of the shortcomings of my countrymen there, and meanwhile
perceived such ignorance at home that as early as 1907 I wrote an
article on “Our Commercial Relations with South America,” published
in the Van Norden Magazine, wherein I set forth many points which
prominent men of affairs have repeatedly urged upon the attention of
their fellows, even up to the eighth Annual Trade Convention at
Cleveland, May, 1921.
My personal observation being supplemented by extensive
reading, I venture to hope that my remarks under this heading may
be charitably viewed by those who are wiser than I, and prove of
some slight service to those whose acquaintance with South
American affairs is more limited.

In proportion to our wealth and our domestic activities our export
trade before the Great War was indeed small in comparison to that of
other nationalities. Slight interest was taken in outside matters of any
kind, even our publicists giving little heed to foreign affairs. However,
prior to 1914 there had been a slowly growing interest and a gradual
increase in our export trade, which from 1915 to 1920 showed a
more rapid extension. In 1915 our exports amounted to
$3,500,000,000, in 1920 to $8,228,000,000; to South America in
round numbers, in 1915, $144,000,000, in 1920, $624,000,000, in
1921, $273,000,000.
As to the past and future of this matter, with especial reference to
South America, two widely divergent opinions prevail; one, that we
have accomplished wonders, and that our trade with that continent
will be permanent and, with improvement in exchange and other
conditions, increasing; the other, that we have not done so well as
we might and ought; and that owing to our indifference, inefficiency,
ignorance, and bumptiousness, we shall be unable to retain anything
like the proportion of trade which we have enjoyed or so much of it
as might seem our reasonable share. With some ground for each
opinion, the truth as usual lying between, there is a possibility of
either result depending upon a variety of circumstances. The first is
whether some of us acquire a willingness to learn, or persist in
certain mistaken notions and practices. Well merited criticism of the
methods of some exporters and salesmen is far from applying to all.
The “S” of a well known concern is as familiar in South America as in
North. Other great corporations are famous the world over. Their
success in foreign sales has meant the employment of many men
abroad and of a large number at home, with the home business
supplemented and steadied by the foreign. In addition to the
extensive pre-war export of some large companies, many small
ones, whose names are less familiar, have long sent their wares to
foreign lands.
A matter of prime importance is that the entire nation and people
become convinced of the value, the necessity even, of our
maintaining a large export and import trade, for we cannot have one
without the other. The provincialism of our thought and education,
which have a reciprocal influence, must be laid aside. Congressmen
should be able to feel that their reëlection will depend upon their
ability to grasp the problems confronting the whole nation, problems
of labor, transportation, commerce, finance, and world interests,
rather than upon their catering to a special class or securing a
sectional advantage. It would be well if they were high-minded
enough to act for the country’s best interests regardless of their
future fate. To demand ability and statesmanship of their
representatives in these crucial times is the privilege and duty of the
people.
As a nation we have prospered because of the richness of our
natural resources and the enormous extent of our agricultural lands.
The latter being now for the most part occupied, with increasing
population our welfare will depend more largely upon the
development of our manufacturing industries and of our export trade.
That the prosperity of our manufacturing towns and seaports will be
reflected in our agricultural districts and will benefit the entire nation
should be self-evident. Supported by the people the Government will
act in accordance with its best judgment. In any case, every one
should feel that it shows a shameful lack of a sense of duty and of
patriotism to place one’s personal fortune above the nation’s welfare
in peace no less than in war.
For success in foreign trade as well as for safety at home our
Government must and no doubt will see that production is not stifled
for any reason, that our transportation on land and sea, and
communication by wire is unhampered by strikes or otherwise. If
need arises, previous restrictive measures should be removed and
suitable aid granted. With abundance of shipping which we formerly
lacked, equality with European freight rates must be maintained or
competition will be impossible. The establishing by our banks of
needed branches, fortunately made practicable, has been
accomplished. The important question of trademarks and patents
may require further Governmental consideration and diplomatic
action, though some international agreements have already been
made. In certain countries the laws have been unfair, prejudicial to
the interests of honest manufacturers and favoring the unscrupulous;
some of whom have taken advantage of the situation to the
embarrassment of legitimate American business. Trademarks have
been practically stolen, through previous registration by foreigners
without title to use them. We must remember that the same thing has
been done by Americans in the United States, who have registered
here trademarks owned in Europe.
Of immense service would be a few free ports where raw material
could enter, and without paying duty be exported either as entered or
after being manufactured. Foreign countries have fostered
commerce in this way and by allowing favorable freight rates through
subsidies and otherwise. Competition under Government ownership
has produced an enormous deficit. While better results may be
expected under private ownership, our shipping will be at a
disadvantage from difficulties imposed by the Seamen’s Bill. It is said
that American shippers may be able to pay higher wages than
European if relieved of the necessity of employing larger crews and
superfluous engineers. The Bureaus of the Department of
Commerce now perform very valuable service: the Bureau of
Foreign and Domestic Commerce, the Bureau of Standards; also the
Bureau of Markets of the Department of Agriculture. A consistent
foreign policy, undoubtedly to be formulated and pursued by our able
Secretaries of State and Commerce, will be of great service in
relation to foreign trade and for our general prosperity.
To the intelligent sympathy of the country at large and the
coöperation of the Government must be added the eager purpose of
the manufacturer, and the interest of young men who will make of
export trade their chosen field of labor. The manufacturer who
contemplates entering this broader field or who, through peculiar war
conditions, has been brought into it without preliminary investigation,
should recognize the fact that careful intensive study is a
prerequisite for successful permanent trade, a method which has
been followed by many Europeans and by some Americans with
excellent results.
The book here presented, it is hoped, will furnish a useful
groundwork of information on South America, to be supplemented by
further study of details appropriate to the character of the
prospective exports and to any special conditions. In these countries
generally, we have observed a great diversity in the population and
disparity in their condition. One may hope that the latter will be
diminished by advance in wages and by the education of the Indians,
by means of which their producing and their purchasing power may
be increased; but for a long time two broad classes must be
distinguished and catered to: the cultured and literate, and the poor
and illiterate laborers, especially the Indians of the North and West
Coasts. It is evident that the requirements of a cultivated society
where the customs and dress are European in character, or of a
homogeneous middle-class population, would be quite different from
those of Indians who sleep on the floor, a whole family in one room.
A personal acquaintance with the character of the people, their
manner of life, and their methods of business is extremely desirable.
If the head of a manufacturing industry is able himself to make “The
South American Tour” even in a hasty manner, it will be to his
advantage; if not, his export manager, if he has one, should
personally study the ground. Those who look merely for a slight
supplementary trade may best accomplish this by arranging with a
reliable commission house and following directions. If the
manufacturer decides to undertake the matter himself, he must plan
a careful campaign.
To make haste slowly is a good rule. Unhappily in the past some
who have attempted foreign trade have ignored the advice and
experience of others, and deemed information quite unnecessary.
With the know-it-all attitude, the idea that business is business
everywhere, and that goods and methods successful at home must
be equally good for abroad, before the War they proceeded in such a
manner as either to make an utter failure and abandon the project, or
after large and needless losses to secure profitable business.
Criticism of two different kinds made by South Americans should
lead to the correction of faults; otherwise there will be a complete
loss of trade on the part of those who are guilty, and much injury to
our commerce generally from the resulting bad reputation given to all
Americans. One form of criticism is directed to the character,
methods, and manners of the traveling salesman or agent, the other
to the shortcomings of the home office.
During the War period when at times our goods alone were
available, even poor methods and service brought results. That the
continuance of such a course will be successful in the face of the
severe competition now arising is too much to expect. A friendly
Englishman long engaged in business in South America, in 1916
remarked that he was afraid the Americans would lose 60 per cent of
their business after the War. A Peruvian the same year declared that
they would lose it all; so much had he been disgusted by the
arbitrary manner of some salesmen of the type who said practically,
“There is the stuff. Take it or leave it as you like.” With a correct
atmosphere in the home office and a more careful choice of
salesmen such crudeness would be avoided.
If the heads of the office are unable to visit the countries, there is
greater reason for wide reading. The “Movies,” which seem to
entertain many, present pictures of a few phases of life; but it is not
by such means that one acquires the intimate knowledge of a
country and people essential for a proper conduct of trade. For
agreeable and profitable relationship of any sort with those of other
nationalities we must realize that they also have their point of view;
we need to consider how they regard us. While we may believe our
country to be the greatest and best, and our ways and manner of
living superior, we must bear in mind that others are equally loyal to
their own; though their country may be smaller and in some respects
less advanced, its people are equally patriotic, they prefer their own
way of living and methods of business where these are different.
Many South Americans have a wider knowledge of the world, greater
culture and taste, and these in general are more punctilious in
manners and dress than the majority of Americans. We must
therefore, while preserving our own tastes and ideals, have equal
respect for theirs, cultivating a catholicity, a breadth of view, quite
different from the spirit common among us, that everything different
is thereby inferior, that we can teach the world everything, and that
we have nothing to learn. Such an attitude is merely a mark of
ignorance and provincialism.
Aside from visiting the countries there are many sources of
information in regard to sales possibilities for any class of goods.
The lists of imports of the countries and of some cities are available
in commerce reports, with figures showing the approximate quantity
and ratio of these. While the list of our exports seems to embrace
almost everything, all of the goods are not sold everywhere; a
knowledge of the various markets, of the prices at which goods are
sold, and of trade conditions is necessary, to ascertain whether
competition is possible and if there is a prospective increase of
present business. Detailed information as to many lines of
manufactures and markets may be obtained from consular reports,
from the branches of the Department of Commerce located in a few
cities, or by writing directly to the Bureau of Foreign and Domestic
Commerce in Washington. Many persons have written to our
Consuls in Latin America, often to their great disgust, for information,
not merely such as might be procured in Washington, but what might
be gained by looking in a geography or reading one of many
available books. The Consuls are continually making reports with
suitable information on matters which are within their province.
Membership in certain commercial organizations gives the privilege
of receiving trade information; the Philadelphia Commercial
Museum, the National Association of Manufacturers, and the
American Manufacturers Export Association, chambers of
commerce, commercial clubs, trade associations, such as one of
jewelers and silversmiths, all may be useful in this direction. The Pan
American Union through its Bulletin and otherwise furnishes much
information about Latin America. Export Trade Journals, other
magazines and newspapers, are serviceable.
If from investigation it appears that there is a market for one’s
goods in any section or universally, that quality and prices can be
such as to make competition favorable, that the market can be
enlarged, or should there be none that one can be created, and a
determination is therefore formed to enter export trade, the next
question is how the goods shall be sold. The methods are various,
but of only two kinds: the direct and the indirect.
Direct methods include the establishing of branch houses; the
appointing of a general agent for one or more countries or of a local
agent for a limited territory; the employment of traveling salesmen;
and advertising in circulars, newspapers, or magazines, for mail
orders to be filled by freight or parcel post. The choice of methods,
and the appointing of agents or salesmen demand the greatest care.
Exclusive rights of sale have been given for the whole continent to a
South American, incompetent even to take care of a small district.
Salesmen have been appointed from the home office who perhaps
had done well here but were utterly unfit for work in South America.
It is desirable to have representatives of our own nationality.
Others if employed solely by an American Company may do their
best for it, but we now know that many Germans, possibly others,
have taken agencies for the sole purpose of keeping the goods out
of the market. A good salesman or agent of any sort should have as
his first qualification ability to speak Spanish fluently, unless his work
is confined to Brazil, in which case of course he must speak
Portuguese. Next he should be a gentleman and simpático. The
spirit which led some youths in the early days in Panamá to call the
residents niggers, monkeys, and savages is one which, though not
indulged in outwardly to such a degree, is sufficient to prevent the
harmonious relations necessary to make permanent, satisfactory
business dealings. Unquestioned integrity, unfailing courtesy,
patience, tact, straightforward action, are all highly important
qualities, as well as those essential from a strictly business point of
view, such as critical knowledge of the goods, etc. Confidence and
friendliness count more in South America than at home. Social
qualifications are desirable. It has been said of the British that they
were too cold and exclusive, that the Germans were more friendly.
On the other hand, some Americans have felt that the South
Americans did not care for more than a business acquaintance. This
is doubtless true in many cases, but one who is cultured,
sympathetic, and well mannered is likely to have social opportunities
which he may accept to advantage.
Branch houses will best serve the large manufacturer, giving a
standing not otherwise attained, and best promoting permanent
relations. From these houses salesmen go to neighboring territory.
The manager must be a man of wide experience, familiar not only
with the product and home matters, but with the language, customs,
and business methods of the country in which he is located. Some
corporations engage business houses in different sections as local
representatives or distributors, with exclusive rights in restricted
territory. Such arrangements, supplemented by advice and literature
from the home office may prove effective in securing sales.
Those who cannot afford branch houses or the risk which may
attend the cost of a traveling salesman’s exclusive service are now
able through the Webb-Pomerene law to coöperate with other
houses in the same or in associated lines of industry. Both
investigation and sales may thus be profitably conducted.
Advertising only, without the employment of other agencies, has
been highly profitable to many. It is said that advertising in South
America brings better results than in the United States. To avoid utter
waste of money careful investigation as to sales possibilities and
media should be made before planning a campaign. One large mail
order house has carried on an enormous foreign business. Other
firms have accomplished much in a similar way. Advertising is done
in journals and magazines published here and circulated there, in
local publications of various kinds, in moving-picture houses; also by
means of mailed circulars, and to some extent by electric signs.
The importance of correct technical and idiomatic translation in
advertising in Spanish and Portuguese cannot be over-estimated.
Gross and ridiculous errors have been made in the past. A book
knowledge of languages seldom prepares one adequately for such
work. Foreign translators are more numerous than formerly, but they,
also, too often make egregious blunders; not of the same character,
but caused by their not comprehending exactly the English which
they translate.
If indirect methods of trade are preferred as involving less risk,
trouble, and preliminary expense, and if the medium is carefully
chosen, they may prove more profitable. Export commission houses or
export agents will relieve the manufacturer of almost all care. One
large commission house not only acts as selling agent for
manufacturers through its branches in many parts of South America;
it also operates steamship lines, carries on banking and exchange,
and handles important financial transactions for South American
Republics. Certain firms of national or worldwide reputation and
large capital have for many years been satisfied to conduct their
foreign trade through such a house. The opportunity for commission
houses of this sort was not overlooked by foreigners and one
company of these in New York did an annual business of
$30,000,000 before the War.
The experience of a commission house is an asset, which saves
many mistakes. Their experts have a wide range of information
covering American and European competition, and details such as
suitable patterns, correct packing, etc. The commission house may
have its capital tied up for six months in transactions, or did prior to
the more general use of the trade acceptance, while the
manufacturer might receive cash for his goods. For small people this
method of sales has many advantages, especially when first
launching into export trade. Conference and honorable coöperation
are necessary and the protection of the commission house from
direct under-selling or from other unfair dealings. The service of
export agents is preferred by some, these acting as salesmen,
forwarders, or shippers, either for one or more concerns, perhaps on
salary and commission, or as independent agents.
After securing orders, by whatever means employed, the
responsibilities of the shipping department begin. The principles
governing the execution of orders would seem to be rudimentary.
One wonders how a business in this country could achieve even a
small measure of success when violating the most elementary rules
of conduct. Yet this has been and still is done in South American
trade as recent information from various sources shows, despite the
fact that these things should go without saying, and furthermore that
they have been iterated and reiterated for years.
First, the goods to fill an order should be precisely like the sample,
if there was one, not something inferior, as has often happened, nor
something just as good, or even better. If ordered without a sample
strictest attention should be paid to prescribed details. If it is
specified that cloth be 28¹⁄₂ inches wide or 25 centimetres, that is
what is wanted. If two-wheeled vehicles are ordered, what sort of
business is it that permits of sending, by mistake, four-wheeled
vehicles a distance of 5000 miles, even though the bill was made the
same and the goods were more expensive? This was done by a well
known manufacturer to his loss. The loss to the purchaser was
greater, for the vehicles sent could not be used at all in that country.
The assumption that the seller knows better than the buyer what
the latter wants is offensive if true. Generally it is not true. Mistakes
are unpardonable. Requests for particular colors, patterns, size of
bolt, and character of weave must be complied with if trade is
wanted. The willingness of the Germans to oblige in such matters
largely accounted for the rapid growth of their South American trade.
The Latin American business men are as acute and intelligent as
any. They know what they want and are discriminating buyers as to
quality and price.
Criticism of the shortcomings of the home office is the second of
the two forms previously referred to. Lack of accuracy and of
attention to details is a grievous fault, apparently arising from want of
discipline and thoroughness in our homes and schools, a fault
recognized by many heads of offices here. The dishonesty of
sending goods inferior to sample or order, a practice injurious to the
entire national trade as well as to the guilty individual, shows an utter
lack of patriotism, as well as folly if permanent trade is desired.
Another elementary matter is that of packing. Woeful tales of
breakage and loss from bad packing have been rife for years, and
volumes have been written and spoken concerning it. In 1916 an
experienced traveling man told me that before his last trip, in view of
war conditions, he had taken on the agency of some new people and
received many orders for them. He had sent explicit instructions as
to packing and other export details. But now he found his new
customers swearing mad and was booking no more orders for his
new patrons: for they had paid not the slightest heed to his directions
either as to packing or forwarding, with disastrous results. In
February, 1919, a letter from Brazil said: “We cannot imagine why
your shippers ever accepted the travesty of an export bale dumped
on you by the spinners, and we must clearly state that our factory will
not accept any yarns which arrive in bad condition due to bad
packing.”
Unwillingness to profit by the knowledge and experience of others,
the belief that one knows everything without learning anything, is
called a peculiarly American trait, though happily it is not universal.
The British not only pack and handle goods in the best manner, but
they are careful to send and land them in all parts of the world by the
best route and with the least expense to the receiver, as the world
knows. Of course we can do the same if we take the trouble. The
packing department for the soldiers overseas showed the highest
excellence. The baling of clothes instead of boxing saved labor, box
material, and two thirds of the space, and goods arrived in better
condition. Fifty-five million dollars were saved at one plant in a year.
Forty-nine million dollars of this was in cargo space; the rest was in
rent, freight, etc. Fifty-eight million feet of lumber of 30 years' growth
were spared. The burlap required would be useful in South America.
Square packages instead of round are advantageous. Those who
wish a share in foreign trade must take the pains to do everything
right. The most careful man, familiar with the metric system, should
be in charge. The scales should show pounds and kilograms, and
figures be given for net weight, container, etc. Aside from careful
packing to avoid breakage or other injury as from water, dampness,
or pilfering, instructions are often given as to size and weight of
package. Mules, donkeys, and llamas usually carry two packages,
one on each side; the ordinary load of each is 200, 150, and 100 lbs.
respectively, though some mules will take 300 lbs. for a moderate
distance. For the interior, especially on the North and West Coasts
and in some sections on the East, these animals are the only means
of transport, and goods must be packed accordingly; machinery in
sections, etc. Many boxes of 1000 pounds weight have been left on
the dock or at a railway station, the goods a total loss.
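The load figures just given lend themselves to a simple check before a case leaves the packing room. The following is only a rough sketch in Python, resting on assumptions that are not in the text: that the ordinary load is the total carried by one animal, that it is divided equally between the two side packages, and that the function names are merely illustrative.

```python
# Ordinary total loads in pounds per animal, as quoted in the text.
ORDINARY_LOAD_LBS = {"mule": 200, "donkey": 150, "llama": 100}

# One avoirdupois pound in kilograms (exact by international definition).
KG_PER_LB = 0.45359237


def max_package_lbs(animal: str) -> float:
    """Heaviest single package, assuming two equal packages, one on each side."""
    return ORDINARY_LOAD_LBS[animal] / 2


def package_fits(animal: str, package_lbs: float) -> bool:
    """True if one package of this weight is within the assumed per-side limit."""
    return package_lbs <= max_package_lbs(animal)


def lbs_to_kg(lbs: float) -> float:
    """Convert pounds to kilograms, rounded to one decimal place."""
    return round(lbs * KG_PER_LB, 1)


# Print the per-package limit in both units, since the scales should show both.
for animal in ORDINARY_LOAD_LBS:
    limit = max_package_lbs(animal)
    print(f"{animal}: at most {limit:.0f} lbs ({lbs_to_kg(limit)} kg) per package")
```

On these assumptions a box of 1000 pounds fails the check for every animal named, which agrees with the fate of the boxes left on the dock.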
To arrange the packing with an eye to the custom house is
important, both in order that the contents may be easily examined,
and so that fines or exorbitant imposts may be avoided. Directions
and governmental regulations as to giving separate weight of
container and goods, and the separation of different classes of the
latter must be scrupulously followed. Heavy fines are often imposed
for trivial errors in packing or invoice, and corrections of any
mistakes by cable are expensive if frequent.
Obligations of every kind should be fulfilled with fidelity, even
though a bad bargain has been made resulting in financial loss. On the other
hand consideration for the embarrassments of the buyer should be
shown, whether these are purely personal or the result of national
conditions such as followed the outbreak of the War or the
conclusion of the Armistice. After the unexpected cessation of War
many orders which had been placed here were suddenly cancelled
under the supposition that coöperation such as had always been
extended by European merchants would not be refused here. British
representatives promptly offered to cancel orders for goods that the
buyers might not care to receive under the changed circumstances,
while the majority of Americans made many difficulties: a contrast in
conduct liable to influence unfavorably future trade, especially when
added to the fact that vast numbers here cancelled orders and that
the average American manufacturer had taken advantage of the
situation created by the War to charge exorbitant prices in excess of
those applying to domestic trade. Thus some manufacturers who
have cried out about the bad faith of the South Americans, with no
consideration for their difficulties, have forfeited their confidence and
friendship, with a probable loss of future trade unless able to offer
remarkably attractive bargains.
The utmost care should be taken in the shipping of goods as well
as in the packing. Promptness is an important feature. Where regular
sailings occur space should be engaged in advance, and the
necessary papers accurately made out in good season, in view of
the many copies of the consular invoices, the bills of lading, the
clearance papers, and the short hours of some of the consulates. To
avoid the trouble of attending to these and other elaborate details,
many manufacturers find it convenient to employ a Freight
Forwarder who looks after such matters including insurance of
various kinds covering theft, damage, and total loss. He will know the
most favorable trade routes, look after transfer and storage, and fill
all requirements, if qualified for his job.
No dealings should be initiated in any country until after the
registration of patents and trademarks.
Trouble should be taken to adjust any bona fide complaint and to
satisfy reasonable customers. On account of length of time and
distance, especial pains should be taken to avoid possible difficulty
or disagreement.
The establishing of American banks in South America has been a
boon to manufacturers. The houses of Dun and of Bradstreet
perform much service for their clients in the line of credit information.
It has been suggested that the Government might collect information
for general private use. It may be said that experience shows losses
in foreign trade to be less than in domestic. Yet, as shysters exist
everywhere, suitable precaution should be exercised, guarantees
required, or the reliability of the house made certain.
The use of the trade acceptance, a negotiable note given by the
purchaser to the seller of goods, now becoming general, is of great
assistance to those who were deterred from entering South
American trade on account of the long credits which seemed
necessary. Foreign bankers invest in the commercial bills of other
countries, knowing them to be convertible into cash in those
countries. Private houses handling investments or commercial paper
have added departments for dealing in acceptances. The subject of
foreign exchange should be familiar, the fluctuations having an
important bearing on purchasing power and trade, while exchange
itself is dependent on foreign trade conditions, being an index of
international transactions. Careful consideration of this matter is
necessary in quoting prices. In normal times it was customary on
English imports to reckon the pound as $4.90, and in export as $4.80
to cover incidental expenses.
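The customary reckoning of the pound may be illustrated by a short calculation. The sketch below, in Python, assumes only the two rates named in the text ($4.90 on imports, $4.80 on exports); the function name and the sample invoice of 100 pounds are hypothetical and serve for illustration alone.

```python
from decimal import Decimal

# Customary dollar values of the pound sterling, as stated in the text.
IMPORT_RATE = Decimal("4.90")  # applied to goods bought from England
EXPORT_RATE = Decimal("4.80")  # applied to goods sold to England


def sterling_to_dollars(pounds, direction: str) -> Decimal:
    """Convert a sterling amount to dollars at the customary rate.

    direction must be "import" or "export"; any other value raises KeyError.
    """
    rates = {"import": IMPORT_RATE, "export": EXPORT_RATE}
    return Decimal(str(pounds)) * rates[direction]


# Hypothetical invoice of 100 pounds, reckoned both ways.
cost_as_import = sterling_to_dollars(100, "import")
value_as_export = sterling_to_dollars(100, "export")
print("Import cost in dollars:", cost_as_import)
print("Export value in dollars:", value_as_export)
print("Margin for incidental expenses:", cost_as_import - value_as_export)
```

The spread of ten cents on the pound is the allowance for incidental expenses; with actual exchange fluctuating, a quoted price would need the rate of the day in place of these fixed figures.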
In certain lines, for example, in hand-made goods, it is impossible
for this country to face European or Asiatic competition. In some
kinds of machine-made goods we excel. In lines where competition
seems difficult the excellent suggestion has been made that costs
may be reduced. The lowering of the daily wage has in some cases
occurred; and more may be accomplished by diminishing overhead
expense. The high salaries of the heads and of numerous assistants
in plants of moderate size and the expenses of salesmen are often
unnecessarily large, giving rise to foolish and injurious extravagance,
which indeed has permeated all classes of society. Carnegie, while
building up his Steel Company, and President McKinley both smoked
cigars costing five cents each, while some modern salesmen pay 50
cents for one, with other things in proportion. Some hotels charge 40
cents for a potato not costing one; a Washington hotel asks 60 cents
for a slice of watermelon when a whole one is selling on the street
for 15 cents. The head of a company suggests that by reducing their
personal and family expenses for luxuries by one-third, people will live
longer and be happier; that one-third of the middle men might be cut
out; that the office and supervising class could accomplish 25 per
cent more and cut down office expenses one-third; that the laboring
man could increase his efficiency and output one-third without injury
and come nearer to earning his wages; and that the unreasonable
waste of material should be diminished. I would however add that
many heads of establishments and departments work harder and
more hours than the ordinary office force or laborer.
One would naturally desire to have his firm name on such goods
as permit this; “Made in U.S.A.” seems desirable where practicable.
It has happened that Germans handling American machinery have
covered such marks with their own. It may be noted that in South
America many of the large mercantile establishments of various
kinds, dry goods and others, are in the hands of British or German
firms. A considerable portion of trade in the large cities is conducted
by other than the native born.
For the best development of our foreign trade it is necessary that
young men entering this field should be of higher type than the
average in domestic affairs, particularly those who will go to foreign
lands. The larger number may not be called upon to go outside of
their town or country, as many must be engaged in the export
department, at the factory or the seaport, or in commission houses
and banks, as export agents or freight forwarders, etc. Others will go
abroad as salesmen on tours, or to reside a few or many years in the
capacity of local agents, in branch houses of large companies, civil
and mining engineers, etc.
Many of both sexes have enough of the spirit of adventure to enjoy
the prospect of at least a temporary residence in another land. It is to
be hoped that those who desire the broader career will enter it not
solely for the pecuniary reward but with something of the spirit which
animated our soldiers, the knowledge that they may extend the
prestige of their country and uphold the best traditions of democracy;
with the feeling that their work, if well done, is patriotic in character,
an essential and splendid vocation, a dignified career for the
development of the commerce and the promotion of the welfare of a
great nation. Character, the manners of a gentleman, and
educational preparation are among the requisite qualifications. Of
prime necessity is a familiarity with one or two foreign languages;
also a training that will develop thoroughness and accuracy and the
consciousness that these are essential. Nothing will accomplish this
better than a good groundwork of Latin; which makes mere play the
acquisition of any derived language like Spanish, French, or
Portuguese. A sound understanding of Latin syntax is needed for
easy comprehension of these languages, with their varied forms and
constructions, so different from our simple English, which indeed one
who is ignorant of any other language hardly comprehends.
The ability to conduct business correspondence correctly and with at
least some degree of the elegance and courteous phraseology
current in other lands where our brusque letters and speech are
disliked if not resented:
Knowledge of office routine especially as to the various papers to be
procured and prepared in connection with foreign transactions:
An acquaintance with the requirements of shipping practices, trade
routes, types of vessels, freight rates, insurance of various kinds,
loading and unloading facilities at different ports, and details as to
the arrival and despatch of cargoes and vessels:
A study of the principles of commercial law needed to enable one to
decide business questions, disputes and misunderstandings,
according to equity and international practice:
A close study of the economic conditions which govern the production
of the countries, of the social institutions and customs, of advertising
needs and methods, of shipping facilities, of banking facilities and
methods, credit practices and requirements, and any discrimination
in tariffs or regulations:
A study of the foreign trade practices and methods of those
countries already occupying these markets, the character and style
of their goods and their methods of securing and holding business:
Acquaintance with the financial and investment relations of other
countries as affecting international trade; with foreign banking
practices and with the mechanism of foreign exchange:
A study of physical geography including the natural resources,
climatic conditions, and characteristic peculiarities of each country:
A knowledge of the history and affiliations of the countries, with the
character of their governments as likely to bear on their commerce:
—All these are matters which must not be overlooked by any one
who wishes to become an expert in foreign trade. Some
acquaintance with the racial origin and relations of the nations, with
their social customs, religious tendencies, and traditions may at
times help in determining trade possibilities. It is important to realize
that the cultivation of tact, dignity, and judgment is necessary for
success as a foreign representative, and that such an one may
prove a more valuable ambassador than some of those occupying
such position, to whom a similar training would be of advantage.
Furthermore we must realize that no nation can sell largely abroad
unless it buys also, and that we must purchase from South America
if we expect to sell there. Fortunately they have many agricultural
products, which we do not produce, and other raw material of which
we have not sufficient. Yet probably we cannot take as much from
them as we should like to sell. We must therefore invest, now that
we are a creditor nation, in the securities of others, the bonds of the
countries and cities; we must send our capital to develop public
utilities where these are lacking, as for sewerage and water supply.
Electric lighting plants and power, docks and railways, have proved
excellent investments. The better banking facilities now provided
encourage these on our part. The British, French, and Belgians have
been beforehand in this matter. The British have invested more than
two billions in Argentina, $1,200,000,000 in Brazil, smaller sums in
Uruguay and Chile. The Germans have not invested much money,
their banks bringing chiefly credit and making money by taking part
of the business of local banks, a practice not conducive to popularity.
The United States, i.e., some people, have invested $175,000,000 or
more in Brazil, smaller sums in other countries. Large opportunities
lie open in this direction.
That loans should be made to foreign countries only on condition
that the money be spent here, seems a short-sighted policy, as also
restrictions on our export of gold, when our excessive holding of that
metal is a contributing cause of the unfortunate exchange situation.
Many Republics need railways, for which construction material and
equipment would be here purchased if here financed; but part of the
money must be spent on the ground; so with works of irrigation and
other public or private construction. If we must always be selfish, at
least our selfishness should be enlightened, and we should realize
that in the long run we shall gain more by manifesting a friendly spirit
of service and coöperation than by showing intense
eagerness for the “mighty dollar.”
