Disruptive Technologies For Big Data and Cloud Applications Proceedings of ICBDCC 2021 J Dinesh Peter Steven Lawrence Fernandes Amir H Alavi Editors
Lecture Notes in Electrical Engineering 905
J. Dinesh Peter
Steven Lawrence Fernandes
Amir H. Alavi Editors
Disruptive
Technologies
for Big Data
and Cloud
Applications
Proceedings of ICBDCC 2021
Lecture Notes in Electrical Engineering
Volume 905
Series Editors
Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli
Federico II, Naples, Italy
Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán,
Mexico
Bijaya Ketan Panigrahi, Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, Munich, Germany
Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China
Shanben Chen, Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore,
Singapore, Singapore
Rüdiger Dillmann, Humanoids and Intelligent Systems Laboratory, Karlsruhe Institute for Technology,
Karlsruhe, Germany
Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China
Gianluigi Ferrari, Università di Parma, Parma, Italy
Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid,
Madrid, Spain
Sandra Hirche, Department of Electrical Engineering and Information Science, Technische Universität
München, Munich, Germany
Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA,
USA
Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Alaa Khamis, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt
Torsten Kroeger, Stanford University, Stanford, CA, USA
Yong Li, Hunan University, Changsha, Hunan, China
Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra,
Barcelona, Spain
Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore
Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany
Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA
Sebastian Möller, Quality and Usability Laboratory, TU Berlin, Berlin, Germany
Subhas Mukhopadhyay, School of Engineering & Advanced Technology, Massey University,
Palmerston North, Manawatu-Wanganui, New Zealand
Cun-Zheng Ning, Electrical Engineering, Arizona State University, Tempe, AZ, USA
Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Kyoto, Japan
Luca Oneto, Department of Informatics, Bioengineering, Robotics, University of Genova, Genova, Genova,
Italy
Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi “Roma Tre”, Rome, Italy
Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Gan Woon Seng, School of Electrical & Electronic Engineering, Nanyang Technological University,
Singapore, Singapore
Joachim Speidel, Institute of Telecommunications, Universität Stuttgart, Stuttgart, Germany
Germano Veiga, Campus da FEUP, INESC Porto, Porto, Portugal
Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Beijing, China
Walter Zamboni, DIEM - Università degli studi di Salerno, Fisciano, Salerno, Italy
Junjie James Zhang, Charlotte, NC, USA
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the
latest developments in Electrical Engineering - quickly, informally and in high
quality. While original research reported in proceedings and monographs has
traditionally formed the core of LNEE, we also encourage authors to submit books
devoted to supporting student education and professional training in the various
fields and application areas of electrical engineering. The series covers classical and
emerging topics concerning:
• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS
For general information about this book series, comments or suggestions, please
contact leontina.dicecco@springer.com.
To submit a proposal or request further information, please contact the Publishing
Editor in your country:
China
Jasmine Dou, Editor (jasmine.dou@springer.com)
India, Japan, Rest of Asia
Swati Meherishi, Editorial Director (Swati.Meherishi@springer.com)
Southeast Asia, Australia, New Zealand
Ramesh Nath Premnath, Editor (ramesh.premnath@springernature.com)
USA, Canada:
Michael Luby, Senior Editor (michael.luby@springer.com)
All other Countries:
Leontina Di Cecco, Senior Editor (leontina.dicecco@springer.com)
** This series is indexed by EI Compendex and Scopus databases. **
Disruptive Technologies
for Big Data and Cloud
Applications
Proceedings of ICBDCC 2021
Editors
J. Dinesh Peter
Department of Computer Science and Engineering
Karunya Institute of Technology and Sciences
Coimbatore, Tamil Nadu, India
Steven Lawrence Fernandes
Department of Computer Science
Creighton University
Omaha, NE, USA
Amir H. Alavi
Civil and Environmental Engineering
University of Pittsburgh
Pittsburgh, PA, USA
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface
This work comprises the proceedings of the International Conference on Big Data and
Cloud Computing (ICBDCC’21). The conference was organized with the primary
theme of promoting ideas that provide technological solutions to big data and
cloud computing applications. ICBDCC provided a unique forum for practitioners,
developers and users to exchange ideas and present their observations,
models, results and experiences with researchers involved in real-time
projects that address research problems arising from recent advances in big
data and cloud computing technologies. In the last decade, a number of sophisticated
and new computing technologies have been developed. With the introduction of
new computing paradigms such as cloud computing, big data and other innovations,
ICBDCC provided a high-quality dissemination forum for new ideas, technology
focus, research results and discussions on the evolution of computing for the benefit
of both scientific and industrial developments. ICBDCC is supported by a panel of
reputed advisory committee members both from India and from all across the world.
This proceedings includes topics in the fields of big data, data analytics in cloud,
cloud security, cloud computing and big data and cloud computing applications.
The research papers featured in this proceedings provide novel ideas that contribute
to the growth of the society through computing technologies. The contents of this
proceedings will prove to be an invaluable asset to the researchers in the areas of big
data and cloud computing.
We appreciate the extensive time and effort put in by all the members of the
organizing committee for ensuring a high standard for the papers published in this
volume. We would like to express our thanks to the panel of experts who helped us to
review the papers and assisted us in selecting the candidate for the Best Paper Award.
We would like to thank the eminent keynote speakers who have shared their ideas
with the audience and all the researchers and academicians who have contributed
their research works, models and ideas to ICBDCC’21.
About the Editors
Dr. Amir H. Alavi is an assistant professor in the Department of Civil and Environmental
Engineering and holds a courtesy appointment in the Department of Bioengineering
at the University of Pittsburgh. Dr. Alavi’s research interests include structural
health monitoring, multifunctional structures, advanced sensors, low-power
energy harvesting, and engineering informatics. His research activities involve the
implementation of self-sustained and multifunctional sensing and structural systems
enhanced by engineering informatics in the fields of civil infrastructure, construction,
aerospace, and biomedical engineering. Dr. Alavi has authored seven books
and over 200 publications in archival journals, chapters, and conference proceedings.
He has received several award certificates for his journal articles. He is among
the Google Scholar 200 Most Cited Authors in Civil Engineering, Web of Science
ESI’s World Top 1% Scientific Minds in 2018, and the Stanford University list of
Top 1% Scientists in the World 2020.
A Statistical Performance Analysis of
GPU WAH Range Querying
1 Introduction
Table 1 The left table shows a relation that records the number of cars sold per model in
a year. The right table shows a potential bitmap index for that relation
Cars Sold            Model Bins                 Volume Bins
Model     # sold     m0  m1  m2  m3  m4  m5     v0  v1  v2  v3
Tiago     51K        1   0   0   0   0   0      0   0   0   1
Nexon     48K        0   1   0   0   0   0      0   0   1   0
Altroz    47K        0   0   1   0   0   0      0   0   1   0
Harrier   15K        0   0   0   1   0   0      0   1   0   0
Tigor     10K        0   0   0   0   1   0      0   1   0   0
Zest      220        0   0   0   0   0   1      1   0   0   0
(WAH) [17]. Here, we present an ANOVA analysis [7] applied to the results of a
rigorous empirical study of our engine. The products of this analysis indicate potential
features of the query framework that can be tuned to increase efficiency. Further, the
results provide guidance for the creation of a more generalized framework.
2 Background
Bitmap Indices: Bitmap indices are created by binning the tuples of the relation
being indexed. First, each attribute domain is partitioned into bins, each holding
either a single discrete value or a range of values, depending on the domain. Then,
each tuple is analyzed, creating a row in the bitmap. The value of each attribute is
inspected, and a 1 is placed in the bin that corresponds to that value. The remaining
bins in that attribute's set are assigned 0.
A possible example of this binning process is shown in Table 1. The relation on
the left records the yearly sales of Tata car models. Since there are two attributes,
there are two sets of bins in the bitmap shown on the right. As the values of the Model
attribute are discrete, each value is assigned its own bin. These are the bins prefixed
with an m. A 1 in the m0 bin indicates the corresponding tuple had the Tiago
value for the Model attribute, m1 indicates a value of Nexon, and so on. The # sold
attribute holds values that fall in a continuum. Thus, it can be binned using range
bins. In this example, the bin v3 represents values ≥ 50k, v2 represents the range
[20k, 50k), v1 is [10k, 20k), and v0 is < 10k. The first tuple in Cars Sold results
in a 1 in m0, representing that it records the number of Tiagos sold.
All other m bins are assigned 0. Since over 50k Tiagos were sold, a 1 is placed
in v3, and the remaining v bins are assigned 0. This process is applied to all the
tuples in Cars Sold.
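The binning walk-through above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `cars_sold`, `volume_edges`, and `bitmap_row` are hypothetical, and the uncompressed list-of-lists layout is an assumption made for readability, not the WAH-compressed GPU representation used by the engine.

```python
from bisect import bisect_right

# Relation from Table 1: (model, units sold).
cars_sold = [
    ("Tiago", 51_000), ("Nexon", 48_000), ("Altroz", 47_000),
    ("Harrier", 15_000), ("Tigor", 10_000), ("Zest", 220),
]

# Discrete attribute: one bin per distinct value (m0..m5), in table order.
models = [model for model, _ in cars_sold]

# Continuous attribute: lower edges of v1, v2, v3; v0 is everything below 10k.
volume_edges = [10_000, 20_000, 50_000]


def bitmap_row(model, sold):
    """Return the model bins followed by the volume bins for one tuple."""
    m_bits = [1 if model == m else 0 for m in models]
    v_index = bisect_right(volume_edges, sold)  # 0 -> v0, ..., 3 -> v3
    v_bits = [1 if i == v_index else 0 for i in range(len(volume_edges) + 1)]
    return m_bits + v_bits


bitmap = [bitmap_row(model, sold) for model, sold in cars_sold]
for (model, _), row in zip(cars_sold, bitmap):
    print(f"{model:8s}", *row)
```

Using `bisect_right` on the lower bin edges makes each range closed on the left and open on the right, which matches the intervals listed for v1 and v2.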
A significant benefit of bitmap indices is the ability to query the bitmap directly
using hardware-enabled bitwise operations. Again, consider the example shown in
Table 1. An answer to the query “Which models had a sales volume between 15 and
55 k cars?” can be derived by bitwise ORing columns of the bitmap. Specifically,
3 Evaluation Methodology
In this section, the testing methodology used to produce our results is described. All
tests were executed on a machine running Ubuntu 16.04.5 LTS equipped with dual
8-core Intel Xeon E5-2609 v4 CPUs (each at 1.70 GHz) and 32 GB of RAM. All tests
were developed using CUDA v9.0.176 and run on two different GPUs: an NVIDIA
GeForce GTX 1080 with 8 GB of memory and an NVIDIA Titan X with 12 GB of
memory.
4 M. Nelson et al.
We use Zipf synthetic datasets for our evaluation. These datasets are created
using a Zipf distribution, which represents a clustered approach to discretization.
This creates a skewed distribution of 1’s in our bitmap, simulating the
type of distribution seen in real data. The Zipf distribution generator assigns each bit a
probability of
$p(k, n, \mathit{skew}) = \dfrac{1/k^{\mathit{skew}}}{\sum_{i=1}^{n} 1/i^{\mathit{skew}}}$,
where n is the number of
elements determined by cardinality, k is their rank, and the coefficient skew creates
an exponentially skewed distribution. We set k = 10, n = 10 and set skew = 0, 1, 2,
and 3. These different skew values create datasets of varying bit density. Using these
parameters, we generate 16 different datasets containing 100 bins (i.e., ten attributes
discretized into ten bins each) and 8, 16, 32, and 64 million rows.
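To make the role of the skew parameter concrete, the following sketch evaluates the Zipf probability for each rank and draws a small one-hot attribute from it. The function names are hypothetical, and the sampling step (one bin set per row per attribute, chosen by rank weight) is an assumption about how such a generator might use the probabilities, not a description of the generator actually used.

```python
import random


def zipf_probability(k, n, skew):
    """p(k, n, skew) = (1 / k**skew) / sum_{i=1..n} (1 / i**skew)."""
    normalizer = sum(1.0 / i**skew for i in range(1, n + 1))
    return (1.0 / k**skew) / normalizer


def generate_attribute(rows, n, skew, rng):
    """Pick one of n bins per row, weighting bin rank k by its Zipf probability."""
    weights = [zipf_probability(k, n, skew) for k in range(1, n + 1)]
    chosen = rng.choices(range(n), weights=weights, k=rows)
    # One-hot encode: exactly one bin is set per row for this attribute.
    return [[1 if b == c else 0 for b in range(n)] for c in chosen]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for skew in (0, 1, 2, 3):
    probs = [zipf_probability(k, 10, skew) for k in range(1, 11)]
    print(skew, [round(p, 3) for p in probs[:3]])

sample = generate_attribute(rows=8, n=10, skew=2, rng=rng)
```

With skew = 0 every bin is equally likely, and larger skew values concentrate the 1's in the lowest-ranked bins.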
We use a statistical ANOVA [7] approach to analyze our test results. This approach
quantifies the impact of each factor (from the beginning of this section) on performance.
An ANOVA analysis determines whether a statistically significant difference
exists among the means of each test. This is done by separating the total observed
measurement variation into two components: (1) the variation within a system (assumed
to be measurement error) and (2) the variation between systems (assumed to be due
to both actual differences between systems and to measurement error). Statistically
significant differences between systems are determined via an F-test that compares
variances across the systems.
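The split into within-system and between-system variation can be illustrated with a one-way ANOVA computed by hand. This is a minimal sketch under stated assumptions: the function name `one_way_anova` is hypothetical, and the timings are made-up values chosen for easy arithmetic, not measurements from this study.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, f_statistic) for a list of measurement lists."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variation between systems: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation within a system: measurements around their own group mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_statistic


# Hypothetical query times in ms for three configurations (not from the paper).
timings = [[10.0, 11.0, 12.0], [20.0, 21.0, 22.0], [30.0, 31.0, 32.0]]
ss_b, ss_w, f_stat = one_way_anova(timings)
print(ss_b, ss_w, f_stat)
```

The resulting F statistic would then be compared against the critical value of the F distribution for the two degrees of freedom at the chosen significance level.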
An m-factor ANOVA analysis is required as more than two factors are present in
the sets of dataset and architectural factors outlined at the beginning of this section.
When two or more factors are present in an ANOVA analysis, the interactions between
factors can be considered. Interactions are important to consider as the combination
of factors can be more impactful than simply summing the impact of each factor
independently (the whole can be greater than the sum of the parts). The impact of
each unique factor and interaction of factors are called effects. From the results of
the ANOVA analysis, it is possible to compute the percent impact of each effect by
forming the ratio of total variation in measurement due to each effect to the sum of
total variation of all effects and measurement errors.
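The percent-impact ratio described above can be sketched for the simplest multi-factor case, a balanced two-factor design with replication. The function name, the factor labels, and the sample measurements are all hypothetical; the sums of squares follow the standard two-factor decomposition, which is assumed here to match the analysis in spirit.

```python
def percent_impact(data):
    """data[i][j] holds replicated measurements for level i of factor A and level j of B."""
    a, b, r = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(c) / r for c in row] for row in data]
    mean_a = [sum(row) / b for row in cell]
    mean_b = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(mean_a) / a

    ss = {
        "A": b * r * sum((m - grand) ** 2 for m in mean_a),
        "B": a * r * sum((m - grand) ** 2 for m in mean_b),
        # Interaction: what remains of each cell mean after removing both main effects.
        "AB": r * sum(
            (cell[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        # Measurement error: replicates around their own cell mean.
        "error": sum(
            (y - cell[i][j]) ** 2
            for i in range(a) for j in range(b) for y in data[i][j]
        ),
    }
    total = sum(ss.values())
    return {name: 100.0 * value / total for name, value in ss.items()}


# Hypothetical times: factor A = shared memory (off, on), factor B = rows (large, small).
times = [[[100.0, 102.0], [50.0, 52.0]], [[40.0, 42.0], [30.0, 32.0]]]
impact = percent_impact(times)
print({name: round(pct, 2) for name, pct in impact.items()})
```

Because the four components partition the total variation in a balanced design, the percentages sum to 100, which is what allows effects to be rank ordered against each other.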
4 Results
Here, we present the results of the experiments and ANOVA analyses described
above. We first present the results of the ANOVA analysis of all factors, then analyses
of factors linked strictly to architectural details and dataset details. All results are
reported as a rank ordering of percent impact of each effect.
Rankings of effects for all factors are shown in Fig. 1. The two most significant
factors contributing to variations in performance are the use of shared memory on the
GPU and the number of rows in the dataset, accounting for 40.26% and 27.32% of the
overall variation in performance, respectively, and the interaction thereof accounting
for 20.66% of the overall variation in performance. The sum of the stand-alone and
interaction effects of these two factors accounts for 88.24% of the overall variation
in performance. All remaining effects each account for less than 3.5% of the overall
variation in performance.
Rankings of effects for architecturally linked factors are shown in Fig. 2. The two
most significant architectural factors are the use of shared memory on the GPU and the
base clock rate of the GPU, accounting for 90.79% and 5.04% of the overall variation
in performance, respectively, and the interaction thereof accounting for 3.2% of the
overall variation in performance. The sum of the stand-alone and interaction effects
of these two factors accounts for 99.03% of the overall variation in performance.
The sum of all remaining effects accounts for less than 1% of the overall variation
in performance.
Rankings of effects for dataset linked factors are shown in Fig. 3. The two most
significant factors associated with the dataset are the number of rows in the dataset
and the number of columns in the query, accounting for 63.06% and 23.68% of the
overall variation in performance, respectively, and the interaction thereof accounting
for 12.17% of the overall variation in performance. The sum of the stand-alone and
interaction effects of these two factors accounts for 98.81% of the overall variation
in performance. The sum of all remaining effects accounts for just over one percent
of the overall impact on performance.
Fig. 1 Five most influential factors to variations in performance ranked by percent of overall
variation
Fig. 2 Four most influential architectural factors to variations in performance ranked by percent
of overall variation
Fig. 3 Four most influential dataset and query factors to variations in performance ranked by
percent of overall variation
5 Discussion of Results
As seen in Fig. 1, the two most significant factors are the use of shared memory and
the number of rows (88.24% of the total variation in performance). Interestingly,
bit density has no significant effect on performance for the GPU query method
tested here. Figure 4 presents the approximate profiles of GPU execution time when
varying the two most significant factors, which demonstrate these effects in practice.
As shown, (A) does not use shared memory and has a large number of rows, (B) does
not use shared memory and has a small number of rows, (C) uses shared memory
and has a large number of rows, and (D) uses shared memory and has a small number
of rows.
Fig. 4 Profiles of four query executions (panels A to D) when the two most significant factors to
performance (the use of shared memory and the number of rows in the database) are varied
Fig. 5 Effects of the three most significant factors (the use of shared memory on the GPU, the
number of rows in the dataset, and the interaction thereof) on execution time
Query performance derived from the primary factors (use of shared memory and
the number of rows) can also be visualized in Fig. 5. Following the arrows in this
figure improves performance: for example, by using shared memory, by decreasing the
total number of rows, or both. This figure shows the importance of examining interactions
of factors, as exploiting both factors to gain performance is more beneficial
than using only one.
6 Related Work
GPUs and CUDA have enabled the acceleration of many general-purpose computing
problems. Many times, these come from a focus on core mathematical routines [6, 12,
13] or parallel programming primitives [5, 8]. With these, researchers have been able
References
1 Introduction
Several studies on crowdsourcing have been conducted recently [3, 4, 18]. In crowdsourcing,
a requester posts tasks, such as questionnaires, programming, and proofreading,
and a worker selects and conducts a task while considering the complexity
and fee of the task. Crowdsourcing systems receive money from requesters in advance
and pay fees to workers for completed tasks. Therefore, personally identifiable information
(PII) of each requester and worker is registered with the crowdsourcing system.
In this article, we focus on questionnaires as tasks. A requester posts a questionnaire,
which a worker can answer if he/she wishes to. Because the crowdsourcing
system has the PII of workers, it is preferable that workers send their answers
to the requester directly when a questionnaire asks for sensitive information
about workers, such as salary and religion. However, requesters may still be able to
identify workers. Most questionnaires contain questions about the answerers’
attributes, such as age and sex, because the requester often analyzes the results
of the questionnaire in terms of these attributes. The requester may
identify a worker from these basic items.
Sweeney [17] found that 87% of the US population is uniquely identified by {date of
birth, sex, 5-digit ZIP}. Of course, other attributes can also be used for identification.
Rocher et al. [14] reported that 99.98% of Americans can be identified using 15
attributes. This risk can cause workers to avoid answering questionnaires.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 11
J. D. Peter et al. (eds.), Disruptive Technologies for Big Data and Cloud Applications,
Lecture Notes in Electrical Engineering 905,
https://doi.org/10.1007/978-981-19-2177-3_2
12 Y. Sei and A. Ohsuga
Moreover, requesters can obtain other information about workers from crowdsourcing
systems. The requester can see the IDs of workers in a crowdsourcing system
because the requester must check the work result of each worker and tell the
crowdsourcing system that he/she agrees to pay the agreed fee if satisfied
with the work. Based on these IDs, requesters can check several attribute
values of workers registered in the crowdsourcing system, such as state/province of
address, career, and skills. This information can also be used to identify workers. In
this study, we assume that a worker agent exists in each worker’s personal computer
or smartphone. The worker agent anonymizes the worker’s attributes.
Our previous work can collect information from workers while protecting their
privacy. However, it assumes that the number of answers to be collected is only one;
moreover, it assumes that the privacy-protection level among workers should be the
same. In this study, we propose a technique that can collect multiple answers under
the assumption that each worker can set a different privacy-protection level for each
answer.
The rest of this article is organized as follows. In Sect. 2, we introduce our application
and attack model. In Sect. 3, we define the privacy used in this work. In Sect. 4,
we describe the related methods. We present the design of the proposed algorithm in
Sect. 5, and we present the simulation results in Sect. 6. In Sect. 7, we conclude the
paper.
2 Assumptions
once. As usual, analyzers create several cross tabulations from selected attributes for
each purpose.
Further, we assume that the crowdsourcing system has the PII of workers. Therefore,
workers do not send their answers to the questionnaire to the crowdsourcing system
but send them, after anonymization, to the requester. We assume that requesters
are semi-honest entities. That is, the requesters follow the proposed protocol but try
to infer individual information from each piece of disguised data.
3 Privacy Metric
There are many important privacy metrics, such as k-anonymity [2, 10] and l-diversity
[11, 13]. In this study, we use differential privacy [6]; in particular, we focus on local
differential privacy [5], which has been widely studied recently.
Definition 1 [ε-local differential privacy] Let X be a set of sensitive values. A
randomized mechanism A satisfies ε-local differential privacy if for any x, x′ ∈ X
and y ∈ Y ⊂ Range(A),
$\Pr[A(x) = y] \le e^{\varepsilon} \Pr[A(x') = y]$.
We consider that each question is a database with only one record having only one
column. Each worker can set a privacy level for each question. Moreover, a requester
can set different fees for different questions and privacy levels. For instance, the
fees of questions about gender with high- and low-privacy levels are 2 and 3 cents,
respectively, and the fees of questions about disease names with high- and low-privacy
levels are 5 and 10 cents, respectively.
Let q be the number of questions in a questionnaire. When the databases (i.e.,
the answers) of a worker satisfy ε1-, …, εq-differential privacy, respectively, the set of
answers satisfies (ε1 + ··· + εq)-differential privacy [9]. Workers can consider not only the
privacy level of each answer but also the privacy level of the set of all answers if
needed.
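As an illustration of per-question privacy levels and their composition, the sketch below uses generic k-ary randomized response, a standard mechanism that satisfies ε-local differential privacy. It is not the method proposed in this paper, and the function names, the option list, and the example ε values are assumptions introduced only for this sketch.

```python
import math
import random


def keep_probability(epsilon, num_options):
    """Probability of reporting the true option under k-ary randomized response."""
    if math.isinf(epsilon):
        return 1.0  # no anonymization
    e = math.exp(epsilon)
    return e / (e + num_options - 1)


def randomize(true_option, options, epsilon, rng):
    """Report the true option with probability p, else a uniformly chosen other option."""
    if rng.random() < keep_probability(epsilon, len(options)):
        return true_option
    return rng.choice([o for o in options if o != true_option])


def total_budget(epsilons):
    """Sequential composition: the answers together satisfy sum(epsilons)-DP."""
    return sum(epsilons)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
age_groups = ["<20", "20-39", "40-59", "60+"]  # hypothetical question options
reported = randomize("20-39", age_groups, epsilon=1.0, rng=rng)
print(reported, total_budget([0.1, 1.0, 10.0]))
```

Each wrong option is reported with probability (1 − p)/(k − 1), so the ratio between the probability of any output under two different true answers is at most e^ε, which is the bound the definition requires.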
4 Related Work
ordinal response that provides local differential privacy, and it was implemented in
Google Chrome. Murakami and Kawamoto [12] assumed that there were sensitive
and nonsensitive data and proposed a local differential privacy mechanism that could
enhance data utility by protecting only sensitive data.
5 Proposed Method
An overview of the proposed system is shown in Fig. 1. Each worker can determine
his/her privacy-protection level (i.e., the value of ε). Because it is difficult for a
layperson to understand the meaning of ε, we assume that the requester can determine
several values of ε in advance. For example, the requester can prepare four levels:
nonanonymization (ε = ∞), high-anonymization (ε = 0.1), middle-anonymization
(ε = 1), and low-anonymization (ε = 10). How to determine the privacy-protection
level is outside the scope of this study.
The worker protocol is based on S2Mb [16], although S2Mb does not assume multiple
data collection and different privacy levels. Let $F_i$ be the number of options of
question i of a questionnaire, $S_i$ be the set of options of question i, and $t_{i,j}$ be the
selected option of the true answer of worker $w_j$ for question i.
When worker $w_j$ decides to answer a questionnaire in the crowdsourcing system,
he/she first specifies privacy levels $\varepsilon_{1,j}, \ldots, \varepsilon_{q,j}$ for the q questions of the
questionnaire. Then, he/she calculates a set of parameters $s_{i,j}$ and $p_{i,j}$ for all $i = 1, \ldots, q$
on the basis of $\varepsilon_{i,j}$ and $F_i$, using the following equation: