Intelligent Systems, Control and Automation:
Science and Engineering
Tero Tuovinen
Jacques Periaux
Pekka Neittaanmäki Editors
Computational
Sciences and Artificial
Intelligence in
Industry
New Digital Technologies for
Solving Future Societal
and Economical Challenges
Intelligent Systems, Control and Automation:
Science and Engineering
Volume 76
Series Editor
Kimon P. Valavanis, Department of Electrical and Computer Engineering,
University of Denver, Denver, CO, USA
Advisory Editors
P. Antsaklis, University of Notre Dame, IN, USA
P. Borne, Ecole Centrale de Lille, France
R. Carelli, Universidad Nacional de San Juan, Argentina
T. Fukuda, Nagoya University, Japan
N.R. Gans, The University of Texas at Dallas, Richardson, TX, USA
F. Harashima, University of Tokyo, Japan
P. Martinet, Ecole Centrale de Nantes, France
S. Monaco, University La Sapienza, Rome, Italy
R.R. Negenborn, Delft University of Technology, The Netherlands
António Pascoal, Institute for Systems and Robotics, Lisbon, Portugal
G. Schmidt, Technical University of Munich, Germany
T.M. Sobh, University of Bridgeport, CT, USA
C. Tzafestas, National Technical University of Athens, Greece
The Intelligent Systems, Control and Automation: Science and Engineering book series
publishes books on scientific, engineering, and technological developments in this
interesting field, which borders on so many disciplines and has so many practical
applications: human-like biomechanics, industrial robotics, mobile robotics, service
and social robotics, humanoid robotics, mechatronics, intelligent control, industrial
process control, power systems control, industrial and office automation, unmanned
aviation systems, teleoperation systems, energy systems, transportation systems,
driverless cars, human-robot interaction, computer and control engineering, but also
computational intelligence, neural networks, fuzzy systems, genetic algorithms,
neurofuzzy systems and control, nonlinear dynamics and control, and of course
adaptive, complex and self-organizing systems. This wide range of topics,
approaches, perspectives and applications is reflected in a large readership of
researchers and practitioners in various fields, as well as graduate students who
want to learn more on a given subject.
The series has received an enthusiastic acceptance by the scientific and
engineering community, and is continuously receiving an increasing number of
high-quality proposals from both academia and industry. The current Series Editor
is Kimon Valavanis, University of Denver, Colorado, USA. He is assisted by an
Editorial Advisory Board who help to select the most interesting and cutting-edge
manuscripts for the series.
Editors

Tero Tuovinen
Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland
School of Technology, JAMK University of Applied Science, Jyväskylä, Finland

Jacques Periaux
CIMNE, International Center for Numerical Methods in Engineering, Barcelona, Spain

Pekka Neittaanmäki
Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
These goals are ambitious, but they are necessary steps in the creation of an AI
industry. CSAI 2019 has opened the door to these goals by activating dual networking
of two major disciplines of computational and computer science: computational
science and artificial intelligence. The aim of CSAI was to provide an overview of
the state of the art and the technology trends in the innovative hybridization of
computational methods and the digitalization of industrial and societal applications.
In 1997, an international conference titled “Computational Science for the 21st
Century” took place in Tours, France, with a collection of papers reflecting the state
of the art in computational science. Two decades after this event, in 2019, a second
wave of “a marriage à la mode” with the digitalization of industry and society was
launched with “Computational Sciences and Artificial Intelligence” in Jyväskylä,
Finland: CSAI 2019.
The advances in technology and the ever-growing role of digital sensors and
computers in science have led to an exponential growth in the amount and com-
plexity of (big) data that scientists and industries collect. Artificial intelligence
(AI) is able to create added value for all stakeholders, not only for the industry and
users but for society as well. One amazing example where all actors benefit from the
use of AI can be observed in healthcare.
Concurrently, the data algorithmics of the computational sciences is already
improving performance and productivity in several industrial sectors. For example,
the proliferation of sensors installed on complex systems at every level of industry
helps both to fine-tune service in real time and to regulate all production sequences
at the highest possible level. While these principles are mainly applied in industry,
there is no reason why they would not work for society as well.
Data is emerging as a key competitive advantage in the global CS-AI race!
This first book in the series, assembled from the content of the ECCOMAS Thematic
Conferences on Applied Sciences, explores methodology development and the
application of computational sciences and AI expert systems in the industrial sector,
healthcare, and technology.
The book is addressed to young researchers and engineers in the fields of
computational science and artificial intelligence, and its topics range from innovative
computational methods to digital machine learning tools, and their coupling, for
solving challenging industrial and societal problems.
The content of this volume is organized into four sections with 17 contributions
in the disciplines classified as follows:
Part I Overview
Part II Methodology
AI of the future will contribute to creating the greatest value for all stakeholders:
industry, services, and users. To achieve this, industry will have to be connected,
and this is already a reality today. Progress is central to such change, which is made
possible by the digital revolution. The Internet of Things, the cloud, and big data
are an integral part of the industry’s transformation as they gain momentum,
affecting every aspect of society.
The world of data-driven algorithms combined with the computational sciences is
greatly improving performance and productivity in several industrial sectors. The
proliferation of sensors installed on complex systems found at every level of industry
helps both to fine-tune service in real time and to regulate all production sequences
at the highest possible level. All stakeholders in the industry and society sectors are
now up to speed with the digital age. The connected industry is hastening the pace
of transformations that are already underway.
But these perspectives come at a price. All connected objects, soon to be linked
to developments in artificial intelligence, are potential points of entry for digital
attacks. Improved performance has the negative effect of increasing exposure to
cyber-threats. Cyber-attacks can penetrate and exploit the logic of networks,
spreading to contaminate even the most established industrial systems, such as
social networks and governmental systems.
Cyber-security is a topic of utmost importance, since the progress achieved in
connected industry and society is so great that no one can imagine any move back.
The continuity of innovation relies largely on the confidence placed in the data
generated and the systems that sustain them. Security is now a major issue for the
sustainability of states, smart territories, and businesses. The rapid growth of the
surface and air transport, medicine, and finance sectors, among others, is the most
tangible demonstration of this issue.
For better or worse, AI is predicted to have a huge impact on the future of
humanity (Steven Pinker, Professor of Psychology, Harvard University,
Enlightenment Now: The Case for Reason, Science, Humanism and Progress, quoted
in Scientific Foresight Unit, EPRS, Should We Fear Artificial Intelligence?, 2018)!
Our sincere acknowledgments go to our invited plenary and semi-plenary
speakers, parallel session organizers, round table organizers and panelists, session
chairs, and the contributors of this volume. Offering such a high scientific and
industrial quality level at the CSAI 2019 event would not have been possible
without their expertise in the AI fields.
We would also like to finally express our gratitude to Dr. Mayra Castro, Senior
Editor, and Prof. Kimon Valavanis, the editor of Intelligent Systems, Control and
Automation: Science and Engineering, for their kind agreement to publish the latest
advanced results in the above series.
Contents

Part I Overview

Co-development of Methodology, Applications, and Hardware in Computational Science and Artificial Intelligence
Pekka Neittaanmäki, Matti Savonen, Jacques Periaux, and Tero Tuovinen

Part II Methodology

Novel Strategies for Data-Driven Evolutionary Optimization
Swagata Roy and Nirupam Chakraborti

Artificial Intelligence and Computational Science
Pekka Neittaanmäki and Sergey Repin

Supervised Learning and Applied Mathematics
Olivier Pironneau

Application of the Topological Gradient to Parsimonious Neural Networks
Kateryna Bashtova, Mathieu Causse, Cameron James, Florent Masmoudi, Mohamed Masmoudi, Houcine Turki, and Joshua Wolff

Generation of Error Indicators for Partial Differential Equations by Machine Learning Methods
Alexey Muzalevskiy, Pekka Neittaanmäki, and Sergey Repin

Newton Method for Minimal Learning Machine
Joonas Hämäläinen and Tommi Kärkkäinen

Limited Memory Bundle Method for Clusterwise Linear Regression
Napsu Karmitsa, Adil M. Bagirov, Sona Taheri, and Kaisa Joki
Contributors
P. Neittaanmäki · M. Savonen
Faculty of Information Technology, University of Jyväskylä, P.O. Box 35,
40014 Jyväskylä, Finland
e-mail: pekka.neittaanmaki@jyu.fi
M. Savonen
e-mail: matti.j.savonen@jyu.fi
J. Periaux (B)
International Centre for Numerical Methods in Engineering (CIMNE), Barcelona, Spain
e-mail: jperiaux@gmail.com
T. Tuovinen
Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland
School of Technology, JAMK University of Applied Science, Jyväskylä, Finland
e-mail: tero.tuovinen@jyu.fi
the historical progression of selected scientific paradigms (see Fig. 1). In Fig. 1,
paradigms are presented in correlation with hardware and methodology develop-
ments that have enabled them. It is easy to observe the progression from empirical
to theoretical, from theoretical to computational, from computational to machine
learning/artificial intelligence (AI) and, finally, to data science/big data analysis. It
is possible to confidently predict that the introduction of functional quantum com-
puting solutions as part of data science will be one of the next major development
steps.
Neuvo [3] presented the delay and interaction between mathematical theory and
models and practical products through the example of telecommunications (see Fig. 2).
The timeline begins with Fourier analysis in 1822 and the telegraph in 1844. Figure 2
shows us the interaction between theory and application, culminating in the first
GSM call. Telecommunications has since progressed to a new era, and the applications
of mobile technology have changed the world we live in. Similar world-defining
developments can be identified by creating timelines starting with, for example, quan-
tum theory and the work of Max Planck. This timeline would prominently feature
such applications as nuclear power, transistors, and lasers. Similarly, we can trace
developments from Albert Einstein’s special and general relativity to satellites and
the mobile communications world of today. In each case, we see a significant delay
from new groundbreaking theoretical work to world-changing applications.
The development of hardware design and of scientific computing algorithms and
software go hand in hand. Namely, if the method used (algorithm + software) does
not utilize the power of the hardware and new technology in the most efficient way
possible, only part of the potential of the investments will be returned. This is
especially true with the new challenges emerging in the modern day, and at a time
when we are entering an era of hybridized computing in which quantum computing
will be an integral part of problem-solving.
The field of artificial intelligence is huge, and its correct classification is a general
topic of discussion. Villani et al. [4] observe that AI is at the crossroads of several
disciplines: computer science, mathematics (logic, optimization, analysis,
probabilities, and linear algebra), and cognitive science. The algorithms that underpin
it are based on equally varied approaches: semantic analysis, symbolic representation,
statistical and exploratory learning, and neural networks.
Despite these difficulties, we have outlined our view of the various components,
terminology, and derivative technologies of what is commonly known as weak
artificial intelligence (see Fig. 3). First, support systems for decision-making, expert
systems, planning, scheduling, optimization, robotics, and computer vision have been
developed.
Thanks to progress in deep learning, we have had major progress in machine
learning techniques:
Deep learning allows the computer to build complex concepts out of simpler concepts.
(Goodfellow et al. [2])
The computational sciences should not ignore the knowledge of artificial
intelligence but should benefit from its new technologies in a win-win alliance.
While AI can provide a major competitive advantage for businesses and even for
societal issues, the computational sciences can provide new and innovative solutions
for problem-solving, with algorithms boosted by the use of big data and AI. Together,
these scientific fields have great potential to offer solutions not only to industrial
issues but also to the vast societal challenges of today.
Artificial intelligence of the future will make it possible to create the greatest value
for all stakeholders: industry, services, and users. The Internet of Things, Cloud
Computing, and Big Data are an integral part of the industry’s transformation, as
they gain momentum, affecting every aspect of society. In this section, we will
present five challenges in the contemporary field of scientific computing and
artificial-intelligence-powered applications.
Unbalanced Development
The first challenge we identify stems from the unbalanced development of computa-
tional methods and hardware. The development of hardware has followed Moore’s
law up until the last few years. Unfortunately, methods (algorithms and software)
have developed significantly slower than hardware. Because of the fast hardware
development and cheap price of processors, companies and the research commu-
nity have not had enough interest in methodology development. Furthermore, most
research on energy-efficient computing has revolved around hardware instead of
algorithms and methods.
Understanding the Limitations of the Methodology
The second challenge is that many engineering computations use black-box software,
and therefore the user is not aware of the limitations of the methodology. Very
attractive visualization can give a false impression of the quality of the computing.
In the scientific community, research in the mathematical, statistical, IT, and
engineering fields is separated, and researchers are often unaware of current
developments in neighbouring fields.
Time-Criticality
The third challenge is related to the inherent slowness of machine learning algorithms.
They are not suitable for time-critical applications, such as responding to cyber-attacks,
the automation of transportation, and the classification of very noisy signals. Noisy
signal classification is required in, for example, neuroscience, space technology
development, industrial IoT, and sensor signal analysis.
Cybersecurity
The fourth challenge is cybersecurity. The world of data algorithms combined with
the Computational Sciences is improving performance and productivity in several
industrial sectors. The proliferation of sensors installed on complex systems found
at every level of industry helps both to fine-tune service in real time and to regulate
all production sequences at the highest possible level. This progression brings
increasing risks and requires proactive action. All connected objects are potential
points of entry for digital attacks. Improved performance has the paradoxical effect
of increasing exposure to cyber-threats. Cyber-attacks can exploit network logic to
contaminate the most established industrial systems, such as social networks and
governmental systems.
Security is now a major issue for the sustainability of states, smart territories, and
businesses.
Big Data
The fifth challenge is the fast-increasing quantity of data. Methodology for big data
has not kept up with the increase in available data. In order to catch up with
present-day quantities of data, we need to increase the effectiveness of computing
and improve the quality of computing results. Moreover, we need new methods for
extracting key features from data sets. Many scientific challenges involve a large
amount of high-dimensional data. Yet it is known that there is always a small
number of unidentified parameters that encode the crucial part of the data. The
question is how to identify and extract these parameters so that the computing can
be focused on the key aspects of the data set.
4 Conclusions
Artificial intelligence of the future will generate the greatest value for all
stakeholders: industry, services, and users. To achieve this, industry will have to be
connected, and this is already a reality today. Progress is central to such change,
made possible by the digital revolution. The Internet of Things, Cloud Computing,
and Big Data are an integral part of the industry’s transformation as they gain
momentum, affecting every aspect of society.
The world of data algorithms combined with the Computational Sciences is
improving performance and productivity in several industrial sectors. All
stakeholders in the industry and society sectors are now up to speed with the digital
age. The connected industry is hastening the pace of transformations that are already
underway.
But this comes at a price. All connected objects, soon to be linked to developments
in Artificial Intelligence, are potential points of entry for digital attacks. Cyber-attacks
can exploit network logic to contaminate the most established industrial systems.
This is a topic of utmost importance, since the progress achieved in connected
industry and society is so great that no one can imagine going back.
The pace and continuity of innovation rely largely on the confidence placed in
the data generated and the systems that sustain them. Security is now a major issue
for the sustainability of states, smart territories, and businesses. The rapid growth of
the surface and air transport, medicine, and finance sectors, among others, is the
most tangible demonstration of this issue.
Researchers, engineers, and entrepreneurs who contribute to the design, develop-
ment, and commercialization of AI systems will play a decisive role in the digital
society of tomorrow. To ensure this, it is necessary to make them aware, from the start
of their training, of the ethical issues linked to the development of digital technologies
(Villani et al. [4], Bentley et al. [1]).
In this book, the co-development of methodology, applications, and hardware
is considered through the various research studies written by well-known experts
around the world.
References
1. Bentley PJ, Brundage M, Häggström O, Metzinger T (2018) Should we fear artificial intelli-
gence? In-depth analysis. European Union, Brussels
2. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge
3. Neuvo Y (2008) Industry needs universities and vice versa. In: Engwall L, Weaire D (eds) The
university in the market: proceedings from a symposium held in Stockholm, 1–3 November
2007. Portland Press, pp 119–126
4. Villani C, Schoenauer M, Bonnet Y, Berthet C, Cornut A-C, Levin F, Rondepierre B (2018) For a
meaningful artificial intelligence: towards a French and European strategy. French Government
Part II
Methodology
Novel Strategies for Data-Driven
Evolutionary Optimization
Abstract Novel learning algorithms like Evolutionary Neural Net (EvoNN),
Bi-objective Genetic Programming (BioGP), and Evolutionary Deep Neural Net
(EvoDN2), developed in our laboratory, are being widely used in diverse areas of
engineering metamodeling and multi-objective optimization of practical interest.
These are intelligent algorithms based on a nature-inspired approach, mimicking
some basic aspects of evolutionary biology in a non-biological context, and they
follow the principles of multi-objective optimization. In this article, the basic working
principles of these algorithms are explained.
1 Introduction
Complex multi-faceted problems are ubiquitous in science and industry, where
several objectives need to be optimized simultaneously to determine their optimum
trade-offs for the best possible outcomes and benefits. Such problems are not only
complex, but their objectives also influence each other: often a little tinkering with
one may lead to a huge deviation in another. A synchronized optimization of all
the objectives to obtain an optimal set of solutions defines a multi-objective
optimization problem (MOP) [4].
Further, these problems come with a feasibility set beyond which solutions have
no practical importance, and there could be some additional constraints as well,
which need to be handled carefully. In the case of MOPs we usually obtain a set of
Pareto-optimal points [4] instead of the unique optimum that we normally get in
single-objective optimization. When two such optimum solutions are compared, if
one solution is better in terms of one objective, it will inevitably be worse in terms
of another; such solutions are called non-dominated with respect to each other.
However, when one solution is better than or equal to another in terms of all the
objectives and strictly better in terms of at least one objective, it is said to dominate
the other one.
For a multi-objective problem the dominance criterion [4, 6] can be defined
mathematically. Consider a minimization problem

f_i(X), i = 1, 2, . . . , I (1)

subject to constraints

g_j(X) ≥ 0, j = 1, 2, . . . , J. (2)

A solution X_l then dominates a solution X_m if f_i(X_l) ≤ f_i(X_m) for all
i = 1, 2, . . . , I, with strict inequality holding for at least one i. (3)

When neither l nor m dominates the other, they are considered to be non-dominated
with respect to each other.
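In code, the dominance test of Eqs. (1)–(3) for a minimization problem can be sketched as follows (a minimal illustration of the standard criterion; the function name and objective vectors are ours, not taken from the algorithms discussed here):

```python
def dominates(f_l, f_m):
    """True if objective vector f_l dominates f_m under minimization:
    f_l is no worse in every objective and strictly better in at least one."""
    return (all(a <= b for a, b in zip(f_l, f_m))
            and any(a < b for a, b in zip(f_l, f_m)))

# (1, 2) dominates (2, 3): better in both objectives.
assert dominates((1.0, 2.0), (2.0, 3.0))
# Each of these is better in one objective: mutually non-dominated.
assert not dominates((1.0, 5.0), (2.0, 3.0))
assert not dominates((2.0, 3.0), (1.0, 5.0))
```

The non-dominated (Pareto-optimal) set of a population is then simply the set of solutions that no other member dominates.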
Evolutionary Algorithms (EAs) are among the techniques that use these criteria to
successfully achieve optimal solutions in many fields [6]. Genetic Algorithms
(GAs) are a class of such evolutionary optimization algorithms designed to mimic
the biological process of evolution. GAs generally imitate three major biological
processes [4, 6]:
1. the selection operation, for identifying the candidates for the next generation,
2. the crossover operation, for the probabilistic exchange of genetic information
between two randomly picked parents, and
3. the mutation operation, inducing a small, probabilistic change in the genetic
makeup, resulting in a local search.
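For real-coded individuals, the three operations can be sketched roughly as below (a generic illustration only, not the specific operators of refs. [4, 6, 8]; names and parameter values are ours):

```python
import random

def tournament_select(pop, fitness, k=2):
    # 1. Selection: pick k random candidates, keep the fittest (minimization).
    return min(random.sample(pop, k), key=fitness)

def crossover(p1, p2):
    # 2. Crossover: probabilistic blend of genetic information from two parents.
    return [a + random.random() * (b - a) for a, b in zip(p1, p2)]

def mutate(ind, rate=0.1, sigma=0.05):
    # 3. Mutation: small probabilistic Gaussian perturbation (local search).
    return [g + random.gauss(0.0, sigma) if random.random() < rate else g
            for g in ind]

def sphere(x):
    # Toy single-objective fitness: sum of squares.
    return sum(g * g for g in x)

pop = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(10)]
child = mutate(crossover(tournament_select(pop, sphere),
                         tournament_select(pop, sphere)))
```

Repeating this loop, replacing poor individuals with offspring, drives the population toward better solutions generation by generation.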
Different multi-objective evolutionary algorithms (MOEAs) are constructed based
on the GA strategy [6]. The Predator-Prey Genetic Algorithm (PPGA) is one such
algorithm [15]. PPGA is population-based and computes multiple solutions
approaching the Pareto-optimal front simultaneously. Coupling GAs, Neural
Networks [30], and Genetic Programming [25], we have designed a number of
training and optimization algorithms. The idea is to create optimum metamodels out
of non-linear data containing random noise and to use those models as objectives in
a multi-objective optimization framework. These optimum models should neither
overfit nor underfit the data [5]. The major algorithms developed in our group are
named Evolutionary Neural Network (EvoNN) [23, 24], Bi-objective Genetic
Programming (BioGP) [9], and Evolutionary Deep Neural Network (EvoDN2) [26];
all of them construct models by training on the available data, irrespective of the
problem physics, without using any gradient information. These trained models are
then incorporated in MOEAs to obtain Pareto-optimal solutions.
A model that misses the implicit relevant relations in the data is said to have
a high bias and is not very good at predicting the behavior it is modeling. On the
other hand, a model that includes the random noise is said to have a high variance.
While such a model gives good results on the data it is trained on, it fails to make
good predictions in other situations. Hence, a balance between bias and variance is
desired to achieve models that predict the system behavior accurately while ignoring
the random noise of the dataset. In machine learning and statistics, this is known as
the bias-variance dilemma [6]. In our algorithms, this is implemented simply as an
accuracy-versus-complexity problem, where a set of optimum models is obtained
through a trade-off between the two. The problem of creating surrogate models
hence becomes a bi-objective problem in our approach.
All three of the abovementioned algorithms generate models based on this strategy.
EvoNN selects the best model from the non-dominated set of models using the
Akaike information criterion (AIC) or the corrected Akaike information criterion
(AICc) [1], while both BioGP and EvoDN2 recommend the model with the least
error, or highest accuracy, as their architectures prohibit the direct application of
those information criteria. EvoDN2, based on deep neural networks, is a step
forward in data-driven modeling; neither EvoNN nor BioGP is designed to learn
from an excessively large dataset.
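For a least-squares metamodel with n training points, k parameters, and residual sum of squares SSE, the AICc takes the standard form AICc = n ln(SSE/n) + 2k + 2k(k + 1)/(n − k − 1). The selection step can be sketched over a hypothetical non-dominated set (the candidate numbers below are invented for illustration and are not EvoNN output):

```python
import math

def aicc(sse, n, k):
    """Corrected Akaike information criterion for a least-squares model:
    n samples, k parameters, residual sum of squares sse."""
    aic = n * math.log(sse / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical trade-off models as (training SSE, parameter count) pairs:
# accuracy improves with complexity, but with diminishing returns.
candidates = [(12.0, 3), (8.0, 6), (7.5, 12)]
n = 50
best = min(candidates, key=lambda m: aicc(m[0], n, m[1]))  # -> (8.0, 6)
```

The criterion penalizes parameter count, so the mid-complexity model wins here even though the most complex one has the lowest training error — exactly the over-parameterization guard described above.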
In recent years, with the advent of successful deep learning, the use of Deep Neural
Networks [29] for learning from large datasets has become feasible. Nowadays, a
large volume of information can easily be obtained from industrial automation
systems that routinely store process data. Mining relevant information out of these
large piles of data requires very efficient modeling, which deep learning is able to
provide. Against the backdrop of EvoNN stagnating when training on very large
data, EvoDN2 was constructed, and so far it has been successful in dealing with
large datasets with as many as 10,000 inputs [26]. The code can easily be upgraded
for an even larger volume of input data.
All these novel strategies have been rigorously tested using a large number of test
functions [7, 12, 22, 32], and they performed very well. Once the objective functions
are created, BioGP and EvoNN are coded to use PPGA as well as a constraint-based
Reference Vector guided Evolutionary Algorithm (cRVEA) [3] for carrying out the
optimization task, while EvoDN2 uses cRVEA only. In addition, all these algorithms
are capable of highlighting the impact of changing any individual variable on the
objective function, even when the variables are interrelated in a complex manner.
This procedure, Single Variable Response (SVR), is incorporated into all these
algorithms.
Major studies using these strategies have been conducted in the materials science
field involving blast furnace process [16, 17], alloy development [26], and even in
optimizing potential parameters for molecular dynamics study [27].
These novel algorithms are discussed in the subsequent sections in detail.
2 Algorithms
Modeling noisy data is a very difficult task, particularly so if the noise is random in
nature. Systematic noise can be filtered easily, but the same cannot be said for random
noise. However, data with random noise are ubiquitous in real-life applications,
arising from various industries and from experiments and simulations involving
large uncertainty. The major challenge for modeling or metamodeling in such cases
is to avoid the possibilities of overfitting and underfitting, as mentioned before.
Somewhere between these two extremes lies the actual model, which the algorithms
presented here attempt to capture based upon the notion of Pareto optimality [18],
implemented in an evolutionary way. The models are rewarded for their accuracy
and penalized for their parameterization, and a set of models showing the best
possible trade-offs between those conflicting requirements ultimately emerges.
As stated before, three modeling strategies are presented here. The Predator-Prey
Genetic Algorithm (PPGA) [15] is primarily the backbone of BioGP, EvoNN, and
EvoDN2. A population of prey signifies the various models or solutions to the
problem at hand. Weaker prey, i.e., relatively inferior models or solutions, are
exterminated by the predators, which are entities artificially introduced into the
system for the task of annihilating underperforming solutions following a set of rules.
The predators and prey are placed on a toroidal lattice, on which they are allowed
to roam following certain rules. A concept of neighborhood, akin to what is used in
cellular automata [31], is also introduced. The major steps of PPGA are
1. Define parameters such as the lattice size, number of preys preferred, number of
predators, number of generations, and the probabilities of crossover and mutation.
2. Generate random individuals (models or solutions, as the case may be) and place
them randomly in the lattice.
3. Generate predators with certain fitness criteria based upon objectives and place
the predators randomly. For example, in the case of metamodels, our conflicting
criteria are the error (E) and the complexity (A) of the models. Each predator
evaluates prey using a fitness value f_i such that for the ith predator,

    f_i = x_i E + (1 − x_i) A,    (4)

where the weights x_i are distributed linearly in [0, 1] across the predator population.
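The weighting scheme of Eq. (4) can be sketched in a few lines; the linear spread of the weights x_i over [0, 1] follows the predator initialization described in Algorithm 1 later in the chapter, and all names are illustrative rather than taken from the reference codes.

```python
def predator_weights(num_predators):
    """Weights x_i spread linearly over [0, 1], one per predator."""
    if num_predators == 1:
        return [0.5]
    return [i / (num_predators - 1) for i in range(num_predators)]

def predator_fitness(x_i, error, complexity):
    """Eq. (4): f_i = x_i * E + (1 - x_i) * A."""
    return x_i * error + (1 - x_i) * complexity

# Five predators emphasize error and complexity to different degrees:
weights = predator_weights(5)    # [0.0, 0.25, 0.5, 0.75, 1.0]
scores = [predator_fitness(x, error=0.2, complexity=0.8) for x in weights]
```

Because the x_i span the whole interval, some predators cull mainly high-error prey while others cull mainly complex prey, which is what maintains the trade-off front.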
After the above initialization, the predator-prey model proceeds in the following
steps:
1. Each prey is allowed to move in a random direction, i.e., it picks one of the eight
cells in its Moore neighborhood (north, south, east, west, plus the four diagonal
neighbors) to move into. If the cell it attempts to move into is occupied by another
prey or a predator, it can try again. Each prey is allowed 10 such attempts. If the
prey is still unable to find a place to move, it remains where it is.
2. After the preys have moved, they are then allowed to breed. If the prey has no
neighbors, it is not allowed to breed. Otherwise, the prey is allowed to breed with
another randomly selected neighbor to produce an offspring using real crossover
and mutation operators [8]. The offspring is randomly placed anywhere in the
lattice, which can be seen as migration among different clusters of prey across
the solution space. Ten attempts are made to place the child on the lattice. If all
the attempted cells are occupied, the child is discarded.
Novel Strategies for Data-Driven Evolutionary Optimization 15
3. The prey population is under constant threat from the predators, which are initially
allocated at random across the lattice. The predators hunt in series. Selection
pressure is exerted upon the prey population through the predator-prey interaction,
that is, predators are given the task of killing the least-fit prey in their vicinity.
The predators first check their neighborhood to see if there is any prey. If so, the
predator selects the least-fit prey and kills it. The predator then moves onto the
cell held by that prey. If a predator has no neighboring prey, it moves in exactly the
same way as a prey. At the beginning of each generation, the maximum number
of moves for the predators, or for that matter the maximum number of kills is
calculated as
    n_moves = (num_prey,actual − num_prey,preferred) / num_predators ,    (5)

where num_prey,actual is the number of preys present at the beginning of each gen-
eration and num_prey,preferred is the number of preys specified at the very beginning.
4. At repeated intervals, a Pareto dominance-based ranking procedure [4] is
applied and the weaker members of the population are eliminated. This involves a
direct application of the criteria presented in (1)–(3). The process repeats until the
maximum number of generations is attained, and then only the non-dominated
members identified through the same ranking strategy are picked up. Such solutions
present the best trade-off between the objectives.
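Two of the mechanics above, the 8-cell Moore neighborhood on a toroidal lattice and the Pareto-dominance test used in the ranking step, can be sketched as follows. This is an illustrative Python sketch, not the reference implementation.

```python
def moore_neighbors(row, col, rows, cols):
    """The 8 surrounding cells of (row, col), with wrap-around (toroidal lattice)."""
    return [((row + dr) % rows, (col + dc) % cols)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points):
    """Keep only the points not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```

On a 5x5 lattice, a corner cell such as (0, 0) still has eight neighbors because the lattice wraps around; the non-dominated set of the (error, complexity) pairs is exactly what survives the final ranking step described in item 4.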
2.1 BioGP
Bi-objective Genetic Programming (BioGP) [9] is based on Genetic Programming
(GP) [25]. In GP, tree encoding replaces the binary or real encoding that is more
common in the parlance of genetic and evolutionary computation. The concepts of pop-
ulation, fitness, selection, crossover, and mutation differ somewhat in GP due to
this change in environment. GP involves a function set which contains user-defined
mathematical operations like division, square root, etc., and a terminal set where all
the variables and constants are kept. In a binary tree, the operators of the function
set are placed at the nodes, while the members of the terminal set form the leaves.
In recent times, GP has become a highly efficient tool for data-driven modeling
and unlike neural networks, it does not require any pre-defined configuration of
weights, biases, and transfer functions. Thus, it can evolve any mathematical function
representing the system being modeled and can also use the logical conditionals when
needed. In conventional GP, the trees are selected based upon their minimum root
mean square error (RMSE). However, the tree with the minimum error may often lead to
overfitting; hence, BioGP [9] maximizes
the accuracy of the tree (in other words, minimizes its RMSE) and simultaneously
minimizes its complexity (measured as an average of the depth of the tree and the
number of function nodes used) so that overfitting can be avoided. This results in
a bi-objective optimization problem, which is solved using the PPGA algorithm.
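The two BioGP objectives can be illustrated on a toy expression tree; the nested-tuple encoding and the restriction to the operators + and * are hypothetical simplifications for the sketch, not the chapter's actual encoding.

```python
import math

# ('+', ('*', 'x', 'x'), 2.0) encodes x*x + 2; only '+' and '*' are supported here
TREE = ('+', ('*', 'x', 'x'), 2.0)

def evaluate(node, x):
    """Recursively evaluate a nested-tuple expression tree at point x."""
    if node == 'x':
        return x
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    l, r = evaluate(left, x), evaluate(right, x)
    return l + r if op == '+' else l * r

def depth(node):
    """Tree depth; terminals (leaves) count as depth 0."""
    if not isinstance(node, tuple):
        return 0
    return 1 + max(depth(node[1]), depth(node[2]))

def function_nodes(node):
    """Number of operator (function-set) nodes in the tree."""
    if not isinstance(node, tuple):
        return 0
    return 1 + function_nodes(node[1]) + function_nodes(node[2])

def objectives(tree, xs, ys):
    """BioGP-style objective pair: (RMSE, complexity)."""
    rmse = math.sqrt(sum((evaluate(tree, x) - y) ** 2
                         for x, y in zip(xs, ys)) / len(xs))
    complexity = (depth(tree) + function_nodes(tree)) / 2
    return rmse, complexity
```

For data generated from x*x + 2 the tree above fits exactly (RMSE 0) with complexity 2, since it has depth 2 and two function nodes; a deeper tree fitting the same data equally well would be dominated on the complexity axis.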
Furthermore, conventional GP suffers from a problem called bloat [5], where
the trees become very large and insensitive to any further crossover or mutation.
Also, execution errors such as division by zero often occur in a tree, and these
are quite cumbersome to fix in the case of a large tree. To
circumvent this problem, BioGP grows a number of small sub-trees and agglomerates
them through a set of weights and biases using the Linear Least Squares Approach
(LLSQ) [19]. A carefully designed error reduction ratio [5] keeps tabs on the per-
formance of each sub-tree. This prevents trees from growing unmanageably large,
thereby preventing bloat, and also ensures mathematically acceptable convergence,
since the combination is obtained through the LLSQ algorithm instead of a GA. Fixing the
problems of any rogue tree also becomes simpler here. Upon convergence, a set of trees
constituting an approximation of the best possible trade-offs emerges; among them, as
indicated before, the one with the minimum error is taken as the best metamodel and
can be further used as an objective in the PPGA or cRVEA algorithms.
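The LLSQ combination step can be sketched with ordinary least squares; the "sub-trees" below are stand-in lambdas rather than evolved GP trees, and the exact fitting routine used in BioGP may differ.

```python
import numpy as np

subtrees = [lambda x: x, lambda x: x * x]      # stand-in "sub-tree" outputs
xs = np.linspace(0.0, 1.0, 20)
ys = 3.0 * xs + 2.0 * xs ** 2 + 1.0            # target: 3x + 2x^2 + 1

# Design matrix: one column per sub-tree output, plus a bias column.
A = np.column_stack([f(xs) for f in subtrees] + [np.ones_like(xs)])
coeffs, *_ = np.linalg.lstsq(A, ys, rcond=None)
# coeffs recovers the sub-tree weights (3, 2) and the bias 1 for this noiseless target
```

Because the weights and bias come from a closed-form least-squares solve rather than evolution, this part of the model converges deterministically, which is the point made above.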
An adaptive version of BioGP is also incorporated in commercial software
named KIMEME, developed by Cyber Dyne Srl [11]. The software provides an interac-
tive environment for the modeling and optimization of multi-objective problems using
various paradigms, including BioGP. A typical KIMEME interface with BioGP is
shown in Fig. 1. In KIMEME, the encoding is done in JAVA, which users cannot
access or modify. The open-source versions of all three algorithms presented here
are, however, available in both MATLAB and Python.
2.2 EvoNN
In the EvoNN algorithm, a population of Artificial Neural Networks (ANNs) [23, 24]
of various architectures acts as the prey in PPGA. The architecture of the Evolutionary
Neural Network (EvoNN) model is quite simple. An ANN consists of nodes, divided
into the input layer (which takes the input from the variables), the output layer (which
gives the output based on the inputs), and one hidden layer. Each node is connected
to one or more nodes above it, i.e., toward the direction of the output node. Each
connection has a certain weight attached to it. The more connections, the more
complex the neural network becomes. The number of active connections
and the values of the weights vary between the members of the population. The hidden
nodes take the values provided by the nodes connected to them, multiplied by the
weights of the connections, and apply a transfer function to map an output, which is
forwarded to the next layer.
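A single EvoNN-style network evaluation can be sketched as follows; the sigmoid transfer function and all shapes are illustrative assumptions, not the chapter's reference code.

```python
import numpy as np

def sigmoid(z):
    """An assumed transfer function for the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_lower, b_lower, w_upper, b_upper):
    """x: (n_inputs,); w_lower: (n_hidden, n_inputs); w_upper: (n_hidden,)."""
    hidden = sigmoid(w_lower @ x + b_lower)   # lower part, evolved by PPGA
    return w_upper @ hidden + b_upper         # linear upper part, fitted by LLSQ

def n_active_connections(w_lower):
    """EvoNN-style complexity: count of non-zero lower-layer connections."""
    return int(np.count_nonzero(w_lower))
```

Setting a lower-layer weight to zero prunes that connection, which is exactly what the complexity objective counts.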
In EvoNN, the lower level connections are optimized using PPGA. The two objec-
tives to be optimized here are the error in the outputs of the model, and the complexity
of the model, measured by counting the number of active connections in the lower
part of the network. However, when using the corrected Akaike Information Criterion
(AICc) [1], the weights in both the upper and lower layers, as well as the biases, are con-
sidered. A crossover operation, in this case, is defined as the swapping of some of the
connections between two neural networks, whereas the mutation operation involves
a change in the value of the weights. For this, the weight to be mutated is pro-
vided with a perturbation through the corresponding weights of two other randomly
selected individuals.
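The mutation just described, perturbing a weight through the corresponding weights of two other randomly selected individuals, can be sketched as below; the scaling factor is an assumed illustrative parameter.

```python
import random

def mutate_weight(w, w_a, w_b, factor=0.5):
    """Perturb w through the difference of two peers' corresponding weights
    (a differential-evolution-like step); the random draw keeps the
    perturbation stochastic."""
    return w + factor * random.random() * (w_a - w_b)
```

When the two selected peers carry similar weights the perturbation is small, so the population's own spread controls the mutation step size.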
The upper part of the network, i.e., the output layer, uses a linear transfer function
and is optimized using the LLSQ algorithm. This ensures mathematical convergence
of the algorithm. A pseudo-code of EvoNN is shown in Algorithm 1.
Algorithm 1 Pseudocode of EvoNN
begin train
Scale data in between [0, 1]
Generate 2-D toroidal lattice with dimension defined by user
Create a random population of Neural Networks, Prey, defined by
mentioned architecture and nodes in hidden layer
begin PPGA
for each member in Prey do
Deactivate some connections based on a fixed probability
Place member in random locations
end for
Create a population of predators, linearly distributed in [0, 1]
Place predators in random lattice positions
for all generations do
for all layers in Prey do
Calculate standard deviation in chromosome.
end for
Create empty new Prey set
for all members in Prey do
Find an empty lattice position in the Moore neighborhood for Prey member
Move Prey member to new location based on probability
end for
for all members in Prey do
Find Prey members in Moore neighborhood
Choose one Prey member
end for
for all layers in Prey member do
Create two offspring by performing Crossover and Mutation
repeat
Choose random lattice position
if empty then
Place offspring
Add offspring to Prey
end if
until offspring are placed or 10 tries have been made
end for
for all members in Prey do
Evaluate Error and Complexity
end for
Find rank of Prey members
if kill interval condition satisfied then
Kill Prey with ranks worse than Maximum rank
Create new random population of ANNs for new Preys
end if
for all members in Predators do
Calculate number of predator moves
for number of moves do
Move Predator
end for
end for
for all members in new Prey set do
repeat
Choose random lattice position
if empty then
place member
end if
until members are placed or 10 tries have been made
end for
end for
end PPGA
Find and save Prey members at Pareto Front
Find Prey member with least AICc
Display and save Training
end train
2.3 EvoDN2
EvoNN is not capable of training large datasets of 10,000 or more data points;
neural networks with one hidden layer are not sufficient for this. To address this, EvoNN is aug-
mented into a Deep Neural Network (DNN), giving rise to the Evolutionary Deep
Neural Network (EvoDN) algorithm [26]. Instead of one hidden layer as in EvoNN,
more than one hidden layer is used here with a flexible number of nodes for each
layer. Similar to EvoNN, connections are made from input variables to the nodes
with random weights, some of which are provided with a zero value to discard those
connections. The last layer is converged using the same LLSQ method. Technically,
the EvoDN code needed some modification in handling the DNN preys compared
to ANN preys of EvoNN. In EvoNN, a 2-D matrix of connections for each prey is
enough, but in EvoDN, with different dimensions of 2-D matrices for each layer, the
concept of cell structure is introduced in the code.
Further modification is done to EvoDN leading to Evolutionary Deep Neural
Network coupled with subnets (EvoDN2) [26]. With this feature, we can divide the
dataset variables into subsets, which are individually passed through deep neural nets
with their individual layers and nodes. At the final layer, these subnets are converged
using LLSQ. This leads to much better training, as it offers high flexibility. The
number of layers for each subnet and how the variable set is divided into subsets are
all defined by the user as per preference. Increasing the number of layers or subnets may
improve the fit; however, it may also lead to overfitting, and hence PPGA is used to strike
a balance between complexity and accuracy. A schematic of EvoDN2 is shown in
Fig. 2.
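The subnet layout can be sketched as follows, assuming, purely for illustration, sigmoid activations and a simple weight-matrix-per-layer representation; the actual EvoDN2 cell structure is more elaborate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def subnet_forward(x, layers):
    """Pass x through one subnet, given as a list of 2-D weight matrices."""
    out = x
    for w in layers:
        out = sigmoid(w @ out)
    return out

def evodn2_forward(x, variable_subsets, subnets, final_weights):
    """Each subnet sees only its subset of the variables; the subnet outputs
    are concatenated and merged by a final linear layer (fitted with LLSQ)."""
    parts = [subnet_forward(x[idx], layers)
             for idx, layers in zip(variable_subsets, subnets)]
    return final_weights @ np.concatenate(parts)
```

Splitting the variables keeps each subnet's weight matrices small, which is where the flexibility described above comes from.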
For ANNs, the number of connections with non-zero weights between the input and
the hidden nodes is summed up to get the complexity. This is redefined for EvoDN2.
Here, it is taken as

    C = Σ_i Σ |w_i| ,    (6)

where C is the complexity, w_i is the matrix of the weights of the connections
between the ith and (i + 1)th layers, the inner sum runs over all elements of w_i, the outer
sum runs over all such matrices, and n denotes the total number of layers. This
will ignore inactive connections which have no effect on the final output and also will
give certain importance to the connections with larger weights. The mutation and
crossover operators are also modified in EvoDN2 to speed up the computing process
[26]. Unlike EvoNN, EvoDN2 does not use the corrected Akaike Information Criterion
(AICc), as it is not applicable to this structure with the new definition of complexity.
Here, the model with the least error is chosen. A pseudo-code of EvoDN2 is shown
in Algorithm 2.
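Eq. (6) amounts to summing the absolute values of all weights across the layer-to-layer matrices, so inactive (zero) connections contribute nothing and large weights count more, for example:

```python
import numpy as np

def complexity(weight_matrices):
    """Eq. (6)-style complexity: sum of |w| over every element of every
    layer-to-layer weight matrix."""
    return float(sum(np.abs(w).sum() for w in weight_matrices))
```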
2.4 cRVEA
3 Test Functions
These algorithms were tested by training models on datasets created using a plethora
of standard testing suites [7, 12, 22, 32]. Each suite comprises a number of opti-
mization problems, specially designed to test various special cases of optimization.
The functions available in these test suites were used to check the efficacy of training
through BioGP, EvoNN, and EvoDN2. The functions in these test problems are fairly
complicated and are often quite difficult to model for various reasons. The
number of independent variables can range anywhere between 10 and 30 in the
default configurations of these problems.
Many problems are discrete in the objective space, and many are non-uniform,
meaning that the density of solutions in the objective space varies from region to
region. These algorithms performed very well on most such functions, and in the
case of some functions with a huge number of local optima, overfitting was clearly
avoided. The details of these numerical experiments are being uploaded to a public-
domain platform [21] and are not repeated here; instead, the gist of a paradigm case
is included. The results of training by EvoNN, BioGP, and EvoDN2 are compared
to see how they fare. Datasets of 100, 1,000, and 10,000 entries were produced to get an
idea of how well these algorithms can train on them.
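As an illustration of how such training data can be generated, the second objective of ZDT2 in its usual formulation is sampled below at 1,000 random points (30 variables, the default configuration); the sampling scheme is an assumption for the sketch.

```python
import random

def zdt2_f2(x):
    """Second objective of ZDT2 for a decision vector x in [0, 1]^n:
    g = 1 + 9 * mean(x_2..x_n), f2 = g * (1 - (x_1 / g)^2)."""
    g = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    return g * (1.0 - (x[0] / g) ** 2)

random.seed(42)                               # reproducible sampling
points = [[random.random() for _ in range(30)] for _ in range(1000)]
dataset = [(x, zdt2_f2(x)) for x in points]   # (input vector, target) pairs
```

A metamodel is then trained on `dataset` and its predictions compared against the calculated objective, which is how the correlation coefficients quoted below are obtained.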
All three training algorithms can cover the entire spectrum of data points quite
well. The spread of the calculated second objective of ZDT3 [7], along with
the values predicted by EvoDN2, is shown in Fig. 3. It was observed that EvoDN2
can predict the entire spectrum for 1,000 data points quite well. BioGP and EvoNN
fare similarly for 100 or 1,000 data points, but for a larger dataset of 10,000 entries,
EvoDN2 is far better than both of them. This can be seen from the predicted versus
calculated second objective of the ZDT2 function for all three algorithms, presented
in Fig. 4. It can be seen that for this large dataset, EvoDN2 predicts best, with a
correlation coefficient of ≈0.92, while the corresponding correlation coefficients for
EvoNN and BioGP are ≈0.67 and ≈0.81, respectively. Though both EvoNN and
BioGP work quite nicely for 100 and 1,000 data points, handling a large dataset with
them is quite difficult. EvoDN2, however, can predict 10,000 data points quite adequately.
Thus, the development of EvoDN2 based on deep neural nets proves to be quite
useful.
Fig. 4 Trained second objective of ZDT2 versus calculated by (from top to bottom) a BioGP, b
EvoNN, and c EvoDN2
The real-life applications of the extant BioGP and EvoNN are already reported in numer-
ous publications [8, 9, 13, 14, 20, 23, 24]. EvoDN2 is a newer algorithm, but its appli-
cations have also begun to appear. Since the details of those specific applications are beyond
the scope of this article, a few are briefly mentioned below.
The application of EvoNN in the bi-objective study of an ironmaking blast furnace
[23] or analysis of leaching data of low-grade manganese ores [24] and that of BioGP
in simulated moving bed processing [9] are already documented. EvoDN2 has been
used in our laboratory to predict optimum mechanical properties of microalloyed
steel [26], to study complex blast furnace data [16, 17] involving as many as
eight objectives, including the total gas flow in the furnace, coke rate, plate cooling
heat loss, etc., and also [27] to compute the parameters of the Modified Embedded
Atom Method (MEAM) potential [2] for aluminum, used in materials design through
molecular dynamics (MD) [10], focused on reducing the errors between calculated phys-
ical properties and the Density Functional Theory (DFT) results of Voter and
Chen [28].
Many of these problems require dealing with more than three objectives, where, due to
a lack of selection pressure, many common evolutionary algorithms fail. The cRVEA
[3] algorithm, however, can handle such situations and has therefore been used
extensively to optimize them. A Single Variable Response (SVR) procedure is also
embedded in these algorithms [19]. The SVR strategy applies large perturbations
to a single variable while keeping the others fixed at a base level, thus registering
the effect of that particular variable on the property under study. The applications of
these algorithms in the field of materials research are thus well documented.
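The SVR procedure can be sketched as below; the model is a stand-in function rather than a trained metamodel, and the sweep range, base level, and step count are assumptions for illustration.

```python
def single_variable_response(model, n_vars, var_index, base=0.5, steps=11):
    """Sweep one variable over [0, 1] while holding the others at `base`,
    recording the model's response at each step."""
    responses = []
    for k in range(steps):
        x = [base] * n_vars
        x[var_index] = k / (steps - 1)   # perturb only this variable
        responses.append(model(x))
    return responses

def toy_model(x):
    """Stand-in for a trained metamodel."""
    return 2.0 * x[0] + 0.1 * x[1]

curve = single_variable_response(toy_model, n_vars=2, var_index=0)
```

The shape of `curve` then reveals whether, and how strongly, the swept variable drives the property under study.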
5 Concluding Remarks
References
1. Akaike H (2011) Akaike’s information criterion. In: Lovric M (ed) International encyclopedia
of statistical science. Springer, Berlin
2. Baskes MI (1992) Modified embedded-atom potentials for cubic materials and impurities. Phys
Rev B 46(5):2727–2742
3. Cheng R, Jin Y, Olhofer M, Sendhoff BA (2016) A reference vector guided evolutionary
algorithm for many-objective optimization. IEEE Trans Evol Comput 20(5):773–791
4. Coello Coello CA, Becerra RL (2009) Evolutionary multiobjective optimization in materials
science and engineering. Mater Manuf Process 24(2):119–129
5. Collet P (2007) Genetic programming. In: Rennard J-P (ed) Handbook of research on nature-
inspired computing for economics and management, vol 1, Chapter V. IGI Global, Pennsylva-
nia, pp 59–73
6. Deb K (2001) Multi-objective optimization using evolutionary algorithms. Wiley, Hoboken
7. Deb K, Thiele L, Laumanns M, Zitzler E (2002) Scalable multi-objective optimization test
problems. In: Proceedings of the 2002 congress on evolutionary computation – CEC’02. IEEE,
pp 825–830
8. Forrester A, Sóbester A, Keane A (2008) Engineering design via surrogate modelling: a prac-
tical guide. Wiley, Hoboken
9. Giri BK, Hakanen J, Miettinen K, Chakraborti N (2013) Genetic programming through bi-
objective genetic algorithms with a study of a simulated moving bed process involving multiple
objectives. Appl Soft Comput 13(5):2613–2623
10. Hansson T, Oostenbrink C, van Gunsteren W (2002) Molecular dynamics simulations. Curr
Opin Struct Biol 12(2):190–196
11. Iacca G, Mininno E (2015) Introducing kimeme, a novel platform for multi-disciplinary multi-
objective optimization. In: Rossi F, Mavelli F, Stano P, Caivano D (eds) Advances in artificial
life, evolutionary computation and systems chemistry. 10th Italian Workshop, WIVACE 2015
(Bari, 2015). Springer, Cham, pp 40–52
12. Jiménez F, Gómez-Skarmeta AF, Sánchez G, Deb K (2002) An evolutionary algorithm for
constrained multi-objective optimization. In: Proceedings of the 2002 congress on evolutionary
computation – CEC’02. IEEE, pp 1133–1138
13. Kant A, Suman PK, Giri BK, Tiwari MK, Chatterjee C, Nayak PC, Kumar S (2013) Comparison
of multi-objective evolutionary neural network, adaptive neuro-fuzzy inference system and
bootstrap-based neural network for flood forecasting. Neural Comput Appl 23(1):231–246
14. Kumar Sahu R, Halder C, Sen PK (2016) Optimization of top gas recycle blast furnace emissions
with implications of downstream energy. Steel Res Int 87: 1190–1202. https://doi.org/10.1002/
srin.201500312
15. Li X (2003) A real-coded predator-prey genetic algorithm for multiobjective optimization. In:
2nd international conference on evolutionary multi-criterion optimization (EMO 2003) (Faro,
2003). Proceedings. Springer, Berlin, pp 207–221
16. Mahanta BK, Chakraborti N (2018) Evolutionary data driven modeling and multi objective
optimization of noisy data set in blast furnace iron making process. Steel Res Int 89:1800121
(11 p)
17. Mahanta BK, Chakraborti N (2020) Tri-objective optimization of noisy dataset in blast furnace
iron-making process using evolutionary algorithms. Mater Manuf Process 35(6):677–686
18. Miettinen K (1998) Nonlinear multiobjective optimization. Springer, New York
19. Mondal DN, Sarangi K, Pettersson F, Sen PK, Saxén H, Chakraborti N (2011) Cu-Zn sep-
aration by supported liquid membrane analyzed through multi-objective genetic algorithms.
Hydrometallurgy 107(3–4):112–123
20. Nguyen TN, Siegmund T, Tsutsui W, Liao H, Chen W (2016) Bi-objective optimal design of
a damage-tolerant multifunctional battery system. Mater Design 105:51–65
21. Ojalehto V, Miettinen K (2019) DESDEO: an open framework for interactive multiobjective
optimization. In: Huber S, Geiger MJ, de Almeida AT (eds) Multiple criteria decision making
and aiding: cases on models and methods with computer implementations. Springer, Cham, pp
67–94
22. Osyczka A, Kundu S (1995) A new method to solve generalized multicriteria optimization
problems using the simple genetic algorithm. Struct Optim 10(2):94–99
23. Pettersson F, Chakraborti N, Saxén H (2007) A genetic algorithms based multi-objective neural
net applied to noisy blast furnace data. Appl Soft Comput 7(1):387–397
24. Pettersson F, Biswas A, Sen PK, Saxén H, Chakraborti N (2009) Analyzing leaching data
for low-grade manganese ore using neural nets and multiobjective genetic algorithms. Mater
Manuf Process 24(3):320–330
25. Poli R, Langdon WB, McPhee NF (2008) A field guide to genetic programming. http://www.
gp-field-guide.org.uk
26. Roy S, Saini B, Chakrabarti D, Chakraborti N (2020) Mechanical properties of micro-alloyed
steels studied using an evolutionary deep neural network. Mater Manuf Process 35(6):611–624
27. Roy S, Dutta A, Chakraborti N (2021) A novel method of determining interatomic potential for
Al and Al-Li alloys and studying strength of Al-Al3 Li interphase using evolutionary algorithms.
Comput Mater Sci 190:110258
28. Voter AF, Chen SP (1986) Accurate interatomic potentials for Ni, Al and Ni3 Al. MRS Proc
82:175–180
29. Wason R (2018) Deep learning: evolution and expansion. Cogn Syst Res 52:701–708
30. Wilson B (2014) The machine learning dictionary. http://www.cse.unsw.edu.au/~billw/mldict.
html
31. Wolfram S (1983) Statistical mechanics of cellular automata. Rev Mod Phys 55(3):601–644
32. Zitzler E, Deb K, Thiele L (2000) Comparison of multiobjective evolutionary algorithms:
empirical results. Evol Comput 8(2):173–195
Artificial Intelligence and Computational
Science
Abstract In this note, we discuss the interaction between two ways of scientific
analysis. The first (classical) way is known as Mathematical Modeling (MM). It is
based on a model created by humans and presented in mathematical terms. Scientific
Computing (SC) is an important tool of MM developed to quantitatively analyze the
model. Artificial Intelligence (AI) forms a new way of scientific analysis. AI systems
arise as a result of a different process. Here, we take a sequence of correct input–output
data, perform Machine Learning (ML), and get a model (hidden in a network). In this
process, computational methods are used to create a network type model. We briefly
discuss special methods used for this purpose (such as evolutionary algorithms), give
a concise overview of results related to applications of AI in computer simulation of
real-life problems, and discuss several open problems.
The scientific approach that currently dominates in science and technology is the
result of a long evolution of human knowledge. It has a long history dating back
to antiquity. At the core of this approach is what in modern terminology is called
model. At first, models were very simple. They were intuitively motivated and verified
in simple physical experiments (as, e.g., those performed in Pisa by Galileo Galilei). The
means of elementary mathematics were quite enough to formalize such models and
their verification was done by direct comparison with experimental data.
As science developed, the corresponding mathematical models became more and
more complex. Initially algebraic, they later began to use differential and integral
calculus. Nowadays, mathematical models are typically formed by systems that
include partial differential equations coupled with integral and algebraic equations
and other relations. Mathematical models often give good quantitative results that
are useful for analysis and forecasting, but getting them is impossible without serious
computations based on powerful computers, correct approximation methods, adap-
tive numerical algorithms, and error control. Therefore, the complexity of mathemat-
ical models has generated a new scientific direction, computational science, which is
also called scientific computing (SC). It is an essential part of the classical scientific
method focused on a computational model and analysis of numerical results. We can
define it as follows:
related to the model is completely in the hands of humans. The figure depicts one
cycle of a repeatable process, which starts with an original mathematical model based
on certain theoretical analysis. Certainly, the model can be specified and modified,
but this is exclusively a matter of the researcher. After the model has passed the
necessary verification, it can be used independently so that the Mathematical Model
and Computations replace the Experiment.
The scientific method has been used for many centuries and will undoubtedly be
successfully used further. If we talk about complex and highly intellectual problems,
then at present and in the near future, no competitors are foreseen for this modus
operandi. Nevertheless, there are many other important and interesting tasks for the
study (and use) of which a different approach can be applied.
2 Genetic Algorithms
It is not surprising that the idea of transferring the technology developed by life-
forms to manufactured objects came to many scientists and engineers. A science
called bionics (or biomimetics) studies possible applications of “technological solu-
tions” encompassed in biological objects to various engineering problems (see, e.g.,
[3]). A form of this idea in the application to computational sciences is as follows:
Combine methods of computational mathematics with principles of natural selection.
The first works in this direction appeared in the mid-1960s. They were mostly
concerned with relatively simple optimization problems solved by so-called evo-
lutionary algorithms (EA) (also called genetic algorithms). These methods use selec-
tion principles analogous to those that regulate the successful evolution of species (see,
e.g., [4–7]). A genetic algorithm operates with “populations” and generates a new
population by modification of the current population and selection of those “indi-
viduals” that are most acceptable with respect to a certain selection principle. Each
successive population is considered a “new generation”. The selection principle is
defined such that only the “individuals” that satisfy it can be considered as solutions
to the problem studied. Hence, the core of this method is an artificial system of
competitors, which consists of mathematical objects.
There are many different modifications of genetic algorithms that differ in how a
“generation” is changed (from simple stochastic disturbance to crossbreeding of most
successful individuals) and how a new set of individuals is formed. One of the first
areas of application was the problem of minimizing functions, where the criterion for
selection is the value of this function. Despite the large number of different studies,
a complete theory of evolutionary algorithms has not yet been created. In the majority
of cases, the algorithms are purely heuristic, and the results related to convergence
and its rate are very rare. Moreover, EAs are rather slow and cannot compete with
well-known deterministic algorithms (if for the corresponding optimization problem
such an algorithm is applicable).
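A deliberately minimal evolutionary algorithm of the kind described above, stochastic disturbance plus selection on the function value, might look like this; it is illustrative only, and real EAs add crossover and more elaborate selection.

```python
import random

def evolve(f, pop_size=20, generations=100, seed=1):
    """Minimize f over one real variable by disturb-and-select."""
    rng = random.Random(seed)
    population = [rng.uniform(-5.0, 5.0) for _ in range(pop_size)]
    for _ in range(generations):
        # "new generation": stochastic disturbance of every individual
        offspring = [x + rng.gauss(0.0, 0.3) for x in population]
        # selection principle: keep the pop_size best of parents and offspring
        population = sorted(population + offspring, key=f)[:pop_size]
    return population[0]

best = evolve(lambda x: (x - 2.0) ** 2)   # true minimum at x = 2
```

Note the dynamical-system flavor mentioned above: without the elitist selection, the population would keep fluctuating around the minimum instead of settling on it.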
Sometimes, EAs are considered as a special class of probabilistic optimization
methods, which is worth trying only if no other method can be used. This viewpoint
seems to be too limited. There is no doubt that EAs have a much wider scope than
optimization. From a mathematical point of view, they are closer to discrete dynam-
ical systems and have some typical features, e.g., a sequence of “populations” may
not converge to the desired solution (as in a deterministic algorithm) but fluctuates
around it. The latter behavior is typical of a dynamical system in the vicinity of an
attractor.
The invention of EAs has made an important step toward AI technologies because
they solve complicated problems of very different origin without using deep math-
ematical models. Not being very effective for well-studied classes of optimization
problems, they form a basis for new methods commonly used in machine learning
of networks.
A strict and commonly accepted definition of artificial intelligence (AI) does not
yet exist. Sometimes, it is understood that AI arises in systems that are able to find
acceptable solutions to complex problems in conditions of incomplete information.
Such types of problems arise in the recognition of images, classification of objects,
decision-making, and many other areas. The ability to analyze and solve such prob-
lems determines the cognitive properties of the system. Therefore, sometimes the
question of the presence or absence of AI in a system is associated with the level of
its cognitive abilities. Biological systems often have a high level of recognition in
areas where it is essential for survival.
Artificial neural networks (ANN) are inspired by ideas of biomimetics. (These
issues are examined in depth and in detail in [8, 9]). However, the question of where
the border between intelligent and non-intelligent systems lies remains open. If we
talk about technical systems, then we can offer the following feature, which allows
us to select a system with AI: it is impossible (or very difficult) to establish why
such a system gave a concrete answer. Moreover, when reprocessing exactly
the same initial data, the system can often give a somewhat different answer. Computing
systems without intellectual properties behave quite differently: they act
according to completely defined rules and always produce the same result for the
same input data.
In modern literature, methods for creating artificial systems with some intelli-
gence (in the above sense) are called machine learning (ML). Image recognition and
inverse problems were the first classes of real-life problems where these methods
have demonstrated high efficiency (see, e.g., [10, 11] and the references cited in these
publications). Recently, similar approaches have been used as new tools of scientific
computing and mathematical modeling, in particular, for getting approximations of
differential equations (see [12–19]), quantitative analysis of energy type mathemat-
ical models in mechanics [20–22], automatic differentiation [23], and optimization
of a robotic system [24].
Here, the principal scheme of data processing and generation of a “model” differs
essentially from the scheme in Fig. 1. In essence, human participation is limited to
three things. They are
1. the network structure and its size, i.e., the researcher defines a certain class of
models among which a suitable model must be found (in many cases, the class
of deep neural networks (DNN) having multiple layers of neurones is used);
2. the quality criterion used for comparing the actual (correct) output data with
the data generated by the network; typically, this criterion is a version of the
least squares principle (see, e.g., [9, 10, 13]);
3. the adaptation or optimization algorithm used in the iterative procedure of net-
work changing (methods of nonlinear programming and structural optimization
are often used as the basic tools).
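The three human-made choices above can be sketched in a few lines (a minimal, hypothetical setup: the model class is a one-hidden-layer network of width H, the quality criterion is least squares, and the adaptation algorithm is plain gradient descent with numerically estimated gradients standing in for backpropagation):

```python
import numpy as np

rng = np.random.default_rng(0)

# (1) Model class: a one-hidden-layer network with H neurones,
#     fixed by the researcher before any training starts.
H = 8
W1, b1 = rng.normal(size=(H, 1)), np.zeros((H, 1))
W2, b2 = rng.normal(size=(1, H)), np.zeros((1, 1))

def net(x):
    return W2 @ np.tanh(W1 @ x + b1) + b2

# (2) Quality criterion: least-squares comparison of the network
#     output with the correct data.
def loss(x, y):
    return float(np.mean((net(x) - y) ** 2))

# (3) Adaptation algorithm: gradient descent; the gradient is
#     estimated by central differences (crude but self-contained).
def step(params, x, y, lr=0.05, eps=1e-5):
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            up = loss(x, y)
            p[idx] = old - eps
            down = loss(x, y)
            p[idx] = old
            g[idx] = (up - down) / (2 * eps)
        p -= lr * g

x = np.linspace(-1, 1, 20).reshape(1, -1)
y = x ** 2                       # target input–output data
before = loss(x, y)
for _ in range(100):
    step([W1, b1, W2, b2], x, y)
print(before, "->", loss(x, y))
```

Everything else — the actual weight values the network ends up with — is produced by the automatic iteration, not by the researcher.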
In the case of supervised machine learning, we also need a sufficiently wide set
of actual data that can be used in the training process. Therefore, an important
component of this technology is the formation of such a data set.
The scheme in Fig. 2 presents one cycle of the learning process, which starts with a
generator of input data. Usually, the learning process occurs fully automatically,
without human intervention. As a result, we obtain a network whose structure and
weights are adapted to solving a particular problem. Thus, machine learning creates
a model presented in terms of the network structure.
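One such cycle can be sketched as follows (a hypothetical one-neurone example: the generator stands in for the set of actual training data, and the update step stands in for the adaptation algorithm; the loop runs with no human intervention):

```python
import numpy as np

rng = np.random.default_rng(42)

# "Correct" input–output mechanism the network should learn
# (in practice, this role is played by the actual training data).
def generator(n=32):
    x = rng.uniform(-1, 1, size=n)
    return x, 3.0 * x - 1.0          # hypothetical ground truth

w, b = 0.0, 0.0                      # the "network": one linear neurone
for cycle in range(500):             # runs fully automatically
    x, y_true = generator()          # 1. generator of input data
    y_net = w * x + b                # 2. data generated by the network
    err = y_net - y_true             # 3. comparison with correct output
    w -= 0.1 * np.mean(err * x)      # 4. adaptation of the network
    b -= 0.1 * np.mean(err)

print(w, b)   # close to the underlying rule (3, -1)
```

The result of the loop — the adapted pair (w, b) — is precisely the "model presented in terms of the network structure" that ML produces.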
A comparison of Figs. 1 and 2 shows the fundamental differences between the two
methods of analysis. The main idea of Mathematical Modeling (and of its part,
Scientific Computing) is: a correct mathematical model supplied with a proper
computational method provides the required information about a process or
phenomenon. The main concept of ML is rather different: using a sufficiently
representative set of correct input–output data, create a network-type model of a
process or phenomenon.
AI methods are at the very beginning of their development, and a unified theory of
them is yet to be created. We believe that such a theory will appear at the crossroads
of discrete mathematics, group theory, representation theory, and computational
mathematics. Below, we briefly discuss three open problems that are fundamental
to understanding what AI is and how to create AI systems. To the best of our
knowledge, they remain unsolved. New results related to these problems could
form a basis of the forthcoming theory of AI.
Problem of Teaching
The learning process is usually related to the optimization of a high-dimensional
nonlinear functional (or a set of such functionals). Usually, this functional is
nonconvex and has a complex structure with numerous local extrema and stationary
points. Therefore, known minimization algorithms (which rely on additional
assumptions such as convexity or unimodality) may not be efficient enough, and we
do not know how to guarantee that the training process used is indeed efficient and
generates a network with a structure close to the best possible (among other
networks of the same size).
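The difficulty can be seen already in one dimension: gradient descent on a simple nonconvex function converges to stationary points of different quality depending on where it starts (an illustrative toy landscape, not an actual network loss):

```python
# A nonconvex "loss" with two local minima of different depth,
# mimicking in one dimension the landscape met in network training.
f  = lambda x: x**4 - 2 * x**2 + 0.3 * x
df = lambda x: 4 * x**3 - 4 * x + 0.3

def gradient_descent(x0, lr=0.01, steps=2000):
    x = x0
    for _ in range(steps):
        x -= lr * df(x)
    return x

left, right = gradient_descent(-1.5), gradient_descent(1.5)
print(left, f(left))     # the deeper local minimum
print(right, f(right))   # a different, shallower local minimum
```

Both runs satisfy the first-order optimality condition, yet only one of them finds the better minimum; in millions of dimensions, no known algorithm certifies which case has occurred.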
Selection of a Suitable Class of Networks
One of the first major tasks to be solved is to set the structure and size of the
network that we are going to train. The problem is how to adjust the topological
structure and parameters of a network to the complexity of the problem in question.
It would be desirable to have quantitative a priori criteria able to define the number
of layers and neurones depending on the number of output parameters, the
variability of the data, and the desired accuracy.
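No such a priori criterion is currently known; the one quantity that is easy to compute before any training is the number of trainable parameters of a candidate structure (a simple counting sketch for fully connected networks, with hypothetical layer widths):

```python
def mlp_parameter_count(layer_widths):
    """Number of trainable weights and biases in a fully connected
    network with the given layer widths (input, hidden..., output)."""
    return sum(w_in * w_out + w_out            # weight matrix + bias vector
               for w_in, w_out in zip(layer_widths, layer_widths[1:]))

# Two candidate structures for the same task: a priori, we can compare
# only their capacity in parameters, not their eventual accuracy.
print(mlp_parameter_count([10, 32, 1]))        # shallow: 385
print(mlp_parameter_count([10, 32, 32, 1]))    # deeper: 1441
```

A rigorous criterion would have to relate such counts to the problem's complexity, data variability, and target accuracy — exactly the open question posed above.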
Why Does a Neural Network Model Work?
If a network has a simple structure and consists of a small number of elements, then
it is usually possible to understand why it functions correctly. However, simple
networks can handle only simple problems. Future development of AI technologies
will inevitably lead to very complicated networks created for the analysis of serious
scientific (technological, medical, and social) problems. It is quite probable that
some advanced AI systems will work more effectively than standard methods of
mathematical modeling. In that case, scientists will want to understand why a
particular AI system works better than their models. To answer this question, it is
necessary to “decode” the information encompassed in the “black box” structure
and translate it into notations and terms accessible to humans. This problem does
not attract much attention now but may become one of the fundamental problems in
the near future. We must admit that there is no guarantee that it will always be
solvable.
5 Conclusions
In the month of April, 1872, I had the honor to attend and preside
over a National Convention of colored citizens, held in New Orleans. It
was a critical period in the history of the Republican party, as well as in
that of the country. Eminent men who had hitherto been looked upon
as the pillars of Republicanism had become dissatisfied with President
Grant’s administration, and determined to defeat his nomination for a
second term. The leaders in this unfortunate revolt were Messrs.
Trumbull, Schurz, Greeley, and Sumner. Mr. Schurz had already
succeeded in destroying the Republican party in the State of Missouri,
and it seemed to be his ambition to be the founder of a new party, and
to him more than to any other man belongs the credit of what was
once known as the Liberal Republican party which made Horace
Greeley its standard bearer in the campaign of that year.
At the time of the Convention in New Orleans the elements of this
new combination were just coming together. The division in the
Republican ranks seemed to be growing deeper and broader every
day. The colored people of the country were much affected by the
threatened disruption, and their leaders were much divided as to the
side upon which they should give their voice and their votes. The
names of Greeley and Sumner, on account of their long and earnest
advocacy of justice and liberty to the blacks, had powerful attractions
for the newly enfranchised class; and there was in this Convention at
New Orleans naturally enough a strong disposition to fraternize with
the new party and follow the lead of their old friends. Against this policy
I exerted whatever influence I possessed, and, I think, succeeded in
holding back that Convention from what I felt sure then would have
been a fatal political blunder, and time has proved the correctness of
that position. My speech on taking the chair on that occasion was
telegraphed from New Orleans in full to the New York Herald, and the
key-note of it was that there was no path out of the Republican party
that did not lead directly into the Democratic party—away from our
friends and directly to our enemies. Happily this Convention pretty
largely agreed with me, and its members have not since regretted that
agreement.
From this Convention onward, until the nomination and election of
Grant and Wilson, I was actively engaged on the stump, a part of the
time in Virginia with Hon. Henry Wilson, in North Carolina with John M.
Langston and John H. Smyth, and in the State of Maine with Senator
Hamlin, Gen. B. F. Butler, Gen. Woodford, and Hon. James G. Blaine.
Since 1872 I have been regularly what my old friend Parker
Pillsbury would call a “field hand” in every important political campaign,
and at each National Convention have sided with what has been called
the stalwart element of the Republican party. It was in the Grant
Presidential campaign that New York took an advanced step in the
renunciation of a timid policy. The Republicans of that State not having
the fear of popular prejudice before their eyes placed my name as an
Elector at large at the head of their Presidential ticket. Considering the
deep-rooted sentiment of the masses against negroes, the noise and
tumult likely to be raised, especially among our adopted citizens of
Irish descent, this was a bold and manly proceeding, and one for which
the Republicans of the State of New York deserve the gratitude of
every colored citizen of the Republic, for it was a blow at popular
prejudice in a quarter where it was capable of making the strongest
resistance. The result proved not only the justice and generosity of the
measure, but its wisdom. The Republicans carried the State by a
majority of fifty thousand over the heads of the Liberal Republican and
the Democratic parties combined.
Equally significant of the turn now taken in the political sentiment of
the country, was the action of the Republican Electoral College at its
meeting in Albany, when it committed to my custody the sealed up
electoral vote of the great State of New York, and commissioned me to
bring that vote to the National Capital. Only a few years before, any
colored man was forbidden by law to carry a United States mail bag
from one post-office to another. He was not allowed to touch the
sacred leather, though locked in “triple steel,” but now, not a mail bag,
but a document which was to decide the Presidential question with all
its momentous interests, was committed to the hands of one of this
despised class; and around him, in the execution of his high trust, was
thrown all the safeguards provided by the Constitution and the laws of
the land. Though I worked hard and long to secure the nomination and
the election of Gen. Grant in 1872, I neither received nor sought office
under him. He was my choice upon grounds altogether free from
selfish or personal considerations. I supported him because he had
done all, and would do all, he could to save not only the country from
ruin, but the emancipated class from oppression and ultimate
destruction; and because Mr. Greeley, with the Democratic party
behind him, would not have the power, even if he had the disposition,
to afford us the needed protection which our peculiar condition
required. I could easily have secured the appointment as Minister to
Hayti, but preferred to urge the claims of my friend, Ebenezer Bassett,
a gentleman and a scholar, and a man well fitted by his good sense
and amiable qualities to fill the position with credit to himself and his
country. It is with a certain degree of pride that I am able to say that my
opinion of the wisdom of sending Mr. Bassett to Hayti has been fully
justified by the creditable manner in which, for eight years, he
discharged the difficult duties of that position; for I have the assurance
of Hon. Hamilton Fish, Secretary of State of the United States, that Mr.
Bassett was a good Minister. In so many words, the ex-Secretary told
me, that he “wished that one-half of his ministers abroad performed
their duties as well as Mr. Bassett.” To those who knew Hon. Hamilton
Fish, this compliment will not be deemed slight, for few men are less
given to exaggeration and are more scrupulously exact in the
observance of law, and in the use of language, than is that gentleman.
While speaking in this strain of complacency in reference to Mr.
Bassett, I take pleasure also in bearing my testimony based upon
knowledge obtained at the State Department, that Mr. John Mercer
Langston, the present Minister to Hayti, has acquitted himself with
equal wisdom and ability to that of Mr. Bassett in the same position.
Having known both these gentlemen in their youth, when the one was
at Yale, and the other at Oberlin College, and witnessed their efforts to
qualify themselves for positions of usefulness, it has afforded me no
limited satisfaction to see them rise in the world. Such men increase
the faith of all in the possibilities of their race, and make it easier for
those who are to come after them.
The unveiling of Lincoln Monument in Lincoln Park, Washington,
April 14th, 1876, and the part taken by me in the ceremonies of that
grand occasion, takes rank among the most interesting incidents of my
life, since it brought me into mental communication with a greater
number of the influential and distinguished men of the country than any
I had before known. There were present the President of the United
States and his Cabinet, Judges of the Supreme Court, the Senate and
House of Representatives, and many thousands of citizens to listen to
my address upon the illustrious man in whose memory the colored
people of the United States had, as a mark of their gratitude, erected
that impressive monument. Occasions like this have done wonders in
the removal of popular prejudice, and in lifting into consideration the
colored race; and I reckon it one of the high privileges of my life, that I
was permitted to have a share in this and several other like
celebrations.
The progress of a nation is sometimes indicated by small things.
When Henry Wilson, an honored Senator and Vice-President of the
United States, died in the capitol of the nation, it was a significant and
telling indication of national advance, when three colored citizens, Mr.
Robert Purvis, Mr. James Wormley, and myself, were selected with the
Senate committee, to accompany his honored remains from
Washington to the grand old commonwealth he loved so well, and
whom in turn she had so greatly loved and honored. It was meet and
right that we should be represented in the long procession that met
those remains in every State between here and Massachusetts, for
Henry Wilson was among the foremost friends of the colored race in
this country, and this was the first time in its history when a colored
man was made a pall-bearer at the funeral, as I was in this instance, of
a Vice-President of the United States.
An appointment to any important and lucrative office under the
United States government, usually brings its recipient a large measure
of praise and congratulation on the one hand, and much abuse and
disparagement on the other; and he may think himself singularly
fortunate if the censure does not exceed the praise. I need not dwell
upon the causes of this extravagance, but I may say there is no office
of any value in the country which is not desired and sought by many
persons equally meritorious and equally deserving. But as only one
person can be appointed to any one office, only one can be pleased,
while many are offended; unhappily, resentment follows
disappointment, and this resentment often finds expression in
disparagement and abuse of the successful man. As in most else I
have said, I borrow this reflection from my own experience.
My appointment as United States Marshal of the District of
Columbia, was in keeping with the rest of my life, as a freeman. It was
an innovation upon long established usage, and opposed to the
general current of sentiment in the community. It came upon the
people of the District as a gross surprise, and almost a punishment;
and provoked something like a scream—I will not say a yell—of
popular displeasure. As soon as I was named by President Hayes for
the place, efforts were made by members of the bar to defeat my
confirmation before the Senate. All sorts of reasons against my
appointment, but the true one, were given, and that was withheld more
from a sense of shame, than from a sense of justice. The
apprehension doubtless was, that if appointed marshal, I would
surround myself with colored deputies, colored bailiffs, colored
messengers, and pack the jury box with colored jurors; in a word,
Africanize the courts. But the most dreadful thing threatened, was a
colored man at the Executive Mansion in white kid gloves, sparrow-
tailed coat, patent leather boots, and alabaster cravat, performing the
ceremony—a very empty one—of introducing the aristocratic citizens
of the republic to the President of the United States. This was
something entirely too much to be borne; and men asked themselves
in view of it, to what is the world coming? and where will these things
stop? Dreadful! Dreadful!
It is creditable to the manliness of the American Senate, that it was
moved by none of these things, and that it lost no time in the matter of
my confirmation. I learn, and believe my information correct, that
foremost among those who supported my confirmation against the
objections made to it, was Hon. Roscoe Conkling of New York. His
speech in executive session is said by the senators who heard it, to
have been one of the most masterly and eloquent ever delivered on
the floor of the Senate; and this too I readily believe, for Mr. Conkling
possesses the ardor and fire of Henry Clay, the subtlety of Calhoun,
and the massive grandeur of Daniel Webster.
The effort to prevent my confirmation having failed, nothing could
be done but to wait for some overt act to justify my removal; and for
this my unfriends had not long to wait. In the course of one or two
months I was invited by a number of citizens of Baltimore to deliver a
lecture in that city in Douglass Hall—a building named in honor of
myself, and devoted to educational purposes. With this invitation I
complied, giving the same lecture which I had two years before
delivered in the city of Washington, and which was at the time
published in full in the newspapers, and very highly commended by
them. The subject of the lecture was, “Our National Capital,” and in it I
said many complimentary things of the city, which were as true as they
were complimentary. I spoke of what it had been in the past, what it
was at that time, and what I thought it destined to become in the future;
giving it all credit for its good points, and calling attention to some of its
ridiculous features. For this I got myself pretty roughly handled. The
newspapers worked themselves up to a frenzy of passion, and
committees were appointed to procure names to a petition to President
Hayes demanding my removal. The tide of popular feeling was so
violent, that I deemed it necessary to depart from my usual custom
when assailed, so far as to write the following explanatory letter, from
which the reader will be able to measure the extent and quality of my
offense:
“To the Editor of the Washington Evening Star:
“Sir:—You were mistaken in representing me as being off on a
lecturing tour, and, by implication, neglecting my duties as United
States Marshal of the District of Columbia. My absence from
Washington during two days was due to an invitation by the
managers to be present on the occasion of the inauguration of the
International Exhibition in Philadelphia.
“In complying with this invitation, I found myself in company
with other members of the government who went thither in
obedience to the call of patriotism and civilization. No one interest
of the Marshal’s office suffered by my temporary absence, as I had
seen to it that those upon whom the duties of the office devolved
were honest, capable, industrious, painstaking, and faithful. My
Deputy Marshal is a man every way qualified for his position, and
the citizens of Washington may rest assured that no unfaithful man
will be retained in any position under me. Of course I can have
nothing to say as to my own fitness for the position I hold. You
have a right to say what you please on that point; yet I think it
would be only fair and generous to wait for some dereliction of
duty on my part before I shall be adjudged as incompetent to fill
the place.
“You will allow me to say also that the attacks upon me on
account of the remarks alleged to have been made by me in
Baltimore, strike me as both malicious and silly. Washington is a
great city, not a village nor a hamlet, but the capital of a great
nation, and the manners and habits of its various classes are
proper subjects for presentation and criticism, and I very much
mistake if this great city can be thrown into a tempest of passion
by any humorous reflections I may take the liberty to utter. The city
is too great to be small, and I think it will laugh at the ridiculous
attempt to rouse it to a point of furious hostility to me for any thing
said in my Baltimore lecture.
“Had the reporters of that lecture been as careful to note what
I said in praise of Washington as what I said, if you please, in
disparagement of it, it would have been impossible to awaken any
feeling against me in this community for what I said. It is the
easiest thing in the world, as all editors know, to pervert the
meaning and give a one-sided impression of a whole speech by
simply giving isolated passages from the speech itself, without any
qualifying connections. It would hardly be imagined from anything
that has appeared here that I had said one word in that lecture in
honor of Washington, and yet the lecture itself, as a whole, was
decidedly in the interest of the national capital. I am not such a fool
as to decry a city in which I have invested my money and made
my permanent residence.
“After speaking of the power of the sentiment of patriotism I
held this language: ‘In the spirit of this noble sentiment I would
have the American people view the national capital. It is our
national center. It belongs to us; and whether it is mean or
majestic, whether arrayed in glory or covered with shame, we
cannot but share its character and its destiny. In the remotest
section of the republic, in the most distant parts of the globe, amid
the splendors of Europe or the wilds of Africa, we are still held and
firmly bound to this common center. Under the shadow of Bunker
Hill monument, in the peerless eloquence of his diction, I once
heard the great Daniel Webster give welcome to all American
citizens, assuring them that wherever else they might be
strangers, they were all at home there. The same boundless
welcome is given to all American citizens by Washington.
Elsewhere we may belong to individual States, but here we belong
to the whole United States. Elsewhere we may belong to a
section, but here we belong to a whole country, and the whole
country belongs to us. It is national territory, and the one place
where no American is an intruder or a carpet-bagger. The new
comer is not less at home than the old resident. Under its lofty
domes and stately pillars, as under the broad blue sky, all races
and colors of men stand upon a footing of common equality.
“‘The wealth and magnificence which elsewhere might oppress
the humble citizen has an opposite effect here. They are felt to be
a part of himself and serve to ennoble him in his own eyes. He is
an owner of the marble grandeur which he beholds about him,—as
much so as any of the forty millions of this great nation. Once in
his life every American who can should visit Washington: not as
the Mahometan to Mecca; not as the Catholic to Rome; not as the
Hebrew to Jerusalem, nor as the Chinaman to the Flowery
kingdom, but in the spirit of enlightened patriotism, knowing the
value of free institutions and how to perpetuate and maintain them.
“‘Washington should be contemplated not merely as an
assemblage of fine buildings; not merely as the chosen resort of
the wealth and fashion of the country; not merely as the honored
place where the statesmen of the nation assemble to shape the
policy and frame the laws; not merely as the point at which we are
most visibly touched by the outside world, and where the
diplomatic skill and talent of the old continent meet and match
themselves against those of the new, but as the national flag itself
—a glorious symbol of civil and religious liberty, leading the world
in the race of social science, civilization, and renown.’
“My lecture in Baltimore required more than an hour and a half
for its delivery, and every intelligent reader will see the difficulty of
doing justice to such a speech when it is abbreviated and
compressed into a half or three-quarters of a column. Such
abbreviation and condensation has been resorted to in this
instance. A few stray sentences, called out from their connections,
would be deprived of much of their harshness if presented in the
form and connection in which they were uttered; but I am taking up
too much space, and will close with the last paragraph of the
lecture, as delivered in Baltimore. ‘No city in the broad world has a
higher or more beneficent mission. Among all the great capitals of
the world it is preëminently the capital of free institutions. Its fall
would be a blow to freedom and progress throughout the world.
Let it stand then where it does now stand—where the father of his
country planted it, and where it has stood for more than half a
century; no longer sandwiched between two slave States; no
longer a contradiction to human progress; no longer the hot-bed of
slavery and the slave trade; no longer the home of the duelist, the
gambler, and the assassin; no longer the frantic partisan of one
section of the country against the other; no longer anchored to a
dark and semi-barbarous past, but a redeemed city, beautiful to
the eye and attractive to the heart, a bond of perpetual union, an
angel of peace on earth and good will to men, a common ground
upon which Americans of all races and colors, all sections, North
and South, may meet and shake hands, not over a chasm of
blood, but over a free, united, and progressive republic.’”
* * * * *