Michael Zgurovsky
Victor Sineglazov
Elena Chumachenko
Artificial Intelligence Systems Based on Hybrid Neural Networks
Theory and Applications
Studies in Computational Intelligence
Volume 904
Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the worldwide distribution, which enable both wide and rapid dissemination of research output.
The books of this series are indexed in Web of Science, EI-Compendex, DBLP, SCOPUS, Google Scholar and SpringerLink.
Michael Zgurovsky
Kyiv, Ukraine

Victor Sineglazov
Kyiv, Ukraine

Elena Chumachenko
Kyiv, Ukraine
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
According to the analyses of many of the world’s think tanks, over the next five years the total volume of the artificial intelligence (AI) technologies market will increase at least fourfold, with a compound annual growth rate (CAGR) exceeding 30% in the forecast period. It can be concluded that in the near future AI will become an integral part of people’s personal and professional activities.
In particular, in healthcare AI is increasingly used to identify regularities in patients’ medical data, which makes it possible to significantly improve diagnosis and to increase the effectiveness of treatment. In cybersecurity, AI provides protection against information threats that cannot be countered with traditional network security tools. In the current decade, steady growth of AI technologies is also observed in the aerospace and defense industries, energy, unmanned vehicles, robotics, ICT, banking and finance, the video game industry, retail, cognitive, neuromorphic, quantum and large-scale computing, and many other areas of human activity.
At the same time, the main limitations of the known AI methods and technologies stem from their insufficient training effectiveness, the difficulty of configuring and adapting them to a problem area under incomplete and inaccurate initial information, the difficulty of accumulating expert knowledge, and other factors. Thus, one of the urgent problems in the development of modern AI systems is the creation of integrated, hybrid systems based on deep learning. Unfortunately, today there is no methodology for designing hybrid neural network (HNN) topologies and no hybrid technologies for their structural-parametric synthesis using deep learning.
The main factor contributing to the development of such systems is the expanding use of neural networks (NN) for solving problems of recognition, classification, optimization and others. Applying other technologies to this class of problems leads to cumbersome symbolic calculations or to severe computational difficulties.
The monograph is devoted to an important direction in the development of artificial intelligence systems: the creation of a unified methodology for constructing hybrid neural networks (HNN) with the possibility of choosing the models of the artificial neurons. To increase the efficiency of solving the tasks at hand, it is proposed to gradually increase the structural complexity of these models and to use hybrid learning algorithms, including deep learning algorithms.
In recent years there has been active growth in the successful use of hybrid intelligent systems in various fields such as robotics, medical diagnostics, speech recognition, fault diagnosis of industrial equipment, monitoring (control of production processes) and applications in finance. The ability of neural networks to perform tasks that would be difficult to solve with other technologies, or that lead to difficult symbolic calculations, is now well recognized, and neural networks are often used as modules in intelligent hybrid systems.
The monograph is intended for specialists in artificial intelligence and information technology, and for students and graduate students in these areas. It can also be useful to a wide range of readers who solve applied problems and are interested in extending the functionality of existing systems with elements of artificial intelligence.
Chapter 1
Classification and Analysis of Topologies of Known Artificial Neurons and Neural Networks
s = Σ_{i=1}^{n} x_i w_i, (1.1)
Fig. 1.5 Family of ReLU modifications: a) a_i is fixed; b) a_i is learned from the data; c) a_ji is randomly generated from a given interval during training and remains constant during testing
y = f (s). (1.2)
The neurons of a network are arranged in layers. The input layer serves to receive the values of the input variables. Each hidden and output neuron is connected to all elements of the previous layer. When the values of the input variables are fed to the network inputs, they are processed in turn by the neurons of the intermediate and output layers. Each neuron computes its activation as the weighted sum of the outputs of the previous layer minus its threshold; the activation function is then applied to this value, and the result becomes the neuron’s output. After one pass over the entire network, the output values of the output-layer elements are taken as the output of the network as a whole.
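The forward pass described by Eqs. (1.1)–(1.2) can be sketched as follows. This is a minimal illustration, not the book’s code; the weights, thresholds and identity activation are illustrative values.

```python
def neuron_output(inputs, weights, threshold, activation):
    """Eqs. (1.1)-(1.2): weighted sum of the previous layer's outputs,
    minus the neuron's threshold, passed through the activation f."""
    s = sum(x * w for x, w in zip(inputs, weights)) - threshold
    return activation(s)

def layer_output(inputs, weight_rows, thresholds, activation):
    """Each neuron of a layer is connected to all elements of the
    previous layer, so every neuron sees the full input vector."""
    return [neuron_output(inputs, w, t, activation)
            for w, t in zip(weight_rows, thresholds)]

# Two inputs, two neurons, identity activation (illustrative numbers):
out = layer_output([1.0, 2.0], [[0.5, 0.25], [1.0, -1.0]], [0.0, 0.5],
                   lambda s: s)
# out == [1.0, -1.5]
```

Stacking such layers, with the outputs of one layer fed as inputs to the next, yields the multilayer network described above.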
Currently there are many different NN topologies, each with its own qualities. The next section gives a general classification of existing ANNs.
1.2 Classification of Activation Functions
The following functions can be used as activation functions: the sigmoid function (Fig. 1.2), the hyperbolic tangent (Fig. 1.3) and ReLU (Figs. 1.4 and 1.5).
The sigmoid function is expressed by the following formula:

σ(x) = 1/(1 + e^{−x}).
This function takes an arbitrary real number as input and outputs a real number between 0 and 1. In particular, large (in absolute value) negative numbers are mapped toward zero and large positive numbers toward one. Historically the sigmoid function found wide application because its output is conveniently interpreted as the activation level of a neuron, from no activation (0) to fully saturated activation (1).
Currently the sigmoid function has lost its former popularity and is used very rarely. It has two major drawbacks:

1. Saturation of the sigmoid function leads to vanishing gradients. An undesirable property of the sigmoid is that when the function saturates at either end (0 or 1), the gradient in these regions becomes close to zero. Recall that during backpropagation the local gradient is multiplied by the gradient of the overall objective; if the local gradient is very small, it effectively zeroes out the overall gradient. As a result, almost no signal passes through the neuron to its weights and, recursively, to its inputs. Furthermore, one must be very careful when initializing the weights of sigmoid neurons to prevent saturation: if the initial weight values are too large, most neurons pass into saturation, and as a result the network learns poorly.

2. The output of the sigmoid function is not zero-centered. This property is undesirable because neurons in subsequent layers receive values that are not centered around zero, which affects the dynamics of gradient descent. If the values received by a neuron are always positive (x > 0 in f = ω^T x + b), then during backpropagation the gradients with respect to the weights ω will all be either positive or all negative (depending on the gradient of the whole expression f). This can lead to unwanted zigzag dynamics in the weight updates. Note, however, that when these gradients are summed over a batch, the final weight updates can have different signs, which partly mitigates the problem. Thus the lack of centering is an inconvenience, but one with less serious consequences than the saturation problem.
The hyperbolic tangent (tanh) takes an arbitrary real number as input and outputs a real number in the range from −1 to 1. Like the sigmoid function, the hyperbolic tangent can saturate; however, unlike the sigmoid, its output is centered on zero. Thus, in practice, it is always better to use the hyperbolic tangent rather than the sigmoid function.
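The two drawbacks discussed above can be checked numerically. The helper functions below are a minimal sketch, not taken from the book:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # derivative: sigma'(x) = sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1.0 - s)

# Saturation: for large |x| the local gradient is nearly zero,
# so almost no signal flows back through the neuron.
print(sigmoid_grad(10.0))            # on the order of 4.5e-05

# Centering: sigmoid outputs lie in (0, 1), so they are never
# centered on zero; tanh outputs lie in (-1, 1) around zero.
print(sigmoid(0.0), math.tanh(0.0))  # 0.5 0.0
```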
ReLU
In recent years an activation function called the “rectifier” (by analogy with the half-wave rectifier in electrical engineering) has gained much popularity. Neurons with this activation function are called ReLUs (rectified linear units). ReLU has the formula f(x) = max(0, x) and implements a simple thresholding at zero (Fig. 1.4). Consider the positive and negative sides of ReLU.
Positive aspects
1. Computing the sigmoid function and the hyperbolic tangent requires computationally intensive operations, such as exponentiation, while ReLU can be implemented as a simple thresholding of the activation matrix at zero. In addition, ReLU is not prone to saturation.
2. Applying ReLU significantly increases the convergence rate of stochastic gradient descent (in some cases up to six times) compared with the sigmoid function and the hyperbolic tangent. It is believed that this is due to the linear nature of the function and its lack of saturation.
Disadvantages
Unfortunately, ReLU is not always sufficiently reliable, and during training a unit may “die”. For example, a large gradient flowing through a ReLU can lead to a weight update after which the neuron is never activated again. If this happens, then from that moment on the gradient passing through the neuron is always zero, and the neuron is irreversibly disabled. For example, with too high a learning rate it may turn out that up to 40% of the ReLUs are “dead” (i.e., never activated). This problem is addressed by choosing a proper learning rate.
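The “dead” ReLU effect can be seen directly from the gradient of f(x) = max(0, x). This is a minimal illustration, not the book’s code:

```python
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # gradient of max(0, x): 1 for x > 0, 0 otherwise
    return 1.0 if x > 0 else 0.0

# If a weight update drives the pre-activation negative for every input,
# the neuron outputs zero and passes back zero gradient from then on,
# so its weights can never change again.
pre_activations = [-3.0, -0.5, -7.1]
outputs = [relu(s) for s in pre_activations]       # all 0.0
gradients = [relu_grad(s) for s in pre_activations]  # all 0.0
```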
Currently there is a whole family of ReLU modifications. Next we consider their features (Fig. 1.5).
Leaky ReLU
Leaky ReLU (LReLU) is one of the attempts to solve the problem of dying ReLUs described above. While an ordinary ReLU gives zero output on the interval x < 0, LReLU has a small negative slope on this range (an angular coefficient of about 0.01). That is, LReLU has the form f(x) = ax for x < 0 and f(x) = x for x ≥ 0, where a is a small constant.
Parametric ReLU
For parametric ReLU (PReLU) the slope coefficient on the negative range is not set in advance but is learned from the data. The backpropagation and update process for PReLU is simple and similar to the corresponding process for traditional ReLU.
Randomized ReLU
For randomized ReLU (RReLU) the slope coefficient on the negative range is randomly generated from a given interval during training and remains constant during testing. In the Kaggle National Data Science Bowl (NDSB) competition, RReLU made it possible to reduce overfitting thanks to its inherent element of randomness [7–9].
The above activation functions have the following analytical forms:

f(x) = 1/(1 + e^{−x}), f(x) = tanh(x) = 2/(1 + e^{−2x}) − 1, f(x) = arctg(x),

f(x) = {0 for x < 0; x for x ≥ 0}, f(x) = {αx for x < 0; x for x ≥ 0}, f(x) = {α(e^x − 1) for x < 0; x for x ≥ 0}.
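The analytical forms above translate directly into code. The third piecewise form, α(e^x − 1) for x < 0, is commonly known as ELU; naming it so here is an assumption, since the text does not name it:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # the identity tanh(x) = 2/(1 + e^{-2x}) - 1 from the text
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def relu(x):
    return x if x >= 0 else 0.0

def leaky_relu(x, a=0.01):
    return x if x >= 0 else a * x

def elu(x, alpha=1.0):
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)

# The tanh identity agrees with the library function:
print(abs(tanh(0.7) - math.tanh(0.7)))  # ~0, up to rounding
```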
Artificial neurons constitute a neural network, and it is on their properties and connection options that the properties of the neural networks formed from them depend.
The classification scheme of neural networks [8, 9], with the authors’ modifications, is shown in Fig. 1.6. Neural networks can be classified on several grounds, the first being the number of layers.
1.4 Training of Neural Networks

Neural network training is a process during which the network parameters are tuned by modeling the environment in which the network is embedded [8]. The type of training is determined by the way these parameters are tuned. A general classification scheme of training methods is shown in Fig. 1.7.
Training includes the following sequence of events:
1. The neural network receives stimuli (inputs) from the environment.
2. As a result, the values of the free parameters of the neural network, such as the weights, change.
3. After this change of its internal structure, the network responds to the stimuli differently.
The above sequence is called a learning algorithm. No universal learning algorithm exists, because neural networks differ in architecture and in the tasks for which they are used. The same reason has given rise to a whole set of learning algorithms, each configuring the synaptic weights in its own way and each having both advantages and disadvantages.
There are two learning paradigms: supervised learning (with a teacher) and unsupervised learning (without a teacher).
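The sequence of events above can be sketched for the supervised case with a single linear neuron. The delta-rule update, the learning rate and the data here are illustrative assumptions, not the book’s algorithm:

```python
def train_step(weights, inputs, target, lr=0.1):
    """One pass of the loop: apply the stimulus, compare the response
    with the target, and adjust the free parameters (the weights)."""
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    return [w + lr * error * x for w, x in zip(weights, inputs)]

weights = [0.0, 0.0]
for _ in range(100):
    weights = train_step(weights, [1.0, 2.0], 1.0)

# After training, the network responds to the same stimulus differently:
response = sum(w * x for w, x in zip(weights, [1.0, 2.0]))
# response is now very close to the target 1.0
```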
1.5 Synthesis of the Converting Unit

For further work it is necessary to bring all data types to a single type on which the network will operate, i.e., to crisp data. In this work we propose a conversion unit that converts fuzzy, binary and linguistic variables into crisp variables [10]. The conversion unit consists of two elements and has the form shown in Fig. 1.8.

Fuzzy variables
The second element of the conversion unit is the defuzzifier, that is, a converter of fuzzy numbers into crisp numbers.
Example. Suppose we have a fuzzy variable X = {1/0.6, 2/0.9, 3/0.5}. Then the converted value is

X̄ = (Σ_{j=1}^{k} a_j X_ij) / (Σ_{j=1}^{k} a_j) = (1 × 0.6 + 2 × 0.9 + 3 × 0.5)/(0.6 + 0.9 + 0.5) = 3.9/2 = 1.95.
Binary data
Binary variables are handled by normalizing the data using the formula

X_norm = (x − x_min)(d_2 − d_1)/(x_max − x_min) + d_1,

where x is the binary value to be normalized; x_max is the maximum value of the input data; x_min is the minimum value of the input data. After this procedure the data are brought to the desired interval [d_1, d_2]. For example, for x = 1 and [d_1, d_2] = [25, 50]:

X_norm = (1 − 0)(50 − 25)/(1 − 0) + 25 = 50.
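A sketch of the normalization formula, reproducing the worked example (a binary 1 mapped into [25, 50]):

```python
def normalize(x, x_min, x_max, d1, d2):
    """Bring x from [x_min, x_max] to the desired interval [d1, d2]."""
    return (x - x_min) * (d2 - d1) / (x_max - x_min) + d1

# The example from the text: binary values with [d1, d2] = [25, 50]
print(normalize(1, 0, 1, 25, 50))  # 50.0
print(normalize(0, 0, 1, 25, 50))  # 25.0
```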
Linguistic variables. The first element of the conversion unit (the “fuzzifier”) uses a knowledge base to put each term of a linguistic variable into correspondence with a fuzzy variable:

X_i = A_1(x) if X_i = T_1, …, A_n(x) if X_i = T_n,

where T_1 is a term of the linguistic variable and A_1(x) is the fuzzy subset that matches the input term (given by an expert).
The second element (the defuzzifier) converts the fuzzy variables into crisp ones. This work uses the centroid conversion method.
Centroid method: the crisp value is found using one of the following formulas:

X̄ = (∫_a^b x μ(x) dx) / (∫_a^b μ(x) dx), (1.3)

where X̄ is the obtained crisp value; x is the fuzzy variable; μ(x) is the membership function of the given fuzzy variable; [a, b] is the domain of definition of the fuzzy variable;

X̄ = (Σ_{j=1}^{k} a_j X_ij) / (Σ_{j=1}^{k} a_j), (1.4)

where X̄ is the obtained crisp value; X_ij is an element of the fuzzy set; a_j is the value of the membership function of the corresponding element; k is the number of discrete points.
Example. Let a linguistic variable be given with the following set of terms: X = “water temperature”, X = {“Cold”, “Warm”, “Hot”}, and let a fuzzy variable be defined on the base set for each term.

1. Suppose the value “Hot” arrives at the input of the conversion unit. On the basis of the available knowledge base, the expert information puts this value into correspondence with the fuzzy variable A(“Hot”) = {60/0.5, 70/0.6, 80/0.9, 90/1, 100/1}.

2. The defuzzifier converts the resulting fuzzy variable into a crisp one by Eq. (1.4): X̄ = (60 × 0.5 + 70 × 0.6 + 80 × 0.9 + 90 × 1 + 100 × 1)/(0.5 + 0.6 + 0.9 + 1 + 1) = 334/4 = 83.5.
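A minimal sketch of the discrete defuzzifier of Eq. (1.4), checked against both worked examples; representing a fuzzy set as a dict mapping element → membership is an implementation choice, not from the book:

```python
def defuzzify(fuzzy_set):
    """Discrete centroid, Eq. (1.4): the crisp value is the mean of the
    elements X_ij weighted by their membership values a_j."""
    num = sum(a * x for x, a in fuzzy_set.items())
    den = sum(fuzzy_set.values())
    return num / den

# The fuzzy-variable example: {1/0.6, 2/0.9, 3/0.5} -> 1.95
print(defuzzify({1: 0.6, 2: 0.9, 3: 0.5}))
# The "Hot" example: {60/0.5, 70/0.6, 80/0.9, 90/1, 100/1} -> 83.5
print(defuzzify({60: 0.5, 70: 0.6, 80: 0.9, 90: 1.0, 100: 1.0}))
```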
The type of activation function largely determines the properties of an artificial neuron, and often the name of a neuron whose mathematical model is shown in Fig. 1.1 is determined by the type of activation function used, for example, a ReLU neuron.
The activation function is not the only thing that can be changed so that a neuron learns to approximate functions or properties. New discoveries in biology quickly showed that even the human brain is far from a simple structure of such neurons, so further study makes sense. Not all topologies of artificial neurons invented later are based on ideas inherited from nature; much is based on mathematical and logical reasoning or on experiments.
For further consideration of hybrid neural systems and their synthesis, we first examine the construction of, and attempt to classify, the types of neurons of interest in this work. Neurons can be roughly classified according to different criteria; we list only the most important ones: by type of computation (those that perform calculations, and those that merely carry the signal, amplifying or weakening it); by position of the activation function (activation functions on the synapses, or a single activation function after the adder); and by type of logic (crisp or fuzzy). Consider the main ones.
1.6.1 N-Neuron
ŷ_l = w_{l0}^{ij} + w_{l1}^{ij} x_i + w_{l2}^{ij} x_i^2 + w_{l3}^{ij} x_i x_j + w_{l4}^{ij} x_j^2 + w_{l5}^{ij} x_j,

where w_l^{ij}(N) = (w_{l0}^{ij}(N), w_{l1}^{ij}(N), w_{l2}^{ij}(N), w_{l3}^{ij}(N), w_{l4}^{ij}(N), w_{l5}^{ij}(N))^T is the weight vector and

φ^{ij}(x(k)) = (1, x_i(k), x_i^2(k), x_i(k) x_j(k), x_j^2(k), x_j(k))^T, k = 1, 2, …, N

is the regressor vector.
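The N-neuron output above is an inner product of the weight vector with the regressor φ. A minimal sketch with illustrative weights (the all-ones weight vector is an assumption for the example, not from the book):

```python
def n_neuron(w, xi, xj):
    """Quadratic N-neuron output: inner product of the weight vector
    with phi = [1, xi, xi^2, xi*xj, xj^2, xj]."""
    phi = [1.0, xi, xi * xi, xi * xj, xj * xj, xj]
    return sum(wk * pk for wk, pk in zip(w, phi))

# With all six weights equal to 1, the output is simply the sum of the
# regressor terms: 1 + 2 + 4 + 6 + 9 + 3 = 25 for (xi, xj) = (2, 3).
print(n_neuron([1, 1, 1, 1, 1, 1], 2.0, 3.0))  # 25.0
```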
1.6.2 Q-Neuron