Machine Learning with Python Downloaded from www.worldscientific.com by 103.178.218.4 on 08/16/23. Re-use and distribution is strictly not permitted, except for Open Access articles.

Introduction
We are constantly dealing with all kinds of problems every day, and would like to solve them so that timely decisions and actions can be made. We may notice that for many daily-life problems, our decisions are often made spontaneously and swiftly, without much conscious thought. This is because we have been constantly learning to solve such problems since we were born, and the solutions have already been encoded in the neurons of our brain. When we face a similar problem, our decision is spontaneous.
For many complicated problems, especially in science and engineering, one would need to think harder, and even conduct extensive research and study on the related issues, before a solution can be provided. What if we want to give spontaneous, reliable solutions to these types of problems as well? Some scientists and engineers may be able to do this for some problems, but not many. Those scientists have been intensively trained or educated in specially designed courses on dealing with complicated problems.
What if a layman would also like to be able to solve these challenging types of problems? One way is to go through a special learning process. The alternative may be through machine learning: developing a special computer model with a mechanism that can be trained to extract features from experience or data, so as to provide a reliable and instantaneous solution for a type of problem.
Problems in science and engineering are usually much more difficult to solve.
This is because we humans can only experience or observe the phenomena
2 Machine Learning with Python: Theory and Applications
associated with the problem. However, many phenomena are not easily
observable and have very complicated underlying logic. Scientists have been
trying to unveil the underlying logic by developing some theories (or laws or
principles) that can help to best describe these phenomena. These theories
are then formulated in the form of algebraic, differential, or integral system
equations that govern the key variables involved in the phenomena. The
next step is then to find a method that can solve these equations for these
variables varying in space and with time. The final step is to find a way
to validate the theory by observation and/or experiments to measure the
values of these variables. The validated theory is used to build models to
solve problems that exhibit the same phenomena. This type of model is referred to in this book as a physics-law-based model.
Note that there are many problems in nature, engineering, and society that are difficult to describe, and for which proper physics laws that solve them accurately and effectively are hard to find. Alternative means are thus needed.
There are also many problems (e.g., in biology and daily life) that do not yet have known governing physics laws, or for which the solutions to the governing equations are too expensive to obtain. For
this type of problem, on the other hand, we often have some data obtained
and accumulated through observations or measurements or historic records.
When the data are sufficiently large and of good quality, it is possible to
develop computer models to learn from these data. Such a model can then be
used to find a solution for this type of problem. This kind of computer model
is defined as a data-based model or machine learning model in this book.
Different types of effective artificial Neural Networks (NNs) with various
configurations have been developed and widely used for practical problems
in sciences and engineering, including multilayer perceptron (MLP) [6–9],
Convolutional Neural Networks (CNNs) [10–14], and Recurrent Neural
Networks (RNNs) [15–17]. TrumpetNets [8] and TubeNets [9, 18–20] were
also recently proposed by the author for creating two-way deepnets using
physics-law-based models as trainers, such as the FEM [1] and S-FEM [2].
The unique feature of TrumpetNets and TubeNets is their effectiveness for both forward and inverse problems [5], owing to their unique net architecture.
Most importantly, solutions to inverse problems can be analytically derived
in explicit formulae for the first time. This implies that when a data-based
model is built properly, one can find solutions very efficiently.
Machine learning essentially mimics the natural learning process occurring in biological brains, which can have a huge number of neurons. In terms of the usage of data, we may have three major categories: supervised, unsupervised, and reinforcement learning.
This book will cover most of these algorithms, but our focus will be
more on neural network-based models because rigorous theory and predictive
models can be established.
Machine learning is a very active area of research and development. New
models, including the so-called cognitive machine learning models, are being
studied. There are also techniques for manipulating various ML models. This
book, however, will not cover those topics.
1. Obtain the dataset for the problem, by your own means of data generation, by importing from other existing sources, or by computer synthesis.
2. Clean up the dataset if there are objectively known defects in it.
3. Determine the type of hypothesis for the model.
4. Develop or import a proper module for the needed algorithm for the problem. The learning ability (number of learning parameters) of the model and the size of the dataset shall be properly balanced, if possible. Otherwise, consider the use of regularization techniques.
5. Randomly initialize the learning parameters, or import known pre-trained learning parameters.
6. Perform the training with proper optimization techniques and monitoring measures.
7. Test the trained model using an independent test dataset. This can also
be done during the training.
8. Deploy the trained and tested model to the same type of problems.
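The steps above can be sketched end to end. Below is a minimal illustration, assuming a synthetic binary-classification dataset and a simple logistic-regression hypothesis trained by gradient descent; all names and settings here are illustrative, not from the book:

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: obtain (here: synthesize) a dataset of m data-points with p features.
m, p = 1000, 3
X = rng.normal(size=(m, p))
true_w = np.array([1.5, -2.0, 0.5])
y = (X @ true_w + 0.1 * rng.normal(size=m) > 0).astype(float)  # labels 0/1

# Step 7 preparation: hold out an independent test set.
split = int(0.8 * m)
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# Steps 3 and 5: hypothesis = logistic model; randomly initialize parameters.
w = rng.normal(scale=0.01, size=p)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 6: train with gradient descent on the cross-entropy loss.
lr = 0.5
for _ in range(200):
    pred = sigmoid(X_train @ w + b)
    grad_w = X_train.T @ (pred - y_train) / len(y_train)
    grad_b = np.mean(pred - y_train)
    w -= lr * grad_w
    b -= lr * grad_b

# Step 7: test on the independent dataset.
accuracy = np.mean((sigmoid(X_test @ w + b) > 0.5) == y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Step 2 is skipped here because the synthetic data are clean; with real data it would come before the split. Note how step 7 evaluates the model on data never seen during training.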
We shall now define variables and spaces often used in this book, for ease of discussion. We first state that this book deals only with real numbers, unless otherwise specified when geometrically closed operations are required. Let us introduce two toy examples.
Suppose we measure three features of randomly selected fruits of these two types from the market, and create a dataset with 8,000 paired data-points. Each data-point records the values
of these three features and pairs with two labels (ground truth) of yes-or-no
for apple or yes-or-no for orange. Such a dataset is also called a labeled dataset, and it is used for model training.
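In code, such a labeled dataset is naturally stored as a pair of arrays: one holding the feature values and one holding the ground-truth labels. A small sketch with synthetic numbers (the feature values and the labeling rule below are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

m = 8000  # number of data-points (fruits sampled)

# Feature matrix: each row is one data-point with 3 measured feature values.
X = rng.uniform(size=(m, 3))

# Label matrix: each row pairs the data-point with two yes-or-no labels,
# [is_apple, is_orange], encoded as 1/0 (the "ground truth").
is_apple = (X[:, 0] > 0.5).astype(int)   # illustrative rule, not real data
Y = np.stack([is_apple, 1 - is_apple], axis=1)

print(X.shape, Y.shape)  # (8000, 3) (8000, 2)
print(X[0], Y[0])        # first feature vector and its paired labels
```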
With an understanding of these two typical types of examples, it should
be easy to extend this to many other types of problems for which a machine
Figure 1.1: Data-points in a 2D feature space X^2 with blue vectors xi = [xi1, xi2], and the same data-points in the augmented feature space X̄^2, called the affine space, with red vectors xi = [1, xi1, xi2]; i = 1, 2, 3, 4.
An affine space X̄^p can be created by first spanning X^p by one dimension to X^(p+1), via the introduction of a new variable x0, as

[x0, x1, x2, . . . , xp] (1.5)

and then setting x0 = 1. The 4 red vectors shown in Fig. 1.1 live in an affine space X̄^2.
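This construction is a one-line operation in code: prepend a column of ones to the matrix of data-points. A small numpy sketch (the four vectors below are illustrative, not the ones in Fig. 1.1):

```python
import numpy as np

# Four data-points in the feature space X^2 (p = 2), one per row.
X = np.array([[1.0, 2.0],
              [3.0, 1.0],
              [2.0, 3.0],
              [4.0, 2.0]])

# Span X^2 to X^3 by introducing x0, then set x0 = 1:
# each vector [x1, x2] becomes [1, x1, x2] in the affine space.
X_affine = np.hstack([np.ones((X.shape[0], 1)), X])

# The tips of all augmented vectors lie in the hyperplane x0 = 1.
print(X_affine[0])  # [1. 1. 2.]
```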
Note that the affine space X̄^p is neither X^(p+1) nor X^p, and is quite special. A vector in X̄^p is in X^(p+1), but the tip of the vector is confined to the hyperplane x0 = 1. For convenience of discussion in this book, we say that an affine space has a pseudo-dimension of p + 1. Its true dimension is p.
y = [y1, y2, . . . , yk], y ∈ Y^k ∈ R^k (1.6)
For toy example-1, yij (i = 1, 2, . . . , 8000; j = 1, 2) are 8,000 pairs of real numbers in the 2D space Y^2. For toy example-2, each label, yi1 or yi2, has a value of 0 or 1 (or −1 or 1), but the labels can still be viewed as living in Y^2.
These labels yi (i = 1, 2, . . . , m) can be stacked to form a label set Y ∈ Yk ,
although we may not really do so in computation.
The learning parameters are variables that live in a hypothesis space denoted as W^P over the real numbers. Learning parameters are also called training or trainable parameters; we use these terms interchangeably. The learning parameters include the weights and biases in each and all of the layers. The hat above w indicates that it collects all the learning parameters:

ŵ = [W0, W1, . . . , WP] ∈ W^P (1.8)
We will discuss in later chapters the details about WP for various models
including estimation of the dimension P .
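For a standard MLP, the dimension P can be counted directly from the layer sizes: each fully connected layer contributes a weight matrix plus a bias vector. A minimal sketch, assuming fully connected layers (the layer sizes below are illustrative):

```python
def count_parameters(layer_sizes):
    """Total number of learning parameters (weights + biases) of an MLP."""
    P = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        P += n_in * n_out  # weight matrix entries for this layer
        P += n_out         # bias entries for this layer
    return P

# e.g., 3 input features, one hidden layer of 5 neurons, 2 outputs:
print(count_parameters([3, 5, 2]))  # (3*5 + 5) + (5*2 + 2) = 32
```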
It reads that the ML model M uses a given dataset X together with the labels Y to train its learning parameters ŵ, and produces a map (or giant function) that makes a prediction in the label space for any point in the feature space.
The ML model shown in Eq. (1.9) is in fact a data-parameter converter: it converts the information contained in the dataset into the learning parameters of the model.
Note also that there are ML models for discontinuous feature variables, and the learning parameters may not need to be continuous. Such methods are often developed based on proper intuitive rules and techniques, and we will discuss some of them. The concepts of spaces may not be directly applicable to such methods, but they can often help.
Data are the key to any data-based model. Many types of data are available for different types of problems, and one may make use of any of them.
Note that the quality and the sampling domain of the dataset play important roles in training reliable machine learning models. Use of a trained model beyond the data sampling domain requires special caution, because the model can go wrong unexpectedly and hence be very dangerous.
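This danger can be seen even with the simplest fitted models. A sketch, assuming a polynomial model fitted to samples of a sine function (the function, degree, and domain are chosen purely for illustration):

```python
import numpy as np

# "Dataset" sampled only on the domain [0, 3].
x_train = np.linspace(0.0, 3.0, 50)
y_train = np.sin(x_train)

# Fit a degree-5 polynomial (the "trained model").
coeffs = np.polyfit(x_train, y_train, deg=5)
model = np.poly1d(coeffs)

# Inside the sampling domain, the model is accurate...
err_in = abs(model(1.5) - np.sin(1.5))

# ...but beyond it, the prediction can go wrong unexpectedly.
err_out = abs(model(6.0) - np.sin(6.0))

print(f"error at x=1.5 (inside the domain):  {err_in:.2e}")
print(f"error at x=6.0 (outside the domain): {err_out:.2e}")
```

Inside [0, 3] the fit is accurate; a few units beyond the sampled domain, the prediction error typically grows by orders of magnitude.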
Data-based Models
Physics-law-based models require no training data, but are slow in prediction. This is because the strategies for physics-law-based models and those for data-based models are quite different. ML models
use datasets to train the parameters, but physics-law-based models use laws
to determine the parameters.
However, at the detailed computational methodology level, many tech-
niques used in both models are in fact the same or quite similar. For example,
when we express a variable as a function of other variables, both models
use basis functions (polynomial, or radial basis function (RBF), or both).
In constructing objective functions, the least squares error formulation is
used in both. In addition, the regularization methods used are also quite
similar. Therefore, one should not study these models in total isolation. The
ideas and techniques may be deeply connected and mutually adaptable. This
realization can be useful in better understanding and further development
of more effective methods for both models, by exchanging the ideas and
techniques from one to another. In general, for physics-law-based computa-
tional methods, such as the general form of meshfree methods, we understand
reasonably well why and how a method works in theory [3]. Therefore, we are
quite confident about what we are going to obtain when a method is used
for a problem. For data-based methods, however, this is not always true. It is therefore important to develop fundamental theories for data-based methods. The author has made some attempts [21] to reveal the relationship
between physics-law-based and data-based models, and to establish some
theoretical foundation for data-based models. In this book, we will try to
discuss the similarities and differences, when a computational method is
used in both models.
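To make the shared machinery concrete, here is a sketch of a regularized least-squares fit using a Gaussian RBF basis, the same ingredients (basis functions, least squares, regularization) that appear in both model families; all parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Data to approximate (could come from measurements or a physics solver).
x = np.linspace(0.0, 1.0, 40)
y = np.cos(2 * np.pi * x) + 0.05 * rng.normal(size=x.size)

# RBF basis: Gaussians centered on a grid of nodes.
centers = np.linspace(0.0, 1.0, 10)
sigma = 0.15
Phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * sigma**2))

# Regularized least squares: minimize ||Phi c - y||^2 + lam * ||c||^2.
lam = 1e-4
A = Phi.T @ Phi + lam * np.eye(len(centers))
c = np.linalg.solve(A, Phi.T @ y)

# The fitted model is a weighted sum of the basis functions.
y_fit = Phi @ c
rmse = np.sqrt(np.mean((y_fit - y) ** 2))
print(f"RMSE of the regularized RBF fit: {rmse:.4f}")
```

Whether the coefficients c are determined by measured data (as here) or by enforcing a governing equation, the basis expansion, the least-squares objective, and the regularization term are the same computational pieces.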
Abundant materials on machine learning are already available in the open literature, and there is no need to simply reproduce them here. In the opinion of the author, the best learning approach is to master the most essential basics and build a strong foundation, which is sufficient for learning other related topics, methods, and algorithms. Most importantly, readers with strong fundamentals can even develop innovative and more effective machine learning models for their own problems. Based on this philosophy, the book highlights materials that cannot be found easily, or in complete form, in the open literature, many of which are outcomes of the author's studies in the past years.
The author has made a substantial effort to write Python codes demonstrating the essential and difficult concepts and formulations, which allows readers to comprehend each chapter more quickly. Based on the learning experience of the author, this can make the learning more effective.
The chapters of this book are written, in principle, to be readable independently, at the cost of some duplication. Necessary cross-references between chapters are kept to a minimum.
The book is written for beginners interested in learning the basics of machine learning, including university students who have completed their first year, graduate students, researchers, and professionals in engineering and the sciences. Engineers and practitioners who want to learn to build machine learning models may also find the book useful. Basic knowledge of college mathematics is helpful for reading this book smoothly.
This book may be used as a textbook for undergraduates (3rd year or senior) and graduate students. If this book is adopted as a textbook, the instructor may contact the author (liugr100@gmail.com) directly for homework, course projects, and solutions.
Machine learning is still a fast-developing area of research. There still exist many challenging problems, which offer ample opportunities for developing new methods and algorithms. Currently, it is a hot topic of research and applications. Different techniques are being developed every day, and
new businesses are formed constantly. It is the hope of the author that this
book can be helpful in studying existing and developing machine learning
models.
The book has been written using Jupyter Notebook with codes. Readers who purchased the book may contact the author directly (liugr100@gmail.com) to request a softcopy of the book with codes (which may be updated), free for academic use after registration. The conditions for use of the book and the codes developed by the author, in both hardcopy and softcopy, are as follows:
1. Users are entirely at their own risk when using any part of the codes and techniques.
2. The book and codes are only for your own use. You are not allowed to distribute them further without permission from the author of the code.
3. There will be no user support.
4. Proper reference and acknowledgment must be given for the use of the
book, codes, ideas, and techniques.
Note that the handcrafted codes provided in the book are mainly for demonstration purposes, and they are often run with various packages/modules. Therefore, care is needed when
using these codes, because the behavior of the codes often depends on the
versions of Python and all these packages/modules. When the codes do not
run as expected, version mismatch could be one of the problems. When this
book was written, the versions of Python and some of the packages/modules
were as follows:
For example,

import keras
print('keras version', keras.__version__)
import tensorflow as tf
print('tensorflow version', tf.version.VERSION)
If the version is indeed an issue, one would need to either modify the code to fit the version or install the correct version in your system, perhaps by creating a separate environment. It is very useful to search the web using the error message; solutions or leads can often be found. This is the approach the author often takes when encountering an issue in running a code. Finally, this book has used materials and information available on the web, with links. These links may change over time because of the nature of the web. The most effective way (and the one often used by the author) of dealing with this matter is to search online using keywords, if a link is lost.
References
[1] G.R. Liu and S.S. Quek, The Finite Element Method: A Practical Course,
Butterworth-Heinemann, London, 2013.
[2] G.R. Liu and T.T. Nguyen, Smoothed Finite Element Methods, Taylor and Francis
Group, New York, 2010.
[3] G.R. Liu, Mesh Free Methods: Moving Beyond the Finite Element Method, Taylor
and Francis Group, New York, 2010.
[4] G.R. Liu and Gui-Yong Zhang, Smoothed Point Interpolation Methods: G Space
Theory and Weakened Weak Forms, World Scientific, New Jersey, 2013.
[5] G.R. Liu and X. Han, Computational Inverse Techniques in Nondestructive Evalua-
tion, Taylor and Francis Group, New York, 2003.
[6] F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, New York, 1962. https://books.google.com/books?id=7FhRAAAAMAAJ.
[7] D.E. Rumelhart, G.E. Hinton and R.J. Williams, Learning Internal Representations
by Error Propagation, 1986.
[8] G.R. Liu, FEA-AI and AI-AI: Two-way deepnets for real-time computations for both
forward and inverse mechanics problems, International Journal of Computational
Methods, 16(08), 1950045, 2019.
[9] G.R. Liu, S.Y. Duan, Z.M. Zhang et al., TubeNet: A special trumpetnet for explicit
solutions to inverse problems, International Journal of Computational Methods,
18(01), 2050030, 2021. https://doi.org/10.1142/S0219876220500309.