Professional Documents
Culture Documents
Full download Introduction to Algorithms for Data Mining and Machine Learning Xin-She Yang file pdf all chapter on 2024
Full download Introduction to Algorithms for Data Mining and Machine Learning Xin-She Yang file pdf all chapter on 2024
https://ebookmass.com/product/introduction-to-algorithms-for-
data-mining-and-machine-learning-yang/
https://ebookmass.com/product/fundamentals-of-machine-learning-
for-predictive-data-analytics-algorithms/
https://ebookmass.com/product/machine-learning-for-signal-
processing-data-science-algorithms-and-computational-statistics-
max-a-little/
https://ebookmass.com/product/big-data-analytics-introduction-to-
hadoop-spark-and-machine-learning-raj-kamal/
Machine Learning for Biometrics: Concepts, Algorithms
and Applications (Cognitive Data Science in Sustainable
Computing) Partha Pratim Sarangi
https://ebookmass.com/product/machine-learning-for-biometrics-
concepts-algorithms-and-applications-cognitive-data-science-in-
sustainable-computing-partha-pratim-sarangi/
https://ebookmass.com/product/machine-learning-algorithms-for-
signal-and-image-processing-suman-lata-tripathi/
https://ebookmass.com/product/learn-data-mining-through-excel-a-
step-by-step-approach-for-understanding-machine-learning-
methods-2nd-edition-hong-zhou/
https://ebookmass.com/product/absolute-beginners-guide-to-
algorithms-a-practical-introduction-to-data-structures-and-
algorithms-in-javascript-for-true-epub-kirupa-chinnathambi/
https://ebookmass.com/product/absolute-beginners-guide-to-
algorithms-a-practical-introduction-to-data-structures-and-
algorithms-in-javascript-kirupa-chinnathambi/
Xin-She Yang
Introduction to
Algorithms for Data Mining
and Machine Learning
Introduction to Algorithms for Data Mining and
Machine Learning
This page intentionally left blank
Introduction to
Algorithms for Data
Mining and Machine
Learning
Xin-She Yang
Middlesex University
School of Science and Technology
London, United Kingdom
Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2019 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying, recording, or any information storage and retrieval system, without
permission in writing from the publisher. Details on how to seek permission, further information about the
Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center
and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other
than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our
understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using
any information, methods, compounds, or experiments described herein. In using such information or methods
they should be mindful of their own safety and the safety of others, including parties for whom they have a
professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability
for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or
from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
ISBN: 978-0-12-817216-2
1 Introduction to optimization 1
1.1 Algorithms 1
1.1.1 Essence of an algorithm 1
1.1.2 Issues with algorithms 3
1.1.3 Types of algorithms 3
1.2 Optimization 4
1.2.1 A simple example 4
1.2.2 General formulation of optimization 7
1.2.3 Feasible solution 9
1.2.4 Optimality criteria 10
1.3 Unconstrained optimization 10
1.3.1 Univariate functions 11
1.3.2 Multivariate functions 12
1.4 Nonlinear constrained optimization 14
1.4.1 Penalty method 15
1.4.2 Lagrange multipliers 16
1.4.3 Karush–Kuhn–Tucker conditions 17
1.5 Notes on software 18
2 Mathematical foundations 19
2.1 Convexity 20
2.1.1 Linear and affine functions 20
2.1.2 Convex functions 21
2.1.3 Mathematical operations on convex functions 22
2.2 Computational complexity 22
2.2.1 Time and space complexity 24
2.2.2 Complexity of algorithms 25
2.3 Norms and regularization 26
2.3.1 Norms 26
2.3.2 Regularization 28
2.4 Probability distributions 29
2.4.1 Random variables 29
2.4.2 Probability distributions 30
vi Contents
3 Optimization algorithms 45
3.1 Gradient-based methods 45
3.1.1 Newton’s method 45
3.1.2 Newton’s method for multivariate functions 47
3.1.3 Line search 48
3.2 Variants of gradient-based methods 49
3.2.1 Stochastic gradient descent 50
3.2.2 Subgradient method 51
3.2.3 Conjugate gradient method 52
3.3 Optimizers in deep learning 53
3.4 Gradient-free methods 56
3.5 Evolutionary algorithms and swarm intelligence 58
3.5.1 Genetic algorithm 58
3.5.2 Differential evolution 60
3.5.3 Particle swarm optimization 61
3.5.4 Bat algorithm 61
3.5.5 Firefly algorithm 62
3.5.6 Cuckoo search 62
3.5.7 Flower pollination algorithm 63
3.6 Notes on software 64
Bibliography 163
Index 171
About the author
Xin-She Yang obtained his PhD in Applied Mathematics from the University of Ox-
ford. He then worked at Cambridge University and National Physical Laboratory (UK)
as a Senior Research Scientist. Now he is Reader at Middlesex University London, and
an elected Bye-Fellow at Cambridge University.
He is also the IEEE Computer Intelligence Society (CIS) Chair for the Task Force
on Business Intelligence and Knowledge Management, Director of the International
Consortium for Optimization and Modelling in Science and Industry (iCOMSI), and
an Editor of Springer’s Book Series Springer Tracts in Nature-Inspired Computing
(STNIC).
With more than 20 years of research and teaching experience, he has authored
10 books and edited more than 15 books. He published more than 200 research pa-
pers in international peer-reviewed journals and conference proceedings with more
than 36 800 citations. He has been on the prestigious lists of Clarivate Analytics and
Web of Science highly cited researchers in 2016, 2017, and 2018. He serves on the
Editorial Boards of many international journals including International Journal of
Bio-Inspired Computation, Elsevier’s Journal of Computational Science (JoCS), In-
ternational Journal of Parallel, Emergent and Distributed Systems, and International
Journal of Computer Mathematics. He is also the Editor-in-Chief of the International
Journal of Mathematical Modelling and Numerical Optimisation.
This page intentionally left blank
Preface
Both data mining and machine learning are becoming popular subjects for university
courses and industrial applications. This popularity is partly driven by the Internet and
social media because they generate a huge amount of data every day, and the under-
standing of such big data requires sophisticated data mining techniques. In addition,
many applications such as facial recognition and robotics have extensively used ma-
chine learning algorithms, leading to the increasing popularity of artificial intelligence.
From a more general perspective, both data mining and machine learning are closely
related to optimization. After all, in many applications, we have to minimize costs,
errors, energy consumption, and environment impact and to maximize sustainabil-
ity, productivity, and efficiency. Many problems in data mining and machine learning
are usually formulated as optimization problems so that they can be solved by opti-
mization algorithms. Therefore, optimization techniques are closely related to many
techniques in data mining and machine learning.
Courses on data mining, machine learning, and optimization are often compulsory
for students, studying computer science, management science, engineering design, op-
erations research, data science, finance, and economics. All students have to develop
a certain level of data modeling skills so that they can process and interpret data for
classification, clustering, curve-fitting, and predictions. They should also be familiar
with machine learning techniques that are closely related to data mining so as to carry
out problem solving in many real-world applications. This book provides an introduc-
tion to all the major topics for such courses, covering the essential ideas of all key
algorithms and techniques for data mining, machine learning, and optimization.
Though there are over a dozen good books on such topics, most of these books are
either too specialized with specific readership or too lengthy (often over 500 pages).
This book fills in the gap with a compact and concise approach by focusing on the key
concepts, algorithms, and techniques at an introductory level. The main approach of
this book is informal, theorem-free, and practical. By using an informal approach all
fundamental topics required for data mining and machine learning are covered, and
the readers can gain such basic knowledge of all important algorithms with a focus
on their key ideas, without worrying about any tedious, rigorous mathematical proofs.
In addition, the practical approach provides about 30 worked examples in this book
so that the readers can see how each step of the algorithms and techniques works.
Thus, the readers can build their understanding and confidence gradually and in a
step-by-step manner. Furthermore, with the minimal requirements of basic high school
mathematics and some basic calculus, such an informal and practical style can also
enable the readers to learn the contents by self-study and at their own pace.
This book is suitable for undergraduates and graduates to rapidly develop all the
fundamental knowledge of data mining, machine learning, and optimization. It can
xii Preface
also be used by students and researchers as a reference to review and refresh their
knowledge in data mining, machine learning, optimization, computer science, and data
science.
Xin-She Yang
January 2019 in London
Acknowledgments
I would like to thank all my students and colleagues who have given valuable feedback
and comments on some of the contents and examples of this book. I also would like to
thank my editors, J. Scott Bentley and Michael Lutz, and the staff at Elsevier for their
professionalism. Last but not least, I thank my family for all the help and support.
Xin-She Yang
January 2019
This page intentionally left blank
Introduction to optimization
Contents
1.1 Algorithms
1 1
1.1.1 Essence of an algorithm 1
1.1.2 Issues with algorithms 3
1.1.3 Types of algorithms 3
1.2 Optimization 4
1.2.1 A simple example 4
1.2.2 General formulation of optimization 7
1.2.3 Feasible solution 9
1.2.4 Optimality criteria 10
1.3 Unconstrained optimization 10
1.3.1 Univariate functions 11
1.3.2 Multivariate functions 12
1.4 Nonlinear constrained optimization 14
1.4.1 Penalty method 15
1.4.2 Lagrange multipliers 16
1.4.3 Karush–Kuhn–Tucker conditions 17
1.5 Notes on software 18
This book introduces the most fundamentals and algorithms related to optimization,
data mining, and machine learning. The main requirement is some understanding of
high-school mathematics and basic calculus; however, we will review and introduce
some of the mathematical foundations in the first two chapters.
1.1 Algorithms
An algorithm is an iterative, step-by-step procedure for computation. The detailed
procedure can be a simple description, an equation, or a series of descriptions in
combination with equations. Finding the roots of a polynomial, checking if a natu-
ral number is a prime number, and generating random numbers are all algorithms.
Example 1
As an example, if x0 = 1 and a = 4, then we have
1 4
x1 = (1 + ) = 2.5. (1.2)
2 1
Similarly, we have
1 4 1 4
x2 = (2.5 + ) = 2.05, x3 = (2.05 + ) ≈ 2.0061, (1.3)
2 2.5 2 2.05
x4 ≈ 2.00000927, (1.4)
√
which is very close to the true value of 4 = 2. The accuracy of this iterative formula or algorithm
is high because it achieves the accuracy of five decimal places after four iterations.
The convergence is very quick if we start from different initial values such as
x0 = 10 and even x0 = 100. However, for an obvious reason, we cannot start with
x0 = 0 due to division by
√zero.
Find the root of x = a is equivalent to solving the equation
f (x) = x 2 − a = 0, (1.5)
which is again equivalent to finding the roots of a polynomial f (x). We know that
Newton’s root-finding algorithm can be written as
f (xk )
xk+1 = xk − , (1.6)
f (xk )
where f (x) is the first derivative or gradient of f (x). In this case, we have
f (x) = 2x. Thus, Newton’s formula becomes
(xk2 − a)
xk+1 = xk − , (1.7)
2xk
1.2 Optimization
V = πr 2 h. (1.12)
There are only two design variables r and h and one objective function S to be min-
imized. Obviously, if there is no capacity constraint, then we can choose not to build
the container, and then the cost of materials is zero for r = 0 and h = 0. However,
Introduction to optimization 5
the constraint requirement means that we have to build a container with fixed volume
V0 = πr 2 h = 10 m3 . Therefore, this optimization problem can be written as
πr 2 h = V0 = 10. (1.14)
To solve this problem, we can first try to use the equality constraint to reduce the
number of design variables by solving h. So we have
V0
h= . (1.15)
πr 2
Substituting it into (1.13), we get
S = 2πr 2 + 2πrh
V0 2V0
= 2πr 2 + 2πr 2 = 2πr 2 + . (1.16)
πr r
This is a univariate function. From basic calculus we know that the minimum or max-
imum can occur at the stationary point, where the first derivative is zero, that is,
dS 2V0
= 4πr − 2 = 0, (1.17)
dr r
which gives
V0 3 V0
r3 = , or r = . (1.18)
2π 2π
Thus, the height is
h V0 /(πr 2 ) V0
= = 3 = 2. (1.19)
r r πr
Another random document with
no related content on Scribd:
The Project Gutenberg eBook of Le Maître du
Navire
This ebook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this ebook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.
Language: French
LOUIS CHADOURNE
LE MAITRE
DU NAVIRE
L’ÉDITION FRANÇAISE ILLUSTRÉE
30, RUE DE PROVENCE — PARIS
1919
DU MÊME AUTEUR
En préparation :
(Sept de ces exemplaires, — les numéros 1 à 7, — n’ont pas été mis dans le
commerce.)
PARIS
L’ÉDITION FRANÇAISE ILLUSTRÉE
30, Rue de Provence, 30
1919
AVANT-PROPOS
négligeable
A L’ANCIENNE MODE
Lecteur,
CHAPITRE PREMIER
L’homme aux lunettes vertes.
Euripide.