Introduction to
Algorithms for Data
Mining and Machine
Learning
Xin-She Yang
Middlesex University
School of Science and Technology
London, United Kingdom
Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2019 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying, recording, or any information storage and retrieval system, without
permission in writing from the publisher. Details on how to seek permission, further information about the
Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center
and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other
than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our
understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using
any information, methods, compounds, or experiments described herein. In using such information or methods
they should be mindful of their own safety and the safety of others, including parties for whom they have a
professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability
for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or
from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
ISBN: 978-0-12-817216-2
1 Introduction to optimization 1
1.1 Algorithms 1
1.1.1 Essence of an algorithm 1
1.1.2 Issues with algorithms 3
1.1.3 Types of algorithms 3
1.2 Optimization 4
1.2.1 A simple example 4
1.2.2 General formulation of optimization 7
1.2.3 Feasible solution 9
1.2.4 Optimality criteria 10
1.3 Unconstrained optimization 10
1.3.1 Univariate functions 11
1.3.2 Multivariate functions 12
1.4 Nonlinear constrained optimization 14
1.4.1 Penalty method 15
1.4.2 Lagrange multipliers 16
1.4.3 Karush–Kuhn–Tucker conditions 17
1.5 Notes on software 18
2 Mathematical foundations 19
2.1 Convexity 20
2.1.1 Linear and affine functions 20
2.1.2 Convex functions 21
2.1.3 Mathematical operations on convex functions 22
2.2 Computational complexity 22
2.2.1 Time and space complexity 24
2.2.2 Complexity of algorithms 25
2.3 Norms and regularization 26
2.3.1 Norms 26
2.3.2 Regularization 28
2.4 Probability distributions 29
2.4.1 Random variables 29
2.4.2 Probability distributions 30
3 Optimization algorithms 45
3.1 Gradient-based methods 45
3.1.1 Newton’s method 45
3.1.2 Newton’s method for multivariate functions 47
3.1.3 Line search 48
3.2 Variants of gradient-based methods 49
3.2.1 Stochastic gradient descent 50
3.2.2 Subgradient method 51
3.2.3 Conjugate gradient method 52
3.3 Optimizers in deep learning 53
3.4 Gradient-free methods 56
3.5 Evolutionary algorithms and swarm intelligence 58
3.5.1 Genetic algorithm 58
3.5.2 Differential evolution 60
3.5.3 Particle swarm optimization 61
3.5.4 Bat algorithm 61
3.5.5 Firefly algorithm 62
3.5.6 Cuckoo search 62
3.5.7 Flower pollination algorithm 63
3.6 Notes on software 64
Bibliography 163
Index 171
About the author
Xin-She Yang obtained his PhD in Applied Mathematics from the University of Ox-
ford. He then worked at Cambridge University and National Physical Laboratory (UK)
as a Senior Research Scientist. He is now a Reader at Middlesex University London and an elected Bye-Fellow at Cambridge University.
He is also the Chair of the IEEE Computational Intelligence Society (CIS) Task Force on Business Intelligence and Knowledge Management, Director of the International Consortium for Optimization and Modelling in Science and Industry (iCOMSI), and an Editor of the Springer book series Springer Tracts in Nature-Inspired Computing (STNIC).
With more than 20 years of research and teaching experience, he has authored
10 books and edited more than 15 books. He has published more than 200 research papers in international peer-reviewed journals and conference proceedings, with more than 36,800 citations. He has been on the prestigious lists of Clarivate Analytics and
Web of Science highly cited researchers in 2016, 2017, and 2018. He serves on the
Editorial Boards of many international journals including International Journal of
Bio-Inspired Computation, Elsevier’s Journal of Computational Science (JoCS), In-
ternational Journal of Parallel, Emergent and Distributed Systems, and International
Journal of Computer Mathematics. He is also the Editor-in-Chief of the International
Journal of Mathematical Modelling and Numerical Optimisation.
Preface
Both data mining and machine learning are becoming popular subjects for university
courses and industrial applications. This popularity is partly driven by the Internet and
social media because they generate a huge amount of data every day, and the under-
standing of such big data requires sophisticated data mining techniques. In addition,
many applications such as facial recognition and robotics have extensively used ma-
chine learning algorithms, leading to the increasing popularity of artificial intelligence.
From a more general perspective, both data mining and machine learning are closely
related to optimization. After all, in many applications, we have to minimize costs, errors, energy consumption, and environmental impact and to maximize sustainability, productivity, and efficiency. Many problems in data mining and machine learning
are usually formulated as optimization problems so that they can be solved by opti-
mization algorithms. Therefore, optimization techniques are closely related to many
techniques in data mining and machine learning.
Courses on data mining, machine learning, and optimization are often compulsory for students studying computer science, management science, engineering design, operations research, data science, finance, and economics. All students have to develop
a certain level of data modeling skills so that they can process and interpret data for
classification, clustering, curve-fitting, and predictions. They should also be familiar
with machine learning techniques that are closely related to data mining so as to carry
out problem solving in many real-world applications. This book provides an introduc-
tion to all the major topics for such courses, covering the essential ideas of all key
algorithms and techniques for data mining, machine learning, and optimization.
Though there are over a dozen good books on such topics, most of them are either too specialized for a specific readership or too lengthy (often over 500 pages). This book fills the gap with a compact and concise approach, focusing on the key concepts, algorithms, and techniques at an introductory level. The main approach of this book is informal, theorem-free, and practical. By using an informal approach, all
fundamental topics required for data mining and machine learning are covered, and
the readers can gain basic knowledge of all the important algorithms, with a focus on their key ideas and without worrying about tedious, rigorous mathematical proofs.
In addition, the book takes a practical approach, providing about 30 worked examples so that the readers can see how each step of the algorithms and techniques works.
Thus, the readers can build their understanding and confidence gradually and in a
step-by-step manner. Furthermore, with the minimal requirements of basic high school
mathematics and some basic calculus, such an informal and practical style can also
enable the readers to learn the contents by self-study and at their own pace.
This book is suitable for undergraduates and graduates to rapidly develop all the
fundamental knowledge of data mining, machine learning, and optimization. It can
also be used by students and researchers as a reference to review and refresh their
knowledge in data mining, machine learning, optimization, computer science, and data
science.
Xin-She Yang
January 2019 in London
Acknowledgments
I would like to thank all my students and colleagues who have given valuable feedback
and comments on some of the contents and examples of this book. I also would like to
thank my editors, J. Scott Bentley and Michael Lutz, and the staff at Elsevier for their
professionalism. Last but not least, I thank my family for all the help and support.
Xin-She Yang
January 2019
Introduction to optimization
Contents
1.1 Algorithms 1
1.1.1 Essence of an algorithm 1
1.1.2 Issues with algorithms 3
1.1.3 Types of algorithms 3
1.2 Optimization 4
1.2.1 A simple example 4
1.2.2 General formulation of optimization 7
1.2.3 Feasible solution 9
1.2.4 Optimality criteria 10
1.3 Unconstrained optimization 10
1.3.1 Univariate functions 11
1.3.2 Multivariate functions 12
1.4 Nonlinear constrained optimization 14
1.4.1 Penalty method 15
1.4.2 Lagrange multipliers 16
1.4.3 Karush–Kuhn–Tucker conditions 17
1.5 Notes on software 18
This book introduces the fundamental concepts and algorithms related to optimization,
data mining, and machine learning. The main requirement is some understanding of
high-school mathematics and basic calculus; however, we will review and introduce
some of the mathematical foundations in the first two chapters.
1.1 Algorithms
An algorithm is an iterative, step-by-step procedure for computation. The detailed
procedure can be a simple description, an equation, or a series of descriptions in
combination with equations. Finding the roots of a polynomial, checking whether a natural number is prime, and generating random numbers can all be carried out by algorithms.
Example 1
Consider the iterative formula
    xk+1 = (1/2)(xk + a/xk).    (1.1)
As an example, if x0 = 1 and a = 4, then we have
    x1 = (1/2)(1 + 4/1) = 2.5.    (1.2)
Similarly, we have
    x2 = (1/2)(2.5 + 4/2.5) = 2.05,    x3 = (1/2)(2.05 + 4/2.05) ≈ 2.00061,    (1.3)
    x4 ≈ 2.00000009,    (1.4)
which is very close to the true value of √4 = 2. The accuracy of this iterative formula or algorithm
is high because it achieves an accuracy of five decimal places after four iterations.
The convergence is very quick if we start from different initial values such as
x0 = 10 and even x0 = 100. However, for an obvious reason, we cannot start with
x0 = 0 due to division by zero.
Finding the square root of a, that is, solving x = √a, is equivalent to solving the equation
    f(x) = x^2 − a = 0,    (1.5)
which is again equivalent to finding the roots of the polynomial f(x). We know that
Newton's root-finding algorithm can be written as
    xk+1 = xk − f(xk)/f′(xk),    (1.6)
where f′(x) is the first derivative or gradient of f(x). In this case, we have
f′(x) = 2x. Thus, Newton's formula becomes
    xk+1 = xk − (xk^2 − a)/(2xk),    (1.7)
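The iteration above is straightforward to try out in code. Below is a minimal Python sketch (the function name, default starting point, and stopping rule are my own choices, not from the book):

```python
def newton_sqrt(a, x0=1.0, tol=1e-10, max_iter=100):
    """Approximate sqrt(a) by Newton's method applied to f(x) = x^2 - a:
    x_{k+1} = x_k - (x_k^2 - a)/(2 x_k), which simplifies to
    x_{k+1} = (x_k + a/x_k) / 2."""
    x = x0
    for _ in range(max_iter):
        x_next = 0.5 * (x + a / x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(newton_sqrt(4.0))  # converges to 2 within a handful of iterations
```

Starting from x0 = 1 with a = 4 reproduces the iterates 2.5, 2.05, and so on; as in the text, starting from x0 = 0 fails immediately with a division by zero.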
1.2 Optimization
V = πr 2 h. (1.12)
There are only two design variables r and h and one objective function S to be min-
imized. Obviously, if there is no capacity constraint, then we can choose not to build
the container, and then the cost of materials is zero for r = 0 and h = 0. However,
the constraint requirement means that we have to build a container with fixed volume
V0 = πr^2 h = 10 m^3. Therefore, this optimization problem can be written as
    minimize S = 2πr^2 + 2πrh,    (1.13)
subject to
    πr^2 h = V0 = 10.    (1.14)
To solve this problem, we can first try to use the equality constraint to reduce the
number of design variables by solving for h. So we have
    h = V0/(πr^2).    (1.15)
Substituting it into (1.13), we get
    S = 2πr^2 + 2πrh = 2πr^2 + 2πr · V0/(πr^2) = 2πr^2 + 2V0/r.    (1.16)
This is a univariate function. From basic calculus we know that the minimum or maximum
can occur at a stationary point, where the first derivative is zero, that is,
    dS/dr = 4πr − 2V0/r^2 = 0,    (1.17)
which gives
    r^3 = V0/(2π),  or  r = (V0/(2π))^{1/3}.    (1.18)
Thus, the height is
    h/r = (V0/(πr^2))/r = V0/(πr^3) = 2.    (1.19)
This means that the height is twice the radius: h = 2r. Thus, the minimum surface area is
    S∗ = 2πr^2 + 2V0/r = (6π/∛(4π^2)) V0^{2/3} ≈ 5.5358 V0^{2/3}.    (1.20)
It is worth pointing out that this optimal solution is based on the assumption or re-
quirement to design a cylindrical container. If we decide to use a sphere with radius R,
we know that its volume and surface area are
    V0 = (4π/3) R^3,    S = 4πR^2.    (1.21)
We can solve for R directly:
    R^3 = 3V0/(4π),  or  R = (3V0/(4π))^{1/3},    (1.22)
which gives the surface area
    S = 4π (3V0/(4π))^{2/3} = (4π∛9/∛(16π^2)) V0^{2/3}.    (1.23)
Since 6π/∛(4π^2) ≈ 5.5358 and 4π∛9/∛(16π^2) ≈ 4.83598, we have S < S∗, that is, the
surface area of a sphere is smaller than the minimum surface area of a cylinder with
the same volume. In fact, for the same V0 = 10, we have
    S(sphere) = (4π∛9/∛(16π^2)) V0^{2/3} ≈ 22.45,    (1.24)
which is smaller than S∗ = 25.69 for a cylinder.
This highlights the importance of the choice of design type (here in terms of shape)
before we can do any truly useful optimization. Obviously, there are many other fac-
tors that can influence the choice of design, including the manufacturability of the
design, stability of the structure, ease of installation, space availability, and so on. For
a container, in most applications, a cylinder may be much easier to produce than a
sphere, and thus the overall cost may be lower in practice. Though there are so many
factors to be considered in engineering design, for the purpose of optimization, here
we will only focus on the improvement and optimization of a design with well-posed
mathematical formulations.
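The numbers in this example are easy to verify numerically. Below is a short Python check (the variable names are my own):

```python
import math

V0 = 10.0  # required volume in m^3, as in the example

# optimal cylinder: r = (V0/(2*pi))^(1/3), and then h follows from the volume
r = (V0 / (2.0 * math.pi)) ** (1.0 / 3.0)
h = V0 / (math.pi * r ** 2)
S_cyl = 2.0 * math.pi * r ** 2 + 2.0 * math.pi * r * h

# sphere with the same volume: R = (3*V0/(4*pi))^(1/3)
R = (3.0 * V0 / (4.0 * math.pi)) ** (1.0 / 3.0)
S_sph = 4.0 * math.pi * R ** 2

print(h / r, S_cyl, S_sph)  # h/r is 2, and the sphere has the smaller area
```

Running this confirms that h = 2r at the optimum and that the sphere's surface area is smaller than the cylinder's minimum for the same volume.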
In general, an optimization problem can be formulated as
    minimize f(x),
    subject to φj(x) = 0 (j = 1, 2, ..., M),  ψk(x) ≤ 0 (k = 1, 2, ..., N),    (1.25)
where f(x), φj(x), and ψk(x) are scalar functions of the design vector x. Here the
components xi of x = (x1, ..., xD)^T are called design or decision variables, and they
can be either continuous, discrete, or a mixture of these two. The vector x is often
called the decision vector, which varies in a D-dimensional space R^D.
It is worth pointing out that we use a column vector here for x (thus with trans-
pose T ). We can also use a row vector x = (x1 , . . . , xD ) and the results will be the
same. Different textbooks may use slightly different formulations. Once we are aware
of such minor variations, it should cause no difficulty or confusion.
In addition, the function f (x) is called the objective function or cost function,
φj (x) are constraints in terms of M equalities, and ψk (x) are constraints written as
N inequalities. So there are M + N constraints in total. The optimization problem
formulated here is a nonlinear constrained problem. Here the inequalities ψk (x) ≤ 0
are written as “less than”, and they can also be written as “greater than” via a simple
transformation by multiplying both sides by −1.
The space spanned by the decision variables is called the search space RD , whereas
the space formed by the values of the objective function is called the objective or
response space, and sometimes the landscape. The optimization problem essentially
maps the domain RD or the space of decision variables into the solution space R (or
the real axis in general).
The objective function f (x) can be either linear or nonlinear. If the constraints φj
and ψk are all linear, it becomes a linearly constrained problem. Furthermore, when
φj , ψk , and the objective function f (x) are all linear, then it becomes a linear pro-
gramming problem [35]. If the objective is at most quadratic with linear constraints,
then it is called a quadratic programming problem. If all the values of the decision
variables can be only integers, then this type of linear programming is called integer
programming or integer linear programming.
On the other hand, if no constraints are specified and thus xi can take any values
in the real axis (or any integers), then the optimization problem is referred to as an
unconstrained optimization problem.
As a very simple example of optimization problems without any constraints, we
discuss the search of the maxima or minima of a univariate function.
Figure 1.2 A simple multimodal function f(x) = x^2 e^{−x^2}.
Example 2
For example, to find the maximum of the univariate function
    f(x) = x^2 e^{−x^2},    −∞ < x < ∞,    (1.26)
is a simple unconstrained problem, whereas the following problem is a simple constrained minimization problem:
subject to
    x1 ≥ 1,    x2 − 2 = 0.    (1.28)
It is worth pointing out that the objectives are explicitly known in all the optimiza-
tion problems to be discussed in this book. However, in reality, it is often difficult to
quantify what we want to achieve, but we still try to optimize certain things such as the
degree of enjoyment or service quality on holiday. In other cases, it may be impossible
to write the objective function in any explicit form mathematically.
From basic calculus we know that, for a given curve described by f(x), its gradient
f′(x) describes the rate of change. When f′(x) = 0, the curve has a horizontal tangent
at that particular point. This means that it becomes a point of special interest. In fact,
the maximum or minimum of a curve occurs at
    f′(x∗) = 0,    (1.29)
Figure 1.3 (a) Feasible domain with nonlinear inequality constraints ψ1(x) and ψ2(x) (left) and linear
inequality constraint ψ3(x). (b) An example with an objective of f(x) = x^2 subject to x ≥ 2 (right).
Example 3
To find the minimum of f(x) = x^2 e^{−x^2} (see Fig. 1.2), we have the stationary condition
f′(x) = 0, that is,
    f′(x) = 2x(1 − x^2) e^{−x^2} = 0,
whose solutions are x∗ = 0 and x∗ = ±1. The second derivative is
    f″(x) = 2e^{−x^2}(1 − 5x^2 + 2x^4),
so the two maxima occur at x∗ = ±1 with fmax = e^{−1}. At x = 0, we have f″(0) = 2 > 0; thus
the minimum of f(x) occurs at x∗ = 0 with fmin(0) = 0.
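These stationary points can be checked numerically with a central finite difference for the derivative; a quick pure-Python sketch (the helper names are mine):

```python
import math

def f(x):
    # the multimodal function from Example 3
    return x * x * math.exp(-x * x)

def fprime(x, h=1e-6):
    # central finite-difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2.0 * h)

# the derivative vanishes at the stationary points x = 0 and x = +/-1
for x_star in (-1.0, 0.0, 1.0):
    assert abs(fprime(x_star)) < 1e-5

print(f(1.0), math.exp(-1))  # the two maxima attain f = 1/e
```

Evaluating f at the stationary points confirms the minimum value 0 at x = 0 and the maximum value e^{−1} at x = ±1.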
Whatever the objective is, we have to evaluate it many times. In most cases, the
evaluations of the objective functions consume a substantial amount of computational
power (which costs money) and design time. Any efficient algorithm that can reduce
the number of objective evaluations saves both time and money.
In mathematical programming, there are many important concepts, and we will
first introduce a few related concepts: feasible solutions, optimality criteria, the strong
local optimum, and weak local optimum.
f(x∗) = f(0) = 0.
In fact, f(x) = x^3 has a saddle point at x∗ = 0 because f′(0) = 0, but f″ changes sign
from f″(0+) > 0 to f″(0−) < 0 as x moves from positive to negative.
Example 4
For example, to find the maximum or minimum of a univariate function, we first have to
find its stationary points x∗, where the first derivative f′(x) is zero. These are
    x∗ = −1,    x∗ = 2,    x∗ = 0.
From basic calculus we know that a maximum requires f″(x∗) ≤ 0, whereas a minimum
requires f″(x∗) ≥ 0.
At x∗ = −1, we have
f (x∗ ) = −23.
The maximization of f(x) can be converted to the minimization of −f(x). For this reason,
optimization problems can be expressed as either minimization or maximization, depending
on the context and convenience of formulations.
In fact, in the optimization literature, some books formulate all the optimization
problems in terms of maximization, whereas others write these problems in terms of
minimization, though they are in essence dealing with the same problems.
    ∂f/∂x = 2x + 0 = 0,    ∂f/∂y = 0 + 2y = 0.    (1.32)
Since
    ∂^2 f/∂x∂y = ∂^2 f/∂y∂x,    (1.34)
we can conclude that the Hessian matrix is always symmetric. In the case of f(x, y) =
x^2 + y^2, it is easy to check that the Hessian matrix is
    H = [[2, 0], [0, 2]].    (1.35)
At the stationary point (x∗, y∗), if Δ > 0 and fxx > 0, then (x∗, y∗) is a local minimum.
If Δ > 0 but fxx < 0, then it is a local maximum. If Δ = 0, then the test is inconclusive,
and we have to use other information such as higher-order derivatives. However,
if Δ < 0, then it is a saddle point. Here Δ = fxx fyy − fxy^2 is the determinant of the
Hessian matrix. A saddle point is a special point where a local minimum occurs along one
direction, whereas a local maximum occurs along another (orthogonal) direction.
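This second-derivative classification can be captured in a small helper; a Python sketch (the function name is mine, and Δ = fxx·fyy − fxy² denotes the Hessian determinant):

```python
def classify_stationary_point(fxx, fyy, fxy):
    """Classify a stationary point of f(x, y) from its second partial
    derivatives, using delta = fxx*fyy - fxy**2 (the Hessian determinant)."""
    delta = fxx * fyy - fxy ** 2
    if delta < 0:
        return "saddle point"
    if delta == 0:
        return "inconclusive"  # need higher-order information
    return "local minimum" if fxx > 0 else "local maximum"

# f(x, y) = x^2 + y^2 at (0, 0): fxx = fyy = 2, fxy = 0
print(classify_stationary_point(2.0, 2.0, 0.0))  # prints "local minimum"
```

For f(x, y) = x^2 + y^2 at the origin, Δ = 4 > 0 and fxx = 2 > 0, so the helper reports a local minimum, in line with the rules above.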
Example 5
To minimize f(x, y) = (x − 1)^2 + x^2 y^2, we have
    ∂f/∂x = 2(x − 1) + 2xy^2 = 0,    ∂f/∂y = 0 + 2x^2 y = 0.    (1.37)
The second condition gives y = 0 or x = 0. Substituting y = 0 into the first condition, we have
x = 1. However, x = 0 does not satisfy the first condition. Therefore, we have a solution x∗ = 1
and y∗ = 0.
For our example with f = (x − 1)^2 + x^2 y^2, we have
    ∂^2 f/∂x^2 = 2y^2 + 2,    ∂^2 f/∂x∂y = ∂^2 f/∂y∂x = 4xy,    ∂^2 f/∂y^2 = 2x^2,    (1.38)
At the stationary point (x∗, y∗) = (1, 0), the Hessian matrix becomes
    H = [[2, 0], [0, 2]],
which is positive definite because its repeated eigenvalue 2 is positive. Alternatively, we have
Δ = 4 > 0 and fxx = 2 > 0. Therefore, (1, 0) is a local minimum.
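The Hessian entries of this example can also be checked by central finite differences; a rough Python sketch (the helper names are mine):

```python
def f(x, y):
    # the objective from Example 5
    return (x - 1.0) ** 2 + x ** 2 * y ** 2

def hessian(x, y, h=1e-4):
    # central finite-difference approximations of the second partials
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h ** 2)
    return fxx, fyy, fxy

fxx, fyy, fxy = hessian(1.0, 0.0)
delta = fxx * fyy - fxy ** 2
print(fxx, fyy, fxy, delta)  # approximately 2, 2, 0, and 4
```

At (1, 0) the numerical values agree with the analytic ones: fxx ≈ 2, fyy ≈ 2, fxy ≈ 0, so Δ ≈ 4 > 0 with fxx > 0, confirming a local minimum.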
    Π(x, μi, νj) = f(x) + Σ_{i=1}^{M} μi φi^2(x) + Σ_{j=1}^{N} νj max{0, ψj(x)}^2,    (1.43)
where μi ≫ 1 and νj ≥ 0.
For example, let us solve the following minimization problem:
    minimize f(x) = (x − 1)^2,  subject to x ≥ a,
where a is a given value. Obviously, without this constraint, the minimum value occurs
at x = 1 with fmin = 0. If a < 1, then the constraint will not affect the result. However,
if a > 1, then the minimum should occur at the boundary x = a (which can be obtained
by inspecting or visualizing the objective function and the constraint). Now we can
define a penalty function Π(x) using a penalty parameter μ ≫ 1. We have
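The penalty idea can be illustrated numerically for a problem of the form minimize (x − 1)² subject to x ≥ a, which is my reading of the example discussed here; the function names and the crude grid search below are my own:

```python
def penalty_objective(x, a, mu=1000.0):
    """Pi(x) = (x - 1)^2 + mu * max(0, a - x)^2: the constraint x >= a
    is violated exactly when a - x > 0, and the violation is squared."""
    violation = max(0.0, a - x)
    return (x - 1.0) ** 2 + mu * violation ** 2

def grid_minimize(a, mu=1000.0, lo=0.0, hi=3.0, n=60001):
    # brute-force search on a dense grid (for illustration only)
    step = (hi - lo) / (n - 1)
    points = (lo + i * step for i in range(n))
    return min(points, key=lambda x: penalty_objective(x, a, mu))

print(grid_minimize(a=2.0))  # close to the constraint boundary x = 2
print(grid_minimize(a=0.5))  # close to the unconstrained minimum x = 1
```

With a = 2 (so a > 1) the penalized minimum sits just inside the boundary x = a, approaching it as μ grows; with a = 0.5 the constraint is inactive and the minimizer stays at x = 1, matching the discussion above.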
XII
Whether, as some philosophers here assume, we possess only the
fragments of a great cycle of knowledge, in whose center stood the
primeval man in friendly relation with the powers of the universe, and
build our hovels out of the ruins of our ancestral palace; or whether,
according to the developing theory of others, we are rising gradually
and have come up from an atom instead of descending from an
Adam, so that the proudest pedigree might run up to a barnacle or a
zoöphyte at last, are questions which will keep for a good many
centuries yet. Confining myself to what little we can learn from
History, we find tribes rising slowly out of barbarism to a higher or
lower point of culture and civility, and everywhere the poet also is
found under one name or another, changing in certain outward
respects, but essentially the same.
But however far we go back, we shall find this also—that the poet
and the priest were united originally in the same person: which
means that the poet was he who was conscious of the world of spirit
as well as that of sense, and was the ambassador of the gods to
men. This was his highest function, and hence his name of seer.
I suppose the word epic originally meant nothing more than this, that
the poet was the person who was the greatest master of speech. His
were the ἔπεα πτερόεντα, the true winged words that could fly down
the unexplored future and carry thither the names of ancestral
heroes, of the brave, and wise, and good. It was thus that the poet
could reward virtue, and, by and by, as society grew more complex,
could burn in the brand of shame. This is Homer’s character of
Demodocus in the eighth book of the “Odyssey,”
When the Muse loved and gave the good and ill,
The first histories were in verse, and, sung as they were at the feasts
and gatherings of the people, they awoke in men the desire of fame,
which is the first promoter of courage and self-trust, because it
teaches men by degrees to appeal from the present to the future.
We may fancy what the influence of the early epics was when they
were recited to men who claimed the heroes celebrated in them for
their ancestors, by what Bouchardon, the sculptor, said only two
centuries ago: “When I read Homer I feel as if I were twenty feet
high.”
Nor have poets lost their power over the future in modern times.
Dante lifts up by the hair the face of some petty traitor, the Smith and
Brown of some provincial Italian town, lets the fire of his Inferno glare
upon it for a moment, and it is printed forever on the memory of
mankind. The historians may iron out the shoulders of Richard III. as
smooth as they can; they will never get over the wrench that
Shakspeare gave them.
The poet’s gift, then, is that of seer. He it is that discovers the truth
as it exists in types and images; that is the spiritual meaning, which
abides forever under the sensual. And his instinct is to express
himself also in types and images. But it was not only necessary that
he himself should be delighted with his vision, but that he should
interest his hearers with the faculty divine. Pure truth is not
acceptable to the mental palate. It must be diluted with character and
incident; it must be humanized in order to be attractive. If the bones
of a mastodon be exhumed, a crowd will gather out of curiosity; but
let the skeleton of a man be turned up, and what a difference in the
expression of the features! Every bystander then creates his little
drama, in which those whitened bones take flesh upon them and
stalk as chief actor.
The poet is he who can best see or best say what is ideal; what
belongs to the world of soul and of beauty. Whether he celebrates
the brave and good, or the gods, or the beautiful as it appears in
man or nature, something of a religious character still clings to him.
He may be unconscious of his mission; he may be false to it, but in
proportion as he is a great poet, he rises to the level of it more often.
He does not always directly rebuke what is bad or base, but
indirectly, by making us feel what delight there is in the good and fair.
If he besiege evil it is with such beautiful engines of war (as Plutarch
tells us of Demetrius) that the besieged themselves are charmed
with them. Whoever reads the great poets cannot but be made better
by it, for they always introduce him to a higher society, to a greater
style of manners and of thinking. Whoever learns to love what is
beautiful is made incapable of the mean and low and bad. It is
something to be thought of, that all the great poets have been good
men. He who translates the divine into the vulgar, the spiritual into
the sensual, is the reverse of a poet.
It seems to be thought that we have come upon the earth too late;
that there has been a feast of the imagination formerly, and all that is
left for us is to steal the scraps. We hear that there is no poetry in
railroads, steamboats, and telegraphs, and especially in Brother
Jonathan. If this be true, so much the worse for him. But, because he
is a materialist, shall there be no poets? When we have said that we
live in a materialistic age, we have said something which meant
more than we intended. If we say it in the way of blame, we have
said a foolish thing, for probably one age is as good as another; and,
at any rate, the worst is good enough company for us. The age of
Shakspeare seems richer than our own only because he was lucky
enough to have such a pair of eyes as his to see it and such a gift as
his to report it. Shakspeare did not sit down and cry for the water of
Helicon to turn the wheels of his little private mill there at the
Bankside. He appears to have gone more quietly about his business
than any playwright in London; to have drawn off what water-power
he wanted from the great prosy current of affairs that flows alike for
all, and in spite of all; to have ground for the public what grist they
want, coarse or fine; and it seems a mere piece of luck that the
smooth stream of his activity reflected with ravishing clearness every
changing mood of heaven and earth, every stick and stone, every
dog and clown and courtier that stood upon its brink. It is a curious
illustration of the friendly manner in which Shakspeare received
everything that came along, of what a present man he was, that in
the very same year that the mulberry tree was brought into England,
he got one and planted it in his garden at Stratford.
It is perfectly true that this is a materialistic age, and for this very
reason we want our poets all the more. We find that every
generation contrives to catch its singing larks without the sky’s
falling. When the poet comes he always turns out to be the man who
discovers that the passing moment is the inspired one, and that the
secret of poetry is not to have lived in Homer’s day or Dante’s, but to
be alive now. To be alive now, that is the great art and mystery. They
are dead men who live in the past, and men yet unborn who live in
the future. We are like Hans-in-Luck, forever exchanging the
burthensome good we have for something else, till at last we come
home empty-handed. The people who find their own age prosaic are
those who see only its costume. And this is what makes it prosaic:
that we have not faith enough in ourselves to think that our own
clothes are good enough to be presented to Posterity in. The artists
seem to think that the court dress of posterity is that of Vandyke’s
time or Cæsar’s. I have seen the model of a statue of Sir Robert
Peel—a statesman whose merit consisted in yielding gracefully to
the present—in which the sculptor had done his best to travesty the
real man into a make-believe Roman. At the period when England
produced its greatest poets, we find exactly the reverse of this, and
we are thankful to the man who made the monument of Lord Bacon
that he had genius enough to copy every button of his dress,
everything down to the rosettes on his shoes. These men had faith
even in their own shoe-strings. Till Dante’s time the Italian poets
thought no language good enough to put their nothings into but Latin
(and, indeed, a dead tongue was the best for dead thoughts), but
Dante found the common speech of Florence, in which men
bargained, and scolded, and made love, good enough for him, and
out of the world around him made such a poem as no Roman ever
sang.
We cannot get rid of our wonder, we who have brought down the
wild lightning from writing fiery doom upon the walls of heaven to be
our errand-boy and penny postman. In this day of newspapers and
electric telegraphs, in which common-sense and ridicule can
magnetise a whole continent between dinner and tea, we may say
that such a phenomenon as Mahomet were impossible; and behold
Joe Smith and the State of Deseret! Turning over the yellow leaves
of the same copy of Webster on “Witchcraft” which Cotton Mather
studied, I thought, Well, that goblin is laid at last! And while I mused,
the tables were dancing and the chairs beating the devil’s tattoo all
over Christendom. I have a neighbor who dug down through tough
strata of clay-slate to a spring pointed out by a witch-hazel rod in the
hands of a seventh son’s seventh son, and the water is sweeter to
him for the wonder that is mixed with it. After all, it seems that our
scientific gas, be it never so brilliant, is not equal to the dingy old
Aladdin’s lamp.
It is impossible for men to live in the world without poetry of some
sort or another. If they cannot get the best, they will get at some
substitute for it. But there is as much poetry as ever in the world if we
can ever know how to find it out; and as much imagination, perhaps,
only that it takes a more prosaic direction. Every man who meets
with misfortune, who is stripped of his material prosperity, finds that
he has a little outlying mountain-farm of imagination, which does not
appear in the schedule of his effects, on which his spirit is able to
keep alive, though he never thought of it while he was fortunate. Job
turns out to be a great poet as soon as his flocks and herds are
taken away from him.
Perhaps our continent will begin to sing by and by, as the others
have done. We have had the Practical forced upon us by our
condition. We have had a whole hemisphere to clear up and put to
rights. And we are descended from men who were hardened and
stiffened by a downright wrestle with Necessity. There was no
chance for poetry among the Puritans. And yet if any people have a
right to imagination, it should be the descendants of those very
Puritans. They had enough of it, or they could not have conceived
the great epic they did, whose books are States, and which is written
on this continent from Maine to California.
This lesson I learn from the past: that grace and goodness, the fair,
the noble, and the true will never cease out of the world till the God
from whom they emanate ceases out of it; that the sacred duty and
noble office of the poet is to reveal and justify them to man; that as
long as the soul endures, endures also the theme of new and
unexampled song; that while there is grace in grace, love in love,
and beauty in beauty, God will still send poets to find them, and bear
witness of them, and to hang their ideal portraitures in the gallery of
memory. God with us is forever the mystical theme of the hour that is
passing. The lives of the great poets teach us that they were men of
their generation who felt most deeply the meaning of the Present.
I have been more painfully conscious than any one else could be of
the inadequacy of what I have been able to say, when compared to
the richness and variety of my theme. I shall endeavor to make my
apology in verse, and will bid you farewell in a little poem in which I
have endeavored to express the futility of all effort to speak the
loveliness of things, and also my theory of where the Muse is to be
found, if ever. It is to her that I sing my hymn.
Mr. Lowell here read an original poem of considerable length, which
concluded the lecture, and was received with bursts of applause.
*** END OF THE PROJECT GUTENBERG EBOOK LECTURES ON
ENGLISH POETS ***