
Fundamentals of Pure Mathematics

Analysis Notes
Linhan Li and David Quinn
January 9, 2024

Contents
1 The Real Numbers
1.1 Algebraic structure of real numbers
1.2 The absolute value
1.3 The completeness axiom
1.4 Infinity
1.5 Countability
1.6 Mathematical proof

2 Real sequences
2.1 Introduction
2.2 Limit theorems
2.3 Monotone sequences
2.4 Subsequences

3 Infinite Series
3.1 Introduction
3.2 Series with nonnegative terms
3.3 Cauchy Condensation
3.4 Absolute convergence
3.5 Series with alternating signs

4 Continuity
4.1 Introduction
4.2 The Extreme and Intermediate Value Theorems
4.3 Limits of functions

5 Differentiability on R
5.1 Introduction
5.2 Differentiability Theorems
5.3 Mean Value Theorem
5.4 Monotone functions and the Inverse Function Theorem
5.5 Taylor’s theorem

The goal of this course is to present Analysis with proofs. This is not yet another Calculus course;
our focus will be less on computational aspects and more on understanding of the theory, including
many subtler points such as completeness of the real numbers, notions of continuity and convergence
and uniform convergence. We will develop the theory from basic principles and will teach you how to
develop your own proofs.
Scattered throughout the notes you will find some exercises but also some “Additional exercises”.
The references given in the additional exercises will always be to Ross [2], even if not explicitly stated.


1 The Real Numbers


Every rigorous study of mathematics begins with certain undefined concepts: primitive notions on which
the theory is based, and certain postulates that are assumed to be true and do not need proof. In our case,
these primitive notions will be notions of sets and real numbers with certain postulated properties.
As a side note, we mention that the sets of numbers (integers, rational and real numbers) are constructed
in another theory, Set Theory. Hence what we take here as a primitive notion, postulating
its properties without proof, is in another part of mathematics proven from yet simpler notions and
concepts.

1.1 Algebraic structure of real numbers


You will have encountered the real numbers already; both in (accelerated) Proofs and Problem Solving
and in any first calculus course. In either case, however, they will lack a rigorous definition. Liebeck
[1] for instance builds on an intuitive concept of “infinite straight line”. Following this, Liebeck quickly
“proves” that there exists a real solution to x^2 = 2 by constructing a square of side length 1, and using
Pythagoras to see that the diagonal has length √(1^2 + 1^2). This infinite line is a poor basis for rigorous
arguments as it relies too heavily on intuition, yet this intuition motivates how we establish our formal
definition. We want there to be a square of side length 1, and consequently, we want its diagonal to have
a well-defined length.
We will build our definition of the real numbers from three pieces, each defined by axioms; rules we
take as a given. First we will have an algebraic object called a field, next we require an ordering which is
compatible with the algebraic structure and finally we require completeness, an axiom requiring certain
sets to have a unique least upper bound (for instance the set {x | x^2 < 2, x ∈ Q} has a least upper bound
which is a real number but does not have a rational least upper bound). We will cover this in detail in
this chapter.
We shall denote the set of real numbers by R. The set R is equipped with two algebraic operations +
and · (these are functions with domain R × R and range in R). We shall assume that the triple (R, +, ·)
is a field; that is it satisfies the following properties for every a, b, c ∈ R:
F1 Closure properties: a + b and a · b belong to R.

F2 Associative properties: (a + b) + c = a + (b + c) and (a · b) · c = a · (b · c).

F3 Commutative properties: a + b = b + a and a · b = b · a.

F4 Existence of the Additive Identity: There is a unique element 0 ∈ R such that 0 + a = a for all
a ∈ R.


F5 Existence of the Multiplicative Identity: There is a unique element 1 ∈ R such that 1 ≠ 0 and
1 · a = a for all a ∈ R (one consequence of this is that fields must have at least two elements¹).

F6 Existence of Additive Inverses: For every a ∈ R there is a unique element −a ∈ R such that
a + (−a) = 0.

F7 Existence of Multiplicative Inverses: For every a ∈ R with a ≠ 0 there is a unique element a^{-1} ∈ R
such that a · a^{-1} = 1.

F8 Distributive Law: (This is the only law connecting addition with multiplication) a · (b + c) =
a · b + a · c.

Remark 1.1.1. This is a long list; sets which satisfy only some of these properties are important in other contexts. For
instance, properties F1-F4 and F6 together mean that (R, +) is a commutative (abelian) group. Properties
F1-F3, F5 and F7 mean that (R \ {0}, ·) is a commutative (abelian) group. These are both examples of
infinite groups. The last property, F8, ties the two algebraic operations + and · together. There are many
examples of fields. You will already have seen the rational numbers Q and the complex numbers C from
[1]. You have also used the integers modulo a prime p, denoted Z_p, although you might not have realised
that it is a field (it is essential that p is prime).
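The role of primality can be explored by brute force. The following is a minimal sketch, not part of the course material: the function name `is_field_mod` is hypothetical, and it relies on the fact that addition and multiplication modulo n always satisfy F1-F4, F6 and F8, so only F5 and F7 need checking.

```python
def is_field_mod(n: int) -> bool:
    """Check by brute force whether the integers modulo n satisfy F5 and F7.

    Arithmetic modulo n always satisfies F1-F4, F6 and F8, so only
    1 != 0 and the existence of multiplicative inverses can fail.
    """
    if n < 2:
        return False  # F5 requires 1 != 0, so at least two elements
    for a in range(1, n):
        # F7: look for some b with a * b = 1 (mod n)
        if not any((a * b) % n == 1 for b in range(1, n)):
            return False
    return True


# The moduli below 20 for which the integers modulo n form a field.
print([n for n in range(2, 20) if is_field_mod(n)])
# prints [2, 3, 5, 7, 11, 13, 17, 19]
```

For a composite modulus such as n = 6 the element 2 has no multiplicative inverse, which is why the check fails there.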

To simplify our notation in what follows we shall usually write

• a − b instead of a + (−b)

• ab instead of a · b

• 1/a or \frac{1}{a} instead of a^{-1}

• a/b or \frac{a}{b} instead of a · b^{-1}

From the postulated properties one might derive all algebraic laws of R.

Exercise 1.1.2. Use only the axioms F1-F8 to show that

• −a = (−1) · a and −(−a) = a

• −(a − b) = b − a
• If a, b ∈ R and ab = 0 then either a = 0 or b = 0.

¹ A set with one element can be given an algebraic structure: let R = {e}, where e is for element; then ({e}, +, ·) is a ring
if we set e + e = e · e = e. Indeed ({e}, +) is a group. Note that ({e}, +, ·) fails only axiom F5 because the multiplicative and
additive identities must be distinct.

Note that these are all true in any field as you need only the field axioms to derive them. Indeed the
fact that R is a field does not completely describe the real number system and we can find other fields
distinct from R. The set of real numbers is also ordered, i.e. it has a concept of “less than”.
Order Axioms: There is a relation² < on R which has the following properties:

O1 Trichotomy Property: Given a, b ∈ R one and only one of the following statements holds:

a < b, b < a, a = b.

O2 Transitive Property: a < b and b < c imply a < c.

O3 Additive Property: a < b and c ∈ R imply a + c < b + c.

O4 Multiplicative Properties

a < b and c > 0 imply ac < bc

and

a < b and c < 0 imply bc < ac.

By b > a we shall mean a < b. By a ≤ b and b ≥ a we shall mean a < b or a = b. If a < b and b < c we
shall simply write a < b < c.

Exercise 1.1.3. Let 2 := 1 + 1 where 1 is the multiplicative identity. Using only the axioms above prove
that in any ordered field we have 2 > 1.

Definition 1.1.4. We shall call a real number a ∈ R positive if a > 0. We shall call a ∈ R nonnegative if
a ≥ 0.

The set R contains certain special subsets: the set of natural numbers N

N = {1, 2, 3, . . . },

the natural numbers with zero


N0 = {0, 1, 2, 3, . . . },
the set of integers Z

Z = {. . . , −2, −1, 0, 1, 2, 3, . . . },

and the set of rationals

Q = { m/n : m, n ∈ Z and n ≠ 0 }.

We also call the numbers belonging to the set Q^c = R \ Q irrational numbers.

² Recall from [1] that a relation on R can be seen as a subset of R × R. The conditions which it satisfies may make it a
special type of relation, such as an equivalence relation or an order relation.

Note that we have not really defined the sets N and Z since it is not clear what the numbers 2 or 3 really
mean without a proper definition. We define 2 as 1 + 1 and 3 as 2 + 1 or 1 + 1 + 1. The sets N0 and Z
are characterized by the following properties. Note that informally we can think of s(n) as the successor
of n, or s(n) = n + 1, and similarly p(n) = n − 1. There are many ways to construct a list of equivalent
axioms; these are based on the Peano axioms. We begin with the set N0 , the natural numbers with 0.

N1 0 ∈ N0 .

N2 For any n ∈ N0 there exists s(n) ∈ N0 .

N3 If s(n) = s(m) then n = m.

N4 There does not exist n ∈ N0 with s(n) = 0.

N5 If 0 ∈ A ⊂ N0 and if n ∈ A implies s(n) ∈ A then A = N0 .

We then define the integers Z as a set which contains N0 and satisfies the following axioms.

Z1 Given n ∈ Z then there exists p(n) ∈ Z such that s(p(n)) = n.

Z2 If A ⊂ Z is nonempty and s(n) ∈ A if and only if n ∈ A then A = Z.

Exercise 1.1.5. Remove any of the axioms from the definition of the integers (or the natural numbers)
and give a set which satisfies these fewer axioms but is not the integers (or the natural numbers).

Theorem 1.1.6. (On mathematical Induction). Suppose for each n ∈ N0 that P(n) is a proposition (a
verbal statement or a formula) that satisfies the following two properties:

(i) P(0) is true

(ii) For every k ∈ N0 for which P(k) is true, P(k + 1) is also true.

Then for every integer n ∈ N0 the proposition P(n) is true.


Proof. We consider the subset A ⊂ N0 given by A = {n | P(n) is true}. The hypotheses state that 0 ∈ A
and that whenever k ∈ A then k + 1 ∈ A. Thus by the fifth axiom for N0 (N5) we have A = N0 , i.e., P(n) is true for
all n ∈ N0 .

Observe that N and Z are not fields. The set N fails to contain an additive identity 0 and also has no
additive inverses. Also (Z, +) is a group but Z fails to contain multiplicative inverses (for example 1/2,
which is the multiplicative inverse of 2 ∈ Z, fails to be in Z). Finally, recall that Q is a field. The same
ordering we have on R also works for Q, which means that all axioms we have stated thus far are also
satisfied by the set of rationals. We shall introduce two further sets of axioms that will distinguish
between the reals and the rationals.
We can also observe at this point that the fields C and Z_p cannot be ordered. Verification is left as an
exercise.

1.2 The absolute value


Definition 1.2.1. Let a ∈ R. The absolute value of a is the number |a| defined by

|a| = \begin{cases} a, & \text{if } a \ge 0, \\ -a, & \text{if } a < 0. \end{cases}

Theorem 1.2.2. The following statements are true:

• The absolute value is multiplicative, i.e., |ab| = |a||b| for all a, b ∈ R.

• Let a ∈ R and M ≥ 0. Then |a| ≤ M if and only if −M ≤ a ≤ M.

Proof. We prove the first statement as an illustration; the second statement is left as an exercise.
We consider this case by case. First suppose a, b ≥ 0 so that |a| = a and |b| = b. Then we know from
order axiom O4 that ab ≥ 0 so that |ab| = ab and the result follows: |ab| = ab = |a||b|.
Next let a ≥ 0 and b < 0. We have |a| = a, |b| = −b and by axiom O4, we have ab ≤ 0 so that
|ab| = −ab. Together we get |ab| = −ab = a(−b) = |a||b|. The case b ≥ 0 and a < 0 is similar.
Finally if a, b < 0 we have |a| = −a, |b| = −b and by axiom O4 ab > 0, so that |ab| = ab. We use the first part of
Exercise 1.1.2, namely −(−c) = c for all c ∈ R, and axioms F2 and F3 to show:

|ab| = ab = −(−ab) = (−a)(−b) = |a||b|.

Theorem 1.2.3. The absolute value satisfies the following three properties:


(i) Positive definiteness: |a| ≥ 0, and |a| = 0 if and only if a = 0.

(ii) Symmetry: |a − b| = |b − a|.

(iii) Triangle inequality: |a + b| ≤ |a| + |b| and ||a| − |b|| ≤ |a − b|.

Proof. We prove only the triangle inequality. Using that −|a| ≤ a ≤ |a| and −|b| ≤ b ≤ |b| we have

−(|a| + |b|) ≤ a + b ≤ |a| + |b|

so that |a + b| ≤ |a| + |b|.


For the second inequality we use the replacement a = (a − b) + b:

|a| − |b| = |a − b + b| − |b| ≤ |a − b| + |b| − |b| = |a − b| .

Of course here a and b are arbitrary so that changing roles we also have |b| − |a| ≤ |b − a| = |a − b| which
can be written as −|a − b| ≤ −(|b| − |a|) = |a| − |b| so that altogether

−|a − b| ≤ |a| − |b| ≤ |a − b|

from which we conclude ||a| − |b|| ≤ |a − b|.
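A quick numerical sanity check of both inequalities can be reassuring, although it proves nothing. The sketch below is illustrative only: the helper name `check_triangle` is hypothetical, and exact rationals (`fractions.Fraction`) stand in for real numbers so that no rounding error enters.

```python
import random
from fractions import Fraction


def check_triangle(a, b):
    """Return True if both forms of the triangle inequality hold for a, b."""
    forward = abs(a + b) <= abs(a) + abs(b)
    reverse = abs(abs(a) - abs(b)) <= abs(a - b)
    return forward and reverse


random.seed(0)  # fixed seed so the sample is reproducible
samples = [
    (Fraction(random.randint(-50, 50), random.randint(1, 9)),
     Fraction(random.randint(-50, 50), random.randint(1, 9)))
    for _ in range(1000)
]
all_ok = all(check_triangle(a, b) for a, b in samples)
print(all_ok)  # prints True
```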

Additional exercises 1. Exercise 1.3.6 from [2] gives a simple extension of the triangle inequality.

Theorem 1.2.4. Let x, y ∈ R.

(i) x < y + ε for all ε > 0 if and only if x ≤ y.

(ii) x > y − ε for all ε > 0 if and only if x ≥ y.

(iii) |x| < ε for all ε > 0 if and only if x = 0.

Proof. (i) If x ≤ y then x ≤ y < y + ε for every ε > 0. We prove the converse by contradiction. Assume that x < y + ε for all ε > 0 but x > y. Set
ε0 = x − y > 0. Then x = y + ε0 , hence the hypothesis x < y + ε is not satisfied for ε = ε0 . This is a contradiction.
(ii) follows from (i) (by multiplication of both sides by −1).
(iii) If x = 0 then |x| = 0 < ε for all ε > 0. Conversely, if |x| < ε for all ε > 0 it follows that −ε < x < ε for all ε > 0. Using parts (i) and (ii) (for y = 0) we then
conclude that 0 ≤ x ≤ 0. Then by the Trichotomy property x = 0.

The absolute value is closely related to the notion of distance. If a, b are two real numbers then we define
the distance between these two points to be |b − a|. We might also denote the distance between a and b by
the symbol d(a, b). Hence d(a, b) = |b − a|.


Notice that property (i) of Theorem 1.2.3 assures that the distance between two points is
never negative. Property (ii) says that the distance between a and b is the same as the distance
between b and a. Finally, property (iii) can be interpreted as follows:

d(a, b) ≤ d(a, c) + d(b, c).

Hint: Use (iii) to prove |b − a| = |(b − c) + (c − a)| ≤ |b − c| + |c − a|.

1.3 The completeness axiom


We begin this section by recalling the definition of maximal and minimal elements of a set. One crucial
feature of these elements is that they are elements of the set. Note that they do not need to exist for an
arbitrary set.

Definition 1.3.1. Let S ⊂ R.

(i) If there exists s′ ∈ S such that s ≤ s′ for all s ∈ S then we say S has a maximum or maximal element
which is given by s′ .

(ii) If there exists s0 ∈ S such that s0 ≤ s for all s ∈ S then we say S has a minimum or minimal element
which is given by s0 .

Let a, b be real numbers. A closed interval is a set of the form:

[a, b] = {x ∈ R : a ≤ x ≤ b}, [a, ∞) = {x ∈ R : a ≤ x},

(−∞, b] = {x ∈ R : x ≤ b}, or (−∞, ∞) = R.


The set [a, b], for a ≤ b has minimum a and maximum b. Note that if a > b then [a, b] is the empty set.
An open interval is a set of the form

(a, b) = {x ∈ R : a < x < b}, (a, ∞) = {x ∈ R : a < x},

(−∞, b) = {x ∈ R : x < b}, or (−∞, ∞) = R.


Finally a half-open interval is a set of the form

(a, b] = {x ∈ R : a < x ≤ b}, or [a, b) = {x ∈ R : a ≤ x < b}.


The set (a, b] with a < b has no minimum, but has maximum b. By saying interval we mean either a
closed interval, an open interval or half-open interval as defined above.
If a < b the intervals (a, b), [a, b], (a, b] and [a, b) correspond to line segments on a real line (with
the end-points a and b either belonging or not belonging to the set). We refer to such an interval as
nondegenerate.
As we already observed, if b < a these “intervals” are all the empty set. If a = b only [a, b] is nonempty,
and in this case [a, a] = {a} is a singleton set and can be referred to as a degenerate interval.
Note that sometimes the empty set is also called a degenerate interval but the usage of the term here
varies.

Definition 1.3.2. Let E ⊂ R be nonempty.

• The set E is said to be bounded above if there is M ∈ R such that a ≤ M for all a ∈ E.

• A real number M is called an upper bound of the set E if a ≤ M for all a ∈ E.

• A real number s is called the supremum of the set E if

– s is an upper bound of E
– s ≤ M for all upper bounds M of the set E.

If a number s exists, we shall say that E has a supremum and write s = sup E.

We observe that if the supremum s exists then s is the least upper bound of the set E.

Example 1.3.3. If E = [0, 1] show that sup E = 1. Clearly 1 is an upper bound of E, as by the definition
of the interval x ≤ 1 for all x ∈ E. Let M be any other upper bound of E. Then x ≤ M for all x ∈ E. Since
1 ∈ E it follows that 1 ≤ M. Thus 1 is the smallest upper bound of the set E.

Remark 1.3.4. If a set has a supremum, it only has one supremum, i.e., the supremum is unique whenever
it exists.

Theorem 1.3.5. (Approximation Property for Suprema). If the set E ⊂ R has a supremum then for any
positive number ε > 0 there exists a ∈ E such that

sup E − ε < a ≤ sup E.


Proof. Suppose this theorem is false for some ε > 0. It follows that no element of E lies between
sup E − ε and sup E. But then a ≤ sup E − ε for all a ∈ E and hence sup E − ε must be an upper bound
of E. Since sup E is the smallest upper bound it must be true that

sup E ≤ sup E − ε, or 0 ≤ −ε.

This is a contradiction with the fact that ε > 0.

Note that we will typically use this in the following form. For any m ∈ R such that m < sup E there
exists a ∈ E such that m < a ≤ sup E. Note that a might be equal to the supremum, but it does not need
to be.
The approximation property is a formalisation of the fact that the supremum is the least upper bound:
if we take any number less than the supremum then it is not an upper bound. We can define the supremum
of a set as an upper bound which satisfies the approximation property.
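To see the approximation property in action, consider the set E = {1 − 1/n : n ∈ N}, which has sup E = 1 although 1 ∉ E. This example set and the function name `witness` are assumptions made for illustration; the sketch simply searches for an element of E lying in (sup E − ε, sup E], using exact rationals in place of reals.

```python
from fractions import Fraction


def witness(eps: Fraction) -> Fraction:
    """Return an element a of E = {1 - 1/n : n in N} with 1 - eps < a <= 1."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    n = 1
    while Fraction(1, n) >= eps:  # search for the first n with 1/n < eps
        n += 1
    return 1 - Fraction(1, n)


print(witness(Fraction(1, 10)))  # prints 10/11
```

However small ε is chosen, the search stops, yet no element of E ever equals the supremum itself.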

Remark 1.3.6. If E ⊂ N has a supremum then sup E ∈ E.

Proof. Let s = sup E. We apply the Approximation Property choosing ε = 1. It follows that there is
some x0 ∈ E such that

s − 1 < x0 ≤ s = sup E.

If x0 = s then s ∈ E and we are done.
Otherwise s − 1 < x0 < s and we can apply the Approximation Property again (with m = x0 < s) to choose x1 ∈ E such
that x0 < x1 ≤ s. By subtracting x0 we obtain 0 < x1 − x0 ≤ s − x0 . Since x1 − x0 is a positive integer we
have that x1 − x0 ≥ 1. But since s − 1 < x0 we have

x1 − x0 ≤ s − x0 < s − (s − 1) = 1

which is a contradiction.

Completeness Axiom. If E ⊂ R is nonempty and bounded above, then E has a supremum.

Theorem 1.3.7. (Archimedean Principle). Given positive real numbers a, b ∈ R there is an integer n ∈ N
such that b < na.

Proof. If b < a we set n = 1 and we are done.


Otherwise a ≤ b, and we denote by E the set {k ∈ N : ka ≤ b}. Clearly, E is nonempty as 1 ∈ E. The set E is also
bounded above, as ka ≤ b is equivalent to k ≤ b/a, so b/a is an upper bound of E. Then by the completeness
axiom s = sup E exists. Also by Remark 1.3.6, since E ⊂ N we have that s ∈ E. We set n = s + 1. Then
n ∈ N and n > s = sup E, so n ∉ E, that is, na > b. Hence the claim holds.
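The proof is constructive enough to compute with. The following sketch mirrors it for rational inputs; the name `archimedean_n` is hypothetical, and `Fraction` is used as an exact stand-in for real numbers.

```python
import math
from fractions import Fraction


def archimedean_n(a: Fraction, b: Fraction) -> int:
    """Return n in N with b < n*a, following the proof of Theorem 1.3.7."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    if b < a:
        return 1
    # Largest k in N with k*a <= b, i.e. the supremum s of the set E.
    s = math.floor(b / a)
    return s + 1


a, b = Fraction(1, 3), Fraction(5)
n = archimedean_n(a, b)
print(n, b < n * a)  # prints 16 True
```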


Example 1.3.8. Show that for any real number r > 0 there is an integer n ∈ N such that

0 < 1/n < r.

Proof. Set b = 1/r and a = 1 in the Archimedean Principle. It follows that there is n ∈ N such that
1/r < na = n. But 1/r < n is equivalent to 1/n < r.

Theorem 1.3.9 (Density of Rational numbers). Let a < b be real numbers. Then there is q ∈ Q such that
q ∈ (a, b).

Proof. Since b − a > 0, by Example 1.3.8 one can find an integer n ∈ N such that 1/n < b − a. Consider two
cases.
The first case is b > 0. If a < 0 then 0 ∈ (a, b) is rational. Otherwise a ≥ 0 and we consider the set E = {k ∈ N0 :
k/n < b}. The set E is nonempty as 0 ∈ E, and bounded above since k/n < b is equivalent to k < nb (both n
and b are fixed). Thus E has a maximal element, say s. Note that we have

s/n < b ≤ (s + 1)/n.

Subtracting s/n, and using the choice of n, gives 0 < b − s/n ≤ 1/n < b − a, and in particular

b − s/n < b − a.

Rearranging this (subtract b and multiply by −1) we get a < s/n, so that s/n ∈ (a, b).
The second case is b ≤ 0. By the Archimedean principle one can find an integer k such that k + b > 0.
Then by the first case there is q ∈ Q such that a + k < q < b + k. Therefore a < q − k < b, and thus
q − k is a rational number between a and b.
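The construction in this proof can be turned into a small procedure. The sketch below is a condensed variant rather than a literal transcription: it lets s range over all integers, which removes the need for the case split, and it takes rational inputs (`Fraction`) as an exact stand-in for real a and b. The name `rational_between` is hypothetical.

```python
import math
from fractions import Fraction


def rational_between(a: Fraction, b: Fraction) -> Fraction:
    """Return a rational s/n with a < s/n < b, where 1/n < b - a."""
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1  # guarantees n > 1/(b - a), so 1/n < b - a
    s = math.ceil(n * b) - 1         # largest integer s with s/n < b
    return Fraction(s, n)


print(rational_between(Fraction(1, 3), Fraction(1, 2)))  # prints 3/7
```

For rational inputs the midpoint (a + b)/2 would of course do; the point of the sketch is only to follow the steps s/n < b ≤ (s + 1)/n and 1/n < b − a from the proof.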

Definition 1.3.10. Let E ⊂ R be nonempty.

• The set E is said to be bounded below if there is m ∈ R such that m ≤ a for all a ∈ E.

• A number m is called a lower bound of the set E if m ≤ a for all a ∈ E.

• A number t is called the infimum of the set E if

– t is a lower bound of E
– m ≤ t for all lower bounds m of the set E.


If such a number t exists, we shall say that E has an infimum and write t = inf E.

We observe that supremum and infimum are related via the following (reflection) principle. Here the
set −E is defined as
−E = {x ∈ R : x = −e for some e ∈ E}.

Theorem 1.3.11. Let E ⊂ R be nonempty.

• Set E has a supremum if and only if the set −E has an infimum. Also

inf(−E) = − sup E.

• Set E has an infimum if and only if the set −E has a supremum. Also

sup(−E) = − inf E.

Exercise 1.3.12. (Monotone Property). Let A ⊂ B be two nonempty subsets of R. Show that if B is
bounded above then sup A ≤ sup B. Show that if B is bounded below then inf A ≥ inf B.

Additional exercises 2. Exercise 4.7 is similar to the above but goes a little further. Exercises 4.14 and
4.16 ask you to prove some statements that feel obvious but be careful to prove these rigorously.

1.4 Infinity
We have already encountered the symbols ∞ and −∞ when we introduced intervals. There they were
convenient notation to express when an ‘interval’ is unbounded. In the context of intervals, they provide
a useful piece of notation for sets that will occur frequently but these symbols should not be mistaken
for real numbers.
We introduce the set of so-called extended real numbers R∗ = R ∪ {−∞, ∞} and an ordering compatible
with the ordering on R. For every a ∈ R we let

−∞ < a, a < ∞, and −∞ < ∞.

The symbols ±∞ are not real numbers and should never be used in an algebraic expression as if they
were. The following are especially egregious

∞ − ∞, 0 · ∞, ∞/∞, a/0.

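We can easily invent cases such as lim_{x→∞} (x − (x − 1)), which equals 1 since x − (x − 1) = 1 for every x, and then erroneously claim this limit is something
like lim_{x→∞} x − lim_{x→∞} (x − 1), which is the meaningless expression ∞ − ∞.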


Our definition of the supremum of a non-empty set E requires sup E ∈ R, however, an alternative
definition allows the supremum to take values in the extended real numbers. In that case sup E ∈ R ∪
{−∞, ∞}, and in particular sup E = ∞ if and only if E is not bounded above. In a similar manner,
inf E = −∞ if and only if E is not bounded below. In the setting of the extended real numbers, every
non-empty set of real numbers has a supremum and we use the term finite supremum to distinguish when
sup E ∈ R (i.e., when E is bounded above).
We can even define the supremum of the empty set as sup ∅ = −∞. Do you think this makes sense?

1.5 Countability
Let us first recall what a function f : X → Y between two sets X and Y is. Each element x ∈ X is assigned
a unique (meaning one and only one) y = f (x) ∈ Y . Notice that the same y can be assigned to two (or more)
different x’s (an example is the function f (x) = x^2, where both x = −1 and x = 1 are assigned the same value 1).
Also not every y ∈ Y must be assigned (again f : R → R given by f (x) = x^2 does not assign values y < 0
to any x).
For this reason, we introduce the following terminology.

Definition 1.5.1. Let f be a function from a set X into a set Y .

(i) f is said to be one-to-one (1-1) on X if and only if each element y ∈ Y is assigned to at most one
x ∈ X. That is
If x1 , x2 ∈ X and f (x1 ) = f (x2 ) then x1 = x2 .

(ii) f is said to take X onto Y if for each y ∈ Y there is an x ∈ X such that y = f (x).

Functions that are 1-1 are also called injective, functions that are onto are also called surjective.
Functions that are both 1-1 and onto are called bijective.
The sets X, Y play a vital role. For example f (x) = x^2 is bijective as a function f : [0, ∞) → [0, ∞)
but it is neither 1-1 nor onto as a function f : R → R.
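The dependence on the choice of X and Y can be illustrated on finite sets, where both properties can be checked exhaustively. This is only a sketch under that assumption: the helper names and the small integer ranges standing in for R and [0, ∞) are hypothetical choices made for illustration.

```python
def is_injective(f, X):
    """True if f takes distinct elements of the finite set X to distinct values."""
    images = [f(x) for x in X]
    return len(set(images)) == len(images)


def is_surjective(f, X, Y):
    """True if every element of the finite set Y equals f(x) for some x in X."""
    return {f(x) for x in X} >= set(Y)


def square(x):
    return x * x


X_all = range(-3, 4)      # finite stand-in for R
X_nonneg = range(0, 4)    # finite stand-in for [0, ∞)
Y_squares = [0, 1, 4, 9]  # the squares of 0, 1, 2, 3
Y_all = range(-9, 10)     # a codomain that also contains negative values

print(is_injective(square, X_all))                 # False: square(-1) == square(1)
print(is_injective(square, X_nonneg))              # True
print(is_surjective(square, X_nonneg, Y_squares))  # True
print(is_surjective(square, X_all, Y_all))         # False: no x has square(x) == -1
```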
A function f : X → Y which is 1-1 and onto (i.e., a bijection) has an inverse function. The inverse
function to f is the unique function g : Y → X such that

g( f (x)) = x for all x ∈ X and f (g(y)) = y for all y ∈ Y.

The function g above is usually denoted as f^{-1}. This is a somewhat unfortunate notation since f^{-1} can also
denote the function 1/ f . Since we are using the same notation for two very different things you must
deduce from the context whether we mean an inverse function or a reciprocal 1/ f .


Functions that have an inverse can be used to “count” the number of elements in finite and infinite
sets. Intuitively, here is how it works. When we count the elements of a set E we assign a number
n ∈ N to each element of E. This means that we construct a function f from a subset of N to E. For
example, if E contains 3 elements the counting function takes {1, 2, 3} to E. Since we don’t want to
count any element of E more than once, f should be 1-1. We also do not want to miss any element of E and
hence the counting function f must be onto E. This intuition leads us to the following definition.

Definition 1.5.2. Let E be a set.

(i) E is said to be finite if either E = ∅ or there is an integer n ∈ N and a bijection f : {1, 2, 3, . . . , n} → E.
In the latter case we say that the set E has n elements.

(ii) E is said to be countable if there is a bijective function f : N → E.

(iii) E is said to be at most countable if E is finite or countable.

(iv) E is said to be uncountable if E is neither finite nor countable.

The set N itself is countable because the function f : N → N, f (n) = n, is a bijection. The set
S = {2, 4, 6, ...} of all even positive integers is countable because the function f : N → S, f (n) = 2n, is
a bijection. The set Z is countable because the function f : N → Z defined by the formula below is a
bijection.

f(n) = \begin{cases} \dfrac{n}{2}, & \text{if } n \text{ is even}, \\ -\dfrac{n-1}{2}, & \text{if } n \text{ is odd}. \end{cases} \tag{1.1}

This function ‘enumerates’ the integers as shown in the table below.

N   1   2   3   4   5   6   7   ···
    ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓
Z   0   1  -1   2  -2   3  -3   ···
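The enumeration in (1.1) is easy to experiment with. The sketch below implements f together with a candidate inverse; the name `f_inverse` is introduced here for illustration and does not appear in the notes.

```python
def f(n: int) -> int:
    """The map (1.1) from N to Z: n/2 for even n, -(n-1)/2 for odd n."""
    if n % 2 == 0:
        return n // 2
    return -((n - 1) // 2)


def f_inverse(z: int) -> int:
    """Position of the integer z in the enumeration, so that f(f_inverse(z)) == z."""
    return 2 * z if z > 0 else 1 - 2 * z


print([f(n) for n in range(1, 8)])  # prints [0, 1, -1, 2, -2, 3, -3]
```

Checking that the two functions undo each other on a range of values is a finite illustration of why f is a bijection; the general argument is a short case analysis on the parity of n.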

Not every set is countable, for example, open nonempty intervals are uncountable.

Theorem 1.5.3. A nonempty set E is at most countable if and only if there is an onto (surjective) function
f : N → E.


Proof. Suppose first that E is at most countable. If E is finite, then there is a bijection f : {1, 2, . . . , n} → E, and we extend this function
to g : N → E by setting g(i) = f (i) for i ∈ {1, 2, . . . , n} and g(i) = f (1) for i > n. This function g is onto.
If E is countable then by definition there exists a bijection f : N → E, which is clearly onto.
Conversely, suppose we have an onto function f : N → E. Inductively we define a function g by setting
g(1) = f (1) and, for k ≥ 2, g(k) = f (i) where i is the smallest natural number such that

f (i) ∉ {g(1), g(2), . . . , g(k − 1)}.

Two things can happen. If at some point no such i exists, then every value of f already appears among
g(1), . . . , g(k − 1), and since f is onto we know that E = {g(1), . . . , g(k − 1)} is finite.
Otherwise g is defined on all of N. For each element e ∈ E we know f (i) = e for some i, and by the construction of g we have
g( j) = e for some j ≤ i. Hence g is onto, and by construction g is one-to-one, so g is a bijection and E is countable.
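The inductive construction of g amounts to listing the values f(1), f(2), . . . and skipping any value that has already appeared. The following generator is a minimal sketch of that idea; the name `enumerate_without_repeats` and the sample surjection n ↦ ⌊n/2⌋ from N onto N0 are assumptions chosen for illustration.

```python
from itertools import islice


def enumerate_without_repeats(f):
    """Yield g(1), g(2), ... where g lists the values of f, skipping repeats."""
    seen = set()
    i = 1
    while True:
        value = f(i)
        if value not in seen:  # f(i) is not among g(1), ..., g(k-1)
            seen.add(value)
            yield value
        i += 1


def halve(n: int) -> int:
    """A surjection from N onto N0 that takes every positive value twice."""
    return n // 2


first_five = list(islice(enumerate_without_repeats(halve), 5))
print(first_five)  # prints [0, 1, 2, 3, 4]
```

If E is finite the generator never yields again once all of E has been listed, which is the first of the two cases in the proof; `islice` is used here so that only finitely many values are requested.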

Theorem 1.5.4. (Cantor) The open interval (0, 1) is uncountable.

Proof. Let f : N → (0, 1) be any function. Let us write each number f ( j), j ∈ N in decimal notation
(using finite expansion if possible). That is

f (1) = 0.α_{11} α_{12} α_{13} . . . ,

f (2) = 0.α_{21} α_{22} α_{23} . . . ,

and so on for each f ( j). The number α_{ij} represents the j-th digit in the decimal expansion of f (i).
We now show that the function f is not onto the interval (0, 1). We construct a real number x from
this interval not equal to f ( j) for any j ∈ N. We consider

x = 0.β_1 β_2 β_3 . . . ,

where we take β_k = α_{kk} + 1 if α_{kk} ≤ 5 and β_k = α_{kk} − 1 otherwise. Observe that if x = f ( j) it would
have to be true that β_k = α_{jk} for every k = 1, 2, 3, . . . . That however is false, as we defined β_j such that it
is different from α_{jj} . Hence x ≠ f ( j) for any j and therefore f is not onto (0, 1). Since no function
f : N → (0, 1) is onto, Theorem 1.5.3 shows that (0, 1) is not at most countable, i.e., the set (0, 1) is uncountable.
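The diagonal rule can be tried out on a finite table of digits. The sketch below is only an illustration on truncated expansions: the function name `diagonal_digits` and the sample rows are hypothetical, and a finite table cannot of course capture the full infinite argument.

```python
def diagonal_digits(rows):
    """Given digit strings (row k holds the digits of the k-th number),
    return the first len(rows) digits of x built by the rule in the proof."""
    beta = []
    for k, row in enumerate(rows):
        alpha_kk = int(row[k])  # the k-th digit of the k-th number
        beta.append(alpha_kk + 1 if alpha_kk <= 5 else alpha_kk - 1)
    return "".join(str(d) for d in beta)


rows = ["5000", "3333", "1415", "9999"]  # digits after the decimal point
x = diagonal_digits(rows)
print("0." + x)  # prints 0.6428
```

Note that the rule only ever produces digits between 1 and 8, so x never ends in repeating 0s or 9s and has a unique decimal expansion.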

Proposition 1.5.5. If S1 and S2 are countable sets then S1 ∪ S2 is countable.

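Proof. If S1 and S2 are countable, then there exist two bijections f1 : N → S1 and f2 : N → S2 . Define
f : N → S1 ∪ S2 by

f(n) = \begin{cases} f_1\left(\dfrac{n+1}{2}\right), & \text{if } n \text{ is odd}, \\ f_2\left(\dfrac{n}{2}\right), & \text{if } n \text{ is even}. \end{cases} \tag{1.2}
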


We claim that f is a surjection. Indeed, if y ∈ S1 ∪ S2 , then y ∈ S1 or y ∈ S2 . If y ∈ S1 , then y = f1 (k) for some
positive integer k, therefore y = f (2k − 1). If y ∈ S2 , then y = f2 (k) for some positive integer k, therefore
y = f (2k).
By Theorem 1.5.3, the set S1 ∪ S2 is at most countable, and since it isn’t finite (it contains the infinite set S1 ), it is countable.
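The interleaving map (1.2) can be sketched directly. In the example below the sets S1 (positive even integers) and S2 (positive multiples of 3) and the helper name `interleave` are hypothetical choices; they are picked so that the two sets overlap, which shows why f is in general only a surjection and why Theorem 1.5.3 is needed to finish the argument.

```python
def interleave(f1, f2):
    """Build the map (1.2): odd n go through f1, even n go through f2."""
    def f(n: int):
        if n % 2 == 1:
            return f1((n + 1) // 2)
        return f2(n // 2)
    return f


def evens(k: int) -> int:
    return 2 * k  # bijection from N onto the positive even integers


def threes(k: int) -> int:
    return 3 * k  # bijection from N onto the positive multiples of 3


f = interleave(evens, threes)
print([f(n) for n in range(1, 9)])  # prints [2, 3, 4, 6, 6, 9, 8, 12]
```

The value 6 appears twice because it lies in both sets, so f is onto S1 ∪ S2 but not one-to-one.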

Theorem 1.5.6. Suppose A, B are sets.

(i) If A ⊂ B and B is at most countable, then A is at most countable.

(ii) If A ⊂ B and A is uncountable, then B is uncountable.

(iii) The set of real numbers R is uncountable.

Theorem 1.5.7. Let A1 , A2 , A3 , . . . be at most countable sets.

(i) A1 × A2 is at most countable.



(ii) E = ⋃_{j=1}^{∞} A_j = {x : x ∈ A_j for some j ∈ N} is at most countable.

Remark 1.5.8. The sets Z and Q are countable, whereas the set of all irrational numbers is uncountable.

Definition 1.5.9. Let X,Y be two sets and f : X → Y . The image of a set E ⊂ X under f is the set

f (E) = {y ∈ Y : y = f (x) for some x ∈ E}.

The inverse image of a set E ⊂ Y under f is the set

f^{-1} (E) = {x ∈ X : there is y ∈ E such that f (x) = y}.

This definition makes sense whether or not f is 1-1 or onto.

1.6 Mathematical proof


What is a mathematical proof? Every mathematical statement has a list of hypotheses and a conclusion.
A mathematical proof is a sequence of logical steps that show from the hypotheses of the statement and
axioms (in our case axioms of real numbers) that the conclusion must be true.
There are three main methods of proof: mathematical induction (which we have already encountered),
direct deduction and proof by contradiction.


The proof by direct deduction goes as follows. We assume the hypotheses of the statement are true
and proceed step by step to the conclusion. Each step is justified by hypothesis, one of the axioms or a
mathematical result that has already been proved.
The proof by contradiction has the following construction. We assume the hypotheses of the statement
we want to establish are true and that the conclusion we want to establish is false. Then we work
step by step (like when doing direct deduction) until we obtain a statement that is obviously false. At
this point we are done and using mathematical logic we can deduce that the conclusion we wanted to
establish must be true (since assuming the opposite leads us to a contradiction).

Example 1.6.1. Prove that a ≠ 0 implies that a^2 > 0.

Proof. By the Trichotomy Property either a > 0 or a < 0.

Case 1: We first consider the case when a > 0. Then by the first Multiplicative property, if we
multiply both sides of the inequality 0 < a by a we obtain 0 = 0 · a < a · a = a^2.
Case 2: If a < 0 then −a > 0 and by Case 1 we have (−a)^2 > 0. It was an earlier exercise to show
−a = (−1) · a. Further observe that

0 = (−1) · 0 = (−1)((−1) + 1) = (−1)^2 + (−1)

so that (−1)^2 is the additive inverse of −1, i.e. (−1)^2 = 1. Now

(−a)^2 = (−1)^2 · a^2 = a^2

so again a^2 > 0.

Example 1.6.2 (Bernoulli’s inequality). For any x ∈ R with x ≥ −1 and any n ∈ N we have

(1 + x)^n ≥ 1 + nx.

Proof. See Assignment 1.

The following is an example of a correct proof by contradiction in a situation where a proof by
contradiction is not needed.

Example 1.6.3. If a non-empty set E ⊂ R is bounded above then E has a unique supremum s ∈ R.

Proof. Since E is bounded above, the completeness axiom ensures the existence of a supremum. Sup-
pose we have two distinct suprema for E, called s and s′ . Both s and s′ are upper bounds for E. Since
s is a supremum and s′ is an upper bound we have that s ≤ s′ . In the same way, s′ is a supremum and s
is an upper bound so s′ ≤ s, and together s = s′ . This contradicts our assumption that there are distinct
suprema.


The structure of the proof above is as follows:

1. We aim to prove statement A.

2. Assume that statement A is false.

3. Prove statement A directly, i.e., statement A is true.

4. This contradicts our assumption that statement A is false, and therefore statement A is not false,
i.e., it is true.

We will present a ‘fixed’ proof.

Proof. Since E is bounded above, the completeness axiom ensures the existence of a supremum.
Suppose s and s′ are suprema for E. Both s and s′ are upper bounds for E. Since s is a supremum and
s′ is an upper bound we have that s ≤ s′. In the same way, s′ is a supremum and s is an upper bound so
s′ ≤ s, and together s = s′. Thus the supremum is unique.

The difference is somewhat minor as the work goes into proving the statement directly. The main
difference is that one proof assumes that we can find two distinct values and derives a contradiction, while
the other allows more freedom: it takes any two values and proves that they are equal.
Our proof of the approximation property in Remark 1.3.5 is an example of a ‘proper’ proof by
contradiction. Another statement which has a classic proof by contradiction is the following, but the
proof is left as an exercise.

Exercise 1.6.4. Show that √2 is an irrational number. Hint: Use proof by contradiction. Assume that
√2 is a rational number.

Exercise 1.6.5. Prove that neither C nor Z_p can be ordered.

Additional exercises 3. Exercises 1.8 and 1.9 from [2] will give you more practice using induction and
you will also gain experience with some useful bounds.


2 Real sequences
2.1 Introduction
An infinite sequence is a function whose domain is N. A sequence whose terms are x_n = f(n), for some
function f : N → R, will be denoted by x_1, x_2, x_3, . . . , or (x_n)_{n∈N} or (x_n)_{n=1}^∞ or just (x_n).
For example, the sequence 1, 2, 3, 4, . . . shall be written as (n)n∈N .
Note that a sequence (xn )n∈N should not be confused with the set {xn : n ∈ N}. These are entirely
different concepts. Note further that (xn ) is a sequence but that xn is the nth term of a sequence.

Definition 2.1.1 ([2, Definition 7.1]). A sequence of real numbers (xn ) is said to converge to a real
number a if for every ε > 0 there is N ∈ N (N usually depends on ε) such that

for all n ≥ N we have that |xn − a| < ε.

We shall use the following phrases and notations interchangeably;

• (xn ) converges to a,

• xn converges to a as n goes to infinity,

• xn → a as n → ∞,

• lim_{n→∞} x_n = a,

• the limit of (xn ) exists and is equal to a.

If a sequence does not converge we say that the sequence diverges.

In general, to show convergence we must start with an arbitrary ε and find a sufficiently large N ∈ N.
Furthermore, the constraint that N is a natural number is unnecessary and is not used in [2]. From the
Archimedean principle we know that for each real number r we can find a natural number N such that
N > r. With this we can easily swap between the two definitions.
Example 2.1.2. Show that 1/n → 0 as n → ∞.

Proof. Pick any ε > 0. From the Archimedean principle it follows that there exists N ∈ N such that
N > 1/ε, or equivalently 1/N < ε. It follows that if n ≥ N then

|1/n − 0| = 1/n ≤ 1/N < ε.

Hence the limit of this sequence is 0.


Example 2.1.3 ([2, Example 8.4]). The sequence ((−1)^n)_{n∈N} has no limit, that is, the sequence diverges.

Proof. The main idea here is that the terms of the sequence are always ‘far’ apart. No matter how far
out in the sequence we go we can always find terms which are a distance of 2 apart. If these were
both within ε of some limit then they would be less than 2ε apart, which suggests that we reach a
contradiction if ε ≤ 1.
Suppose that (−1)^n → a ∈ R as n → ∞. Then for ε = 1 there is N ∈ N such that for n ≥ N we have:

|(−1)^n − a| < 1.

This will give two different inequalities depending on whether n is even or odd. If n is even we have
|1 − a| < 1 and for n odd |−1 − a| = |1 + a| < 1. Hence,

2 = |(1 − a) + (1 + a)| ≤ |1 − a| + |1 + a| < 1 + 1 = 2.

This is a contradiction, hence the sequence does not converge to any a ∈ R.


Example 2.1.4. Show that the sequence ((2n+1)/(3n+2))_{n∈N} converges to 2/3.

We start with the rough work. To show this from the definition we start with an arbitrary ε > 0 and
find an N_ε such that |(2n+1)/(3n+2) − 2/3| < ε for all n > N_ε. Let’s explore this.

\[
\left| \frac{2n+1}{3n+2} - \frac{2}{3} \right|
= \left| \frac{2n+1}{3n+2} - \frac{2}{3} \cdot \frac{n+\frac{2}{3}}{n+\frac{2}{3}} \right|
= \left| \frac{2n+1-\left(2n+\frac{4}{3}\right)}{3n+2} \right|
= \left| \frac{-1}{9n+6} \right|
= \frac{1}{9n+6}
\]

So we want 1/(9n+6) < ε, which we can rearrange to get 1/ε < 9n + 6, so n > 1/(9ε) − 2/3. This would work but it’s
also a little messy. This is an optimal value for N_ε and that’s not needed. In other cases such optimal
values would be difficult to calculate, so we’re better off bounding |x_n − a| by something easier.
One approach here is to bound as follows.

\[ \frac{1}{9n+6} < \frac{1}{9n} < \varepsilon \quad \text{if } n > \frac{1}{9\varepsilon} \]

Aside: Note that if we have something like 1/(7n−3) then we could proceed as

\[ \frac{1}{7n-3} < \frac{1}{3n} < \varepsilon \quad \text{if } n > \frac{1}{3\varepsilon} \]

where the first inequality follows as n ∈ N so 7n − 3 > 3n (i.e. 4n − 3 > 0).
Back to the problem at hand, we now have what we need to write a proof.


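Proof. We claim that for any ε > 0, if n > 1/(9ε) then |(2n+1)/(3n+2) − 2/3| < ε, and hence the sequence
((2n+1)/(3n+2))_{n∈N} converges to 2/3.
In the following we simplify the difference |(2n+1)/(3n+2) − 2/3|, and use n > 1/(9ε) in the final inequality.

\[
\left| \frac{2n+1}{3n+2} - \frac{2}{3} \right|
= \left| \frac{2n+1}{3n+2} - \frac{2}{3} \cdot \frac{n+\frac{2}{3}}{n+\frac{2}{3}} \right|
= \left| \frac{2n+1-\left(2n+\frac{4}{3}\right)}{3n+2} \right|
= \left| \frac{-1}{9n+6} \right|
= \frac{1}{9n+6}
< \frac{1}{9n}
< \frac{1}{9 \cdot \frac{1}{9\varepsilon}}
= \varepsilon
\]

Hence
\[ \left| \frac{2n+1}{3n+2} - \frac{2}{3} \right| < \varepsilon \quad \text{if } n > \frac{1}{9\varepsilon}, \]
establishing our claim.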
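
The choice of threshold in this proof can be spot-checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `x` and `threshold` are illustrative, and a finite check of this kind only supports the argument, it does not replace the proof.

```python
from fractions import Fraction
from math import floor

def x(n):
    """n-th term of the sequence (2n+1)/(3n+2), as an exact fraction."""
    return Fraction(2 * n + 1, 3 * n + 2)

def threshold(eps):
    """Smallest natural number N with N > 1/(9*eps), as used in the proof."""
    return floor(1 / (9 * eps)) + 1

limit = Fraction(2, 3)
for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    N = threshold(eps)
    # Spot-check |x_n - 2/3| < eps on a finite range of n >= N.
    ok = all(abs(x(n) - limit) < eps for n in range(N, N + 1000))
    print(eps, N, ok)
```

Exact rational arithmetic is used so that the strict inequality is not blurred by floating-point rounding.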
Example 2.1.5. The sequence (x_n) with x_n = √(n+1) − √n converges to 0.

Proof. Here our intuition ought to agree with the claim. As n gets large the difference between √(n+1)
and √n decreases. But of course this is far from sufficient to prove the claim.
We want to show that for any ε > 0 there exists a natural number N such that √(n+1) − √n < ε whenever
n ≥ N. Here we rewrite the terms as

\[
\sqrt{n+1} - \sqrt{n}
= \frac{(\sqrt{n+1} - \sqrt{n})(\sqrt{n+1} + \sqrt{n})}{\sqrt{n+1} + \sqrt{n}}
= \frac{1}{\sqrt{n+1} + \sqrt{n}}
< \frac{1}{\sqrt{n}}.
\]

The last bound above can be used to prove the claim by taking N > 1/ε^2.

Remark 2.1.6. A sequence has at most one limit.

Lemma 2.1.7. Let (xn ) be a sequence of real numbers. If limn→∞ xn exists then so does limn→∞ |xn | and
limn→∞ |xn | = | limn→∞ xn |.

Proof. If L = lim_{n→∞} x_n then we know that for any ε > 0 there exists an N ∈ N such that |x_n − L| < ε
whenever n ≥ N. Considering now the sequence (|x_n|), we take the same ε and N and note that
||x_n| − |L|| ≤ |x_n − L| < ε for any n ≥ N. We conclude that lim_{n→∞} |x_n| = |L| as required. Here the
first inequality is the triangle inequality from Theorem 1.2.3, and the second inequality comes from the
convergence of (x_n).

The previous lemma is of the form A ⇒ B and it is natural to ask if the converse may be true. It is
not. Consider the sequence x_n = (−1)^n; then |x_n| = 1 and lim_{n→∞} |x_n| = 1, however (x_n) is divergent. We
do however have a partial converse:

\[ \lim_{n\to\infty} x_n = 0 \iff \lim_{n\to\infty} |x_n| = 0, \]


the proof follows from a simple observation of the definitions and it is a useful exercise to write a careful
proof.
Sequences whose limit is 0 will occur many times; we call these null sequences. These will be of
particular importance when we study series in Chapter 3. In the case of null sequences Lemma 2.1.7 can
be strengthened to the following.

Lemma 2.1.8. If (xn ) is a real sequence then limn→∞ xn = 0 if and only if limn→∞ |xn | = 0 .

We introduce the notion of when a sequence is bounded above, bounded below and bounded. We
remark that these notions are identical to the notions of boundedness of the set {xn : n ∈ N} (the notion
of boundedness for sets was introduced in Chapter 1).

Definition 2.1.9. Let (xn ) be a sequence of real numbers.

• (xn )n∈N is said to be bounded above if xn ≤ M for some M ∈ R and all n ∈ N.

• (xn )n∈N is said to be bounded below if xn ≥ m for some m ∈ R and all n ∈ N.

• (xn )n∈N is said to be bounded if it is both bounded above and bounded below.

Theorem 2.1.10 ([2, Theorem 9.1]). Every convergent sequence is bounded.

Proof. Let (xn ) be a sequence which converges to some a ∈ R. Given ε = 1 there is N ∈ N such that
|xn − a| < 1 for all n ≥ N. By the triangle inequality this implies that |xn | < |a| + 1 for n ≥ N. Hence all
terms of the sequence (xn ) with n “large” are bounded. On the other hand

|xn | ≤ max{|x1 |, |x2 |, . . . , |xN |} = M, if 1 ≤ n ≤ N.

Therefore the whole sequence (xn ) is bounded by max{M, 1 + |a|}.

Additional exercises 4. There are many examples of sequences in 7.3. Don’t shy away from writing a
proof of your claims, though as a starting point you can try parts (a), (b), (g) and (n). It would also be
useful to think through 7.5.
These are not far off questions 8.1 and 8.2.

2.2 Limit theorems


Usually, it is a challenge to decide whether a given sequence converges (or diverges). To make this
task easier we often compare a sequence whose convergence is in doubt with another sequence whose
convergence property is already known.


Theorem 2.2.1 (Squeeze Theorem,[2, Exercise 8.5]). Suppose that (xn ), (yn ) and (wn ) are real se-
quences.

• If both xn → a and yn → a (the same a) as n → ∞ and if

xn ≤ wn ≤ yn , for all n ≥ N0 ,

then wn → a as n → ∞.

• If xn → 0 and (yn ) is bounded then the product xn yn → 0 as n → ∞.

Additional exercises 5. The proof of this result is quite accessible and is covered by 8.4 and 8.5.
 
Example 2.2.2. Use the squeeze theorem to show that the sequence (1/(n^2+n))_{n∈N} is convergent.

Proof. Recall that 1/n → 0 as n → ∞. Now since 0 ≤ 1/n^2 ≤ 1/n the sequence (1/n^2) must converge to 0 also.
Furthermore,

\[ \frac{1}{2n^2} = \frac{1}{n^2+n^2} \le \frac{1}{n^2+n} \le \frac{1}{n^2}, \quad \text{for all } n \ge 1. \]

Since both (1/n^2) and (1/(2n^2)) converge to 0 the result follows. We have that

\[ \lim_{n\to\infty} \frac{1}{n^2+n} = 0. \]

Theorem 2.2.3. Let E ⊂ R. If E has a finite supremum, i.e., E is non-empty and bounded above, then
there is a sequence (xn ) with each xn ∈ E such that xn → sup E as n → ∞. An analogous statement holds
if E has a finite infimum (i.e., E is non-empty and bounded below).

Proof. We construct a sequence as follows. For each n ∈ N we use the definition of supremum (in fact
we use the approximation property) to find xn ∈ E such that

sup E − 1/n < xn ≤ sup E.

Hence by the squeeze theorem such xn → sup E as n goes to infinity.

Theorem 2.2.4 ([2, Theorems 9.2-9.6]). Suppose that (xn ), (yn ) are real sequences and α ∈ R. If both
(xn ), (yn ) are convergent then

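(i)
\[ \lim_{n\to\infty} (x_n + y_n) = \lim_{n\to\infty} x_n + \lim_{n\to\infty} y_n. \]

(ii)
\[ \lim_{n\to\infty} (\alpha x_n) = \alpha \lim_{n\to\infty} x_n. \]

(iii)
\[ \lim_{n\to\infty} (x_n \cdot y_n) = \left( \lim_{n\to\infty} x_n \right) \cdot \left( \lim_{n\to\infty} y_n \right). \]

(iv) If in addition lim_{n→∞} y_n ≠ 0 and y_n ≠ 0 for all n ∈ N then
\[ \lim_{n\to\infty} \frac{x_n}{y_n} = \frac{\lim_{n\to\infty} x_n}{\lim_{n\to\infty} y_n}. \]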

Example 2.2.5. Calculate lim_{n→∞} (n^3 + n^2 − 1)/(1 − 3n^3).

Solution: Dividing both the numerator and denominator by n^3 we get

\[ \frac{n^3 + n^2 - 1}{1 - 3n^3} = \frac{1 + (1/n) - (1/n^3)}{(1/n^3) - 3}. \]

Since 1/n^k → 0 as n → ∞ for any k ∈ N (by the squeeze theorem and our knowledge of (1/n)_{n∈N}) we
have by the previous Theorem:

\[ \lim_{n\to\infty} \frac{n^3 + n^2 - 1}{1 - 3n^3} = \frac{1 + 0 - 0}{0 - 3} = -\frac{1}{3}. \]
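
As a sanity check on this limit, the terms can be evaluated exactly for a few values of n. This is an illustrative sketch only; the function name `term` is an assumption introduced here, and the sampled values merely suggest the limit that the limit theorems establish.

```python
from fractions import Fraction

def term(n):
    """Exact value of (n^3 + n^2 - 1) / (1 - 3 n^3)."""
    return Fraction(n**3 + n**2 - 1, 1 - 3 * n**3)

target = Fraction(-1, 3)
for n in (1, 10, 100, 1000):
    gap = abs(term(n) - target)
    # The gap to -1/3 should shrink as n grows.
    print(n, float(term(n)), float(gap))
```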

It is convenient to introduce the notation xn → +∞ and xn → −∞. Note that such sequences DO NOT
converge; they diverge!

Additional exercises 6. Get some practice with this by trying 9.1, 9.2 and 9.3.

Definition 2.2.6 ([2, Definition 9.8]). Let (xn ) be a sequence of real numbers.

(i) (x_n) is said to diverge to +∞ (we write x_n → +∞ as n → ∞ or lim_{n→∞} x_n = +∞) if for each M ∈ R
there is N ∈ N such that for all n ≥ N we have x_n > M.

(ii) (x_n) is said to diverge to −∞ (notation x_n → −∞ as n → ∞ or lim_{n→∞} x_n = −∞) if for each m ∈ R there
is N ∈ N such that for all n ≥ N we have x_n < m.

We note that the previous theorems can be extended to infinite limits. For example, we can generalize
Theorem 2.2.4 to incorporate sequences which diverge to ±∞; part (i) has the following
generalization.


Theorem 2.2.7. Suppose that (x_n), (y_n) are real sequences. If both lim_{n→∞} x_n and lim_{n→∞} y_n exist and
belong to the set of extended real numbers R∗, then

\[ \lim_{n\to\infty} (x_n + y_n) = \lim_{n\to\infty} x_n + \lim_{n\to\infty} y_n, \]

provided the forbidden algebraic operation ∞ − ∞ (or its commutative analogue −∞ + ∞) does not occur.

There are similar statements for parts (ii)-(iv) of Theorem 2.2.4.

Theorem 2.2.8 (Comparison theorem for sequences). Suppose that (x_n), (y_n) are real sequences. If both
lim_{n→∞} x_n and lim_{n→∞} y_n exist (and belong to the set of extended real numbers R∗) and if

x_n ≤ y_n for all n ≥ N for some N ∈ N

then
\[ \lim_{n\to\infty} x_n \le \lim_{n\to\infty} y_n. \]

Proof. We limit ourselves here to the case lim_{n→∞} x_n, lim_{n→∞} y_n ∈ R. Assume by contradiction that

\[ \varepsilon = \lim_{n\to\infty} x_n - \lim_{n\to\infty} y_n > 0. \]

It follows from the definition of the limit of a sequence that there are natural numbers N_1, N_2 such that

\[ \left| \left( \lim_{n\to\infty} x_n \right) - x_k \right| < \varepsilon/2, \quad \text{for all } k \ge N_1 \]

and

\[ \left| \left( \lim_{n\to\infty} y_n \right) - y_k \right| < \varepsilon/2, \quad \text{for all } k \ge N_2. \]

Hence

\[ \varepsilon = \lim_{n\to\infty} x_n - \lim_{n\to\infty} y_n = \left[ \left( \lim_{n\to\infty} x_n \right) - x_k \right] + (x_k - y_k) + \left[ y_k - \left( \lim_{n\to\infty} y_n \right) \right]. \]

If we take k ≥ max{N, N_1, N_2} we have that x_k − y_k ≤ 0 by our assumption and hence by the triangle
inequality:

\[ \varepsilon \le \left| \left( \lim_{n\to\infty} x_n \right) - x_k \right| + \left| y_k - \left( \lim_{n\to\infty} y_n \right) \right| < \varepsilon/2 + \varepsilon/2 = \varepsilon, \]

which is a contradiction. Hence our claim holds.

Remark 2.2.9. In the last theorem if we had x_n < y_n in place of x_n ≤ y_n then we could still only conclude
lim_{n→∞} x_n ≤ lim_{n→∞} y_n. Consider the sequences x_n = 0 and y_n = 1/n: clearly 0 < 1/n for all n ∈ N but the
limits of (x_n) and (y_n) are equal.

Additional exercises 7. Exercises 9.9 and 9.10 cover some related cases and give useful practice for
dealing with sequences which diverge to infinity.


2.3 Monotone sequences


We begin with a special case about monotone sequences.

Definition 2.3.1 ([2, Definition 10.1]). Let (xn ) be a sequence of real numbers.

(i) (xn ) is said to be increasing (respectively strictly increasing) if x1 ≤ x2 ≤ x3 ≤ . . . (x1 < x2 < x3 <
. . . for strictly increasing).

(ii) (xn ) is said to be decreasing (respectively strictly decreasing) if x1 ≥ x2 ≥ x3 ≥ . . . (x1 > x2 > x3 >
. . . for strictly decreasing).

(iii) (xn ) is said to be monotone if it is either increasing or decreasing.

The first observation we make is that monotone bounded sequences are convergent.

Theorem 2.3.2 (On monotone convergence, [2, Theorem 10.2]). If (xn ) is increasing and bounded above
or if it is decreasing and bounded below, then (xn ) is convergent (and converges to the supremum/infimum
of the set {xn | n ∈ N} respectively).

Proof. Suppose that (xn ) is increasing and bounded above. By the Completeness Axiom the supremum
a = sup{xn : n ∈ N} exists and is finite. Let ε > 0. By the definition of the supremum there exists N ∈ N
such that
a − ε < xN ≤ a.
Since (xn ) is increasing xN ≤ xn for all n ≥ N. It follows that a − ε < xn ≤ a for all n ≥ N. Hence
|xn − a| < ε from which convergence follows. The proof in the case that (xn ) is decreasing and bounded
below is analogous and left as an exercise.

Example 2.3.3. Show that a^{1/n} → 1 as n → ∞, provided a > 0.

Hint of a solution. We shall only consider the case a > 1. Observe that (a^{1/n}) is decreasing. Indeed,
a^{n+1} > a^n. Taking the n(n + 1)st root then yields a^{1/n} > a^{1/(n+1)}. Since a > 1 we also have a^{1/n} > 1 so
the sequence (a^{1/n}) is bounded below. Thus by the previous theorem this sequence has a limit (and the
limit is ≥ 1). Denote this limit by L. We see that

\[ L^2 = \left( \lim_{n\to\infty} a^{1/(2n)} \right)^2 = \lim_{n\to\infty} a^{1/n} = L. \]

Here we have used material around Remark 2.4.2 in the next section. In particular we use that the limit
of (a^{1/(2n)}) is L since it is a subsequence of (a^{1/n}). Now L^2 = L implies that either L = 0 or L = 1. Since
L ≥ 1 it must be that L = 1.
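
The two properties used in this hint (decreasing and bounded below by 1) can be observed numerically. The sketch below is an illustration under assumptions: the value a = 5 and the helper name `roots` are arbitrary choices made here, and floating-point arithmetic is used, so it indicates the behaviour rather than proving it.

```python
def roots(a, count):
    """Return [a**(1/1), a**(1/2), ..., a**(1/count)] for a > 1."""
    return [a ** (1.0 / n) for n in range(1, count + 1)]

values = roots(5.0, 2000)
# The sequence should be decreasing and bounded below by 1.
decreasing = all(u > v for u, v in zip(values, values[1:]))
bounded_below = all(v > 1.0 for v in values)
print(decreasing, bounded_below, values[-1])
```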


Additional exercises 8. Exercise 10.1 should be simple if you have a fair understanding of the defini-
tions. Exercise 10.10 will take a bit more work and gives a good example of how this theory is used.

Indeed if a monotone sequence is unbounded then we have a limit in the extended real numbers.

Theorem 2.3.4 ([2, Theorem 10.4]). Let (x_n) be an unbounded monotone sequence of real numbers.

(i) If (xn ) is an unbounded increasing sequence then limn→∞ xn = ∞.

(ii) If (xn ) is an unbounded decreasing sequence then limn→∞ xn = −∞.

Proof. We prove only the first part, with the second being similar. First observe that any increasing
sequence is bounded below by its first term so that (xn ) unbounded implies that (xn ) is not bounded
above.
For any M ∈ R there exists N ∈ N such that x_N > M, but then for any n ≥ N we have that
M < x_N ≤ x_n since the sequence is increasing. This, however, is the definition of lim_{n→∞} x_n = ∞ so our
claim is proven.

These last two theorems together show that for any monotone sequence, the limit always exists as an
extended real number.

2.4 Subsequences
We introduce a notion of subsequence. Informally, a subsequence is obtained from x_1, x_2, x_3, . . . by
“deleting” some of the x_n; infinitely many must remain, and these must fall in the same order as before. For
example if the original sequence is ((−1)^n) then by deleting every odd term the resulting subsequence is
1, 1, 1, . . . . We always need a formal definition.

Definition 2.4.1 ([2, Definition 11.1]). By a subsequence of a sequence (x_n)_{n∈N} we shall mean a
sequence of the form x_{n_1}, x_{n_2}, x_{n_3}, . . . (written shortly as (x_{n_k})_{k∈N}) where n_1 < n_2 < n_3 < . . . is an
increasing sequence of natural numbers.

Remark 2.4.2. Any subsequence of a convergent sequence is also convergent and has the same limit.
Indeed every sequence is a subsequence of itself, see [2, Theorem 11.3]. Subsequences of divergent
sequences can behave in a greater variety of ways.

Theorem 2.4.3 ([2, Theorem 11.2]). Let (xn ) be a sequence of real numbers.

• There exists t ∈ R such that for any ε > 0 there exist infinitely many n ∈ N for which |x_n − t| < ε,
if and only if there exists a subsequence of (x_n) converging to t.


• The sequence (xn ) is not bounded above (below) if and only if there exists a subsequence diverging
to ∞ (diverging to −∞).

Proof. Assume, to begin, that (x_n) has a subsequence (x_{n_k}) converging to some t ∈ R. Then by definition
we know that for every ε > 0 there exists some K ∈ N such that |x_{n_k} − t| < ε for all k ≥ K. There are
thus infinitely many such x_{n_k} and these are all terms of the sequence (x_n) and so we are done. That was
the easy part.
Now we must construct a subsequence converging to t using our assumption that for any ε > 0 there
exist infinitely many n ∈ N such that |x_n − t| < ε. What this condition means for us is that no matter how
far we go along the sequence we have only gone past finitely many terms, so that there are still infinitely
many terms which are within ε of t.
We inductively define a subsequence as follows. Let n_1 be the least natural number such that
|x_{n_1} − t| < 1; such an n_1 exists by hypothesis (and any non-empty set of natural numbers has a
minimal element). We then define n_k, for k > 1, to be the smallest natural number greater than n_{k−1} for which
|x_{n_k} − t| < 1/k. That such a term exists is once again a result of our hypothesis.
We then need to show that the subsequence we constructed actually converges to t. Let ε > 0.
Then there exists K ∈ N with 0 < 1/K < ε, and so |x_{n_k} − t| < 1/k ≤ 1/K < ε for all k ≥ K, which proves that
the subsequence converges.
The second bullet point is somewhat similar, especially if we observe that ‘not bounded above’ is
equivalent to the existence, for any M ∈ R, of infinitely many n ∈ N such that xn > M. The proof is left
as an exercise, but you can see [2, Theorem 11.2] for an alternative approach.
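
The inductive construction in the first part of the proof can be written out as a short procedure. The following is a sketch only: the function name and the sample sequence x_n = (−1)^n (1 + 1/n) are assumptions chosen for illustration and do not come from the notes. The search loop terminates only when terms within 1/k of t keep occurring, which is exactly the hypothesis of the theorem.

```python
def convergent_subsequence_indices(x, t, count):
    """Indices n_1 < n_2 < ... < n_count with |x(n_k) - t| < 1/k.

    Follows the inductive construction in the proof: n_k is the least
    index greater than n_{k-1} whose term lies within 1/k of t.
    """
    indices = []
    n = 0
    for k in range(1, count + 1):
        n += 1  # only look beyond the previously chosen index
        while abs(x(n) - t) >= 1.0 / k:
            n += 1
        indices.append(n)
    return indices

# Illustrative sequence (an assumption, not from the notes):
# x_n = (-1)^n * (1 + 1/n), whose even-indexed terms cluster around t = 1.
def x(n):
    return (-1) ** n * (1 + 1.0 / n)

print(convergent_subsequence_indices(x, 1.0, 5))
```

For this sample sequence the odd-indexed terms are near −1, so the procedure skips them and selects even indices.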

Theorem 2.4.4 ([2, Theorem 11.3]). If the sequence (xn ) converges, then every subsequence converges
to the same limit.


3 Infinite series of real numbers


3.1 Introduction
Series are one of the most used tools of analysis. We can use series to define functions (ln, sin, cos, exp,
...) and use the partial sums to approximate the values of functions at given real numbers. A series is a
formal expression of the form

\[ \sum_{k=1}^{\infty} a_k, \]

where (a_k) is a sequence of (real) numbers. For example ∑_{n=1}^∞ n! is such a formal sum. We will, of
course, be interested in questions about the convergence of series and it should be reasonable to expect that
this example does not converge.

Definition 3.1.1 ([2, Section 14.2]). Let S = ∑_{k=1}^∞ a_k be an infinite series with terms a_k. For each n the
partial sum of S of order n is defined by

\[ s_n = \sum_{k=1}^{n} a_k. \]

The series S is said to converge if and only if its sequence of partial sums (s_n) converges to some
s ∈ R as n → ∞. That is, for any ε > 0 there exists N ∈ N such that if n ≥ N we have

\[ |s_n - s| = \left| \sum_{k=1}^{n} a_k - s \right| < \varepsilon. \]

In this case we shall write

\[ \sum_{k=1}^{\infty} a_k = s \]

and call the number s the sum or value of the series ∑_{k=1}^∞ a_k.
The series S is said to diverge if its sequence of partial sums (s_n) diverges as n → ∞. This can mean
s_n → ±∞ or that the limit is not defined (in either case the limit is not a real number). When (s_n) diverges
to ∞ (or −∞) we shall write

\[ \sum_{k=1}^{\infty} a_k = \infty \quad \left( \text{or } \sum_{k=1}^{\infty} a_k = -\infty \right). \]

We have already encountered one type of infinite convergent series, namely decimal expansions. We
write x ∈ [0, 1) in the form

x = 0.a1 a2 a3 . . . , where ai ∈ {0, 1, 2, . . . , 9}


and we mean by that

\[ x = \sum_{k=1}^{\infty} \frac{a_k}{10^k}. \]
For example 1/3 = 0.33333 . . . . The partial sums are s1 = 0.3, s2 = 0.33, s3 = 0.333, . . . . These are
approximations of 1/3 and they get closer to 1/3 as more terms of the decimal expansion are taken.

The main question we shall consider in this chapter is how to determine whether a given series
converges or diverges.

One way to determine if a given series converges is to find a formula for its partial sums. In most
cases, this is not possible but there are a few simple examples where it works.

Example 3.1.2. Show that ∑_{k=1}^∞ 2^{−k} = 1.

Solution: This series is simple enough that we can show that the partial sums are s_n = ∑_{k=1}^n 2^{−k} = 1 − 2^{−n}
for n ∈ N. Thus s_n → 1 as n → ∞.

Example 3.1.3. The previous example is a special case of a more general geometric series. For a, r ∈ R
a geometric series with common ratio r is

\[ \sum_{k=0}^{\infty} a r^k. \]

To determine the convergence of this series we examine the partial sums s_n, the sum of the first n
terms. Although this is a finite sum it is hard to investigate how it behaves as n tends to infinity, as
the number of terms increases. In the case of a geometric series there is a classic means to simplify:
r s_n has many terms in common with s_n, so s_n − r s_n has a fixed number of terms.

\begin{align*}
s_n &= a + ar + ar^2 + \cdots + ar^{n-1} \\
r s_n &= ar + ar^2 + \cdots + ar^{n-1} + ar^n
\end{align*}

Indeed we have s_n − r s_n = a − a r^n, and if r ≠ 1 we see s_n = a(1 − r^n)/(1 − r). This converges as long as it is defined
and r^n converges, i.e., if |r| < 1, and in this case the series converges to a/(1 − r). We also need to consider the
case r = 1: now s_n = a + a + a + · · · + a = na, and this converges only if a = 0.
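
The closed form for the partial sums can be compared against direct summation. This is a minimal sketch, assuming the illustrative values a = 3 and r = 1/2 (chosen here, not taken from the notes) and using exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def partial_sum(a, r, n):
    """s_n = a + a r + ... + a r^(n-1), summed term by term."""
    return sum(a * r**k for k in range(n))

def closed_form(a, r, n):
    """a (1 - r^n) / (1 - r), valid for r != 1."""
    return a * (1 - r**n) / (1 - r)

a, r = Fraction(3), Fraction(1, 2)
for n in (1, 5, 20):
    # The two expressions for s_n should agree exactly.
    assert partial_sum(a, r, n) == closed_form(a, r, n)
    print(n, float(partial_sum(a, r, n)))
print("a/(1-r) =", a / (1 - r))
```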

Example 3.1.4. Be careful not to treat every series as though it converges. From the previous example
we know that ∑_{n=0}^∞ 2^n diverges. This is also intuitively clear since the partial sums are growing by a
greater amount with each additional term. However, if we assumed, in error, that the series converges,
so that ∑_{n=0}^∞ 2^n = S ∈ R, we might go further and claim that 2S = ∑_{n=0}^∞ 2 · 2^n = ∑_{n=1}^∞ 2^n = S − 1 and from
this conclude that S = −1, but of course it is not. Using our definition we know that the series diverges.


Example 3.1.5. Show that ∑_{k=1}^∞ (−1)^k diverges.

Solution: We have for the partial sums s_n:

\[
s_n = \begin{cases} -1, & \text{if } n \text{ is odd}, \\ 0, & \text{if } n \text{ is even}. \end{cases}
\]

Thus (sn ) does not converge as n → ∞.

Sometimes we are not able to calculate the partial sums exactly, but we are able to estimate
them and the estimate is enough to determine convergence or divergence.

Example 3.1.6 (Harmonic series). Show that the series ∑_{k=1}^∞ 1/k diverges. Note that the sequence (1/k)
converges. This is the most common mistake, to confuse convergence of the sequence of terms of a
series with the convergence of the series itself. These are different things.

Solution: We compare 1/k with the integral ∫_k^{k+1} 1/x dx. Since 1/x ≤ 1/k for x ∈ [k, k + 1], it follows that

\[
s_n = \sum_{k=1}^{n} \frac{1}{k} \ge \sum_{k=1}^{n} \int_{k}^{k+1} \frac{1}{x}\,dx = \int_{1}^{n+1} \frac{1}{x}\,dx = \log(n+1) \to \infty,
\]

as n → ∞.
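
The lower bound s_n ≥ log(n + 1) can be observed for sample values of n. The sketch below is illustrative only; the function name `harmonic` is an assumption, and floating-point sums are used, so it shows the slow growth rather than proving divergence.

```python
from math import log

def harmonic(n):
    """Partial sum s_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 1000, 100000):
    s = harmonic(n)
    # Compare the partial sum with the lower bound log(n + 1).
    print(n, s, log(n + 1), s >= log(n + 1))
```

The partial sums grow without bound, but only logarithmically, which is why divergence is hard to guess from the first few terms.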

The example above shows that the terms of a divergent series might converge to zero. However, if the
terms of a series DO NOT converge to zero the series itself is ALWAYS divergent.

Theorem 3.1.7 (Divergence test). Let (a_k) be a sequence of real numbers. If a_k does not converge to
zero then the series ∑_{k=1}^∞ a_k diverges.

We can state this in another way.

Theorem 3.1.8 ([2, Corollary 14.5]). Let (a_k) be a sequence of real numbers. If the series ∑_{k=1}^∞ a_k
converges then (a_k) converges to 0.


Proof. We assume that the series converges, in other words we assume that the sequence of partial sums
(s_n)_{n∈N} converges to some s ∈ R, and so does (s_{n+1})_{n∈N}. We can see this, for instance, as the second
sequence is a subsequence of the first. Then the sequence (s_{n+1} − s_n)_{n∈N} = (a_{n+1})_{n∈N} converges to
s − s = 0.

This is an important result, but as Example 3.1.6 has already shown it can never be used to establish
convergence, only divergence. If (a_k) converges to zero then this theorem gives no information and the
series ∑_{k=1}^∞ a_k might converge or diverge. Some other method must be used instead. Here are two cases
when the partial sums can be determined explicitly.

Theorem 3.1.9 (Telescopic series). Let (b_k) be a convergent sequence of real numbers with lim_{k→∞} b_k =
b, and let a_k = b_k − b_{k+1} for k ∈ N. Then

\[ \sum_{k=1}^{\infty} a_k = b_1 - b. \]

Note that we can compute the partial sums explicitly:

\[ s_n = \sum_{k=1}^{n} a_k = b_1 - b_{n+1}. \]

Note also that since we know (b_k) converges to b, we have

\[ \lim_{k\to\infty} a_k = \lim_{k\to\infty} (b_k - b_{k+1}) = \lim_{k\to\infty} b_k - \lim_{k\to\infty} b_{k+1} = b - b = 0 \]

as we should expect.

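Example 3.1.10. We show that

\[ \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1. \]

Solution: We can write

\[
\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = \sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k+1} \right) = 1 - \lim_{k\to\infty} \frac{1}{k} = 1.
\]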
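
The telescoping formula s_n = b_1 − b_{n+1}, which here reads s_n = 1 − 1/(n + 1), can be confirmed exactly for small n. This is an illustrative sketch; the function name `s` is an assumption introduced here.

```python
from fractions import Fraction

def s(n):
    """Partial sum of 1/(k(k+1)) for k = 1..n, computed exactly."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 10, 100):
    # Telescoping predicts s_n = b_1 - b_{n+1} = 1 - 1/(n+1).
    assert s(n) == 1 - Fraction(1, n + 1)
    print(n, s(n))
```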


Note that we have so far used 0 or 1 as the starting index for the terms of our series. This is not
significant when it comes to the question of whether or not a series converges, though it can alter the
value of the sum. We return to the geometric series as an illustration.

Theorem 3.1.11 (Geometric series). Let x ∈ R and N ∈ {0, 1, 2, . . . } = N_0. Then the series

\[ \sum_{k=N}^{\infty} x^k \quad \text{converges if and only if } |x| < 1. \]

In this case

\[ \sum_{k=N}^{\infty} x^k = \frac{x^N}{1-x}, \quad |x| < 1. \]

In particular,

\[ \sum_{k=0}^{\infty} x^k = \frac{1}{1-x}, \quad |x| < 1. \]

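Proof. Using mathematical induction we can establish that for x ≠ 1 the partial sum s_n = ∑_{k=N}^{n+N} x^k is
equal to

\[ s_n = x^N \left( \frac{1}{1-x} - \frac{x^{n+1}}{1-x} \right). \]

Taking the limit as n → ∞ we see that this converges if and only if x^{n+1} → 0, that is, if and only if
|x| < 1. For x = 1 the partial sums are s_n = n + 1, which diverge.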

Theorem 3.1.12. Let (a_k) and (b_k) be real sequences and α ∈ R. If the series ∑_{k=1}^∞ a_k and ∑_{k=1}^∞ b_k are
convergent then

\[ \sum_{k=1}^{\infty} (a_k + b_k) = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k \]

and

\[ \sum_{k=1}^{\infty} (\alpha a_k) = \alpha \sum_{k=1}^{\infty} a_k. \]

Proof. This is left as an exercise. Note however that splitting an infinite sum is not generally possible
and we must first write the partial sum. Since the partial sum is finite we can split it.

3.2 Series with nonnegative terms


We focus here on series with nonnegative terms. This simplifies the situation as the sequence of partial
sums becomes an increasing sequence. This allows us to establish several criteria on convergence and
divergence of the series although in general, they won’t tell us what a series converges to.


Theorem 3.2.1. Suppose that a_k ≥ 0 for large k. Then ∑_{k=1}^∞ a_k converges if and only if the sequence of
partial sums (s_n) is bounded. That is, there exists M > 0 such that

\[ \sum_{k=1}^{n} a_k \le M, \quad \text{for all } n \in \mathbb{N}. \]

Proof. If ∑_{k=1}^∞ a_k converges then the sequence of partial sums is convergent and hence bounded.
Conversely, assume that (s_n) is bounded. Due to our assumption, there is N ∈ N such that this sequence is
monotone for n ≥ N, that is s_n ≤ s_{n+1} for all n ≥ N. By the monotone convergence theorem the sequence
(s_n) must be convergent as it is monotone and bounded.

If a_k ≥ 0 for large k we shall introduce the following notation. We write

∑_{k=1}^∞ a_k < ∞ if the series is convergent and

∑_{k=1}^∞ a_k = ∞ if the series is divergent.

Theorem 3.2.2 (Comparison test). Suppose that 0 ≤ a_k ≤ b_k for large k.

• If ∑_{k=1}^∞ b_k < ∞ then ∑_{k=1}^∞ a_k < ∞.

• If ∑_{k=1}^∞ a_k = ∞ then ∑_{k=1}^∞ b_k = ∞.

We won’t cover integration in detail on this course; however, the integral test for series can be quite
useful. The Cauchy condensation test can be used in most cases in which we would apply the integral
test and does not require the additional theory which the integral test does.

Theorem 3.2.3 (Integral test). Suppose that f : [1, ∞) → R is nonnegative and decreasing on [1, ∞). Let
a_k = f(k), k = 1, 2, 3, . . . . Then ∑_{k=1}^∞ a_k = ∑_{k=1}^∞ f(k) converges if and only if the improper integral

\[ \int_{1}^{\infty} f(x)\,dx < \infty. \]

Proof. Due to the monotonicity of the function f we have that

\[ a_{k+1} = f(k+1) \le \int_{k}^{k+1} f(x)\,dx \le f(k) = a_k, \]

for k ∈ N. Let us sum this inequality over all k = 1, 2, . . . , n. We get for the partial sums s_n:

\[ s_n - a_1 \le \sum_{k=2}^{n+1} a_k \le \int_{1}^{n+1} f(x)\,dx \le \sum_{k=1}^{n} a_k = s_n. \]

This inequality implies that (s_n) is bounded if and only if ∫_1^∞ f(x) dx < ∞.

Example 3.2.4 (p-series test). The series

\[ \sum_{k=1}^{\infty} \frac{1}{k^p} \]

converges if and only if p > 1.

Solution using integral test: Set f(x) = x^{−p} on [1, ∞). Then f′(x) = −p x^{−p−1} < 0 for all x ≥ 1 provided
p > 0. Hence f is nonnegative and decreasing on [1, ∞). Since for p ≠ 1

\[
\int_{1}^{\infty} x^{-p}\,dx = \lim_{n\to\infty} \left[ \frac{x^{1-p}}{1-p} \right]_{1}^{n} = \lim_{n\to\infty} \frac{n^{1-p} - 1}{1-p},
\]

the integral has a finite limit if and only if n^{1−p} → 0, that is, p > 1. Note that for p = 1 we get the harmonic
series, and when p ≤ 0 the series diverges trivially as 1/k^p does not converge to 0.

The integral test is a powerful tool but its usefulness is restricted by the condition that f is monotone.

Example 3.2.5. In this example we consider the following two series
$$\sum_{k=1}^{\infty} \left(\sqrt{k+1} - \sqrt{k}\right), \qquad \sum_{k=1}^{\infty} \frac{\sqrt{k+1} - \sqrt{k}}{k}.$$
Note that we have encountered the sequence $(x_k)$ with $x_k = \sqrt{k+1} - \sqrt{k}$ already in Example 2.1.5 where it was shown that $x_k \to 0$ as k → ∞, so the divergence test is inconclusive for the first series. This series is however a telescoping series and the partial sum can be given as $s_n = \sqrt{n+1} - 1$. Thus $(s_n)$ is a divergent sequence and hence the series is divergent.
For the second series there are two obvious comparisons, namely
$$0 < \frac{\sqrt{k+1}-\sqrt{k}}{k} \le \frac{1}{k}, \qquad \frac{\sqrt{k+1}-\sqrt{k}}{k} \le \sqrt{k+1}-\sqrt{k}.$$
However, should we try the comparison test here, both of these comparisons would be inconclusive as the series formed from the terms on the right would be divergent. We use a bound on $\sqrt{k+1}-\sqrt{k}$ as we did in Example 2.1.5 to obtain
$$0 < \frac{\sqrt{k+1}-\sqrt{k}}{k} \le \frac{1}{\sqrt{k^3}}.$$


We can now use the comparison test together with knowledge of p-series (see 3.2.4) to conclude that
the series is convergent.
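Both claims in this example are easy to probe numerically. The following sketch (floating point, so only approximate; the function names are hypothetical) compares the term-by-term partial sum of the first series with the telescoped closed form $\sqrt{n+1} - 1$, and checks the bound $k^{-3/2}$ on the terms of the second series.

```python
import math

def telescoping_partial_sum(n):
    """Sum of sqrt(k+1) - sqrt(k) for k = 1..n, added term by term."""
    return sum(math.sqrt(k + 1) - math.sqrt(k) for k in range(1, n + 1))

def second_series_term(k):
    """k-th term of the second series: (sqrt(k+1) - sqrt(k)) / k."""
    return (math.sqrt(k + 1) - math.sqrt(k)) / k

# The two columns should agree up to rounding, and both grow without bound.
for n in (10, 1000, 100000):
    print(n, telescoping_partial_sum(n), math.sqrt(n + 1) - 1)
```

The growth of the second column makes the divergence of the first series visible, while the term bound is what feeds the comparison with the p-series for p = 3/2.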

Example 3.2.6. Determine whether the series
$$\sum_{k=1}^{\infty} \frac{3k}{k^2+k} \sqrt{\frac{\log k}{k}}$$
is convergent or divergent.

Solution: It is known that for large k the value of log k grows more slowly than any positive power $k^{\varepsilon}$, ε > 0. Because of this we can estimate
$$\frac{3k}{k^2+k}\sqrt{\frac{\log k}{k}} = \frac{1}{k}\cdot\frac{3k}{k+1}\sqrt{\frac{\log k}{k}} \le 3\cdot\frac{1}{k}\sqrt{\frac{k^{\varepsilon}}{k}} = \frac{3}{k^{\frac{3}{2}-\frac{\varepsilon}{2}}},$$
for k large. By choosing ε > 0 small we see that $\frac{3}{2} - \frac{\varepsilon}{2} > 1$ and hence by the p-series test $\sum_{k=1}^{\infty} \frac{3}{k^{3/2-\varepsilon/2}}$ is convergent. It follows from the comparison test that our original series is also convergent.

Sometimes the comparison test requires quite delicate inequalities and it might be easier to consider the ratio $a_n/b_n$ of two sequences $(a_n)$ and $(b_n)$ as n → ∞. This leads to the limit comparison test (not to be confused with the ratio test, which will come very soon).

Theorem 3.2.7 (Limit Comparison test). Suppose that $0 \le a_k$, $0 < b_k$ for large k and that
$$L = \lim_{n\to\infty} \frac{a_n}{b_n}$$
exists as an extended real number.

• If L ∈ (0, ∞) then $\sum_{k=1}^{\infty} a_k$ converges if and only if $\sum_{k=1}^{\infty} b_k$ converges.
• If L = 0 and $\sum_{k=1}^{\infty} b_k$ converges then $\sum_{k=1}^{\infty} a_k$ converges.
• If L = ∞ and $\sum_{k=1}^{\infty} b_k$ diverges then $\sum_{k=1}^{\infty} a_k$ diverges.

In general, whenever the limit comparison test works, the comparison test also does, but the limit
comparison test might be a little bit easier to verify.
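As a small illustration of how the limit comparison test is used in practice, the sketch below takes the hypothetical terms $a_k = (2k+1)/(k^3+5)$ (chosen for this illustration, not from the notes) and compares them with $b_k = 1/k^2$. The ratio $a_k/b_k$ is computed exactly and appears to settle near L = 2, which lies in (0, ∞), so the first case of the theorem would apply.

```python
from fractions import Fraction

def a(k):
    """Hypothetical example term a_k = (2k + 1) / (k^3 + 5)."""
    return Fraction(2 * k + 1, k ** 3 + 5)

def b(k):
    """Comparison term b_k = 1 / k^2, a convergent p-series (p = 2)."""
    return Fraction(1, k ** 2)

def ratio(k):
    """The quotient a_k / b_k whose limit the test asks for."""
    return a(k) / b(k)

# The printed ratios should approach 2 as k grows.
for k in (10, 1000, 100000):
    print(k, float(ratio(k)))
```

A direct comparison $a_k \le C b_k$ would also work here, but it requires choosing a constant C and a threshold for k by hand; the ratio avoids that bookkeeping.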


3.3 Cauchy Condensation


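Cauchy’s Condensation Test and its proof are inspired by the following argument used by Nicole Oresme to establish the divergence of the harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$.
We start by grouping the terms of the series as follows:
$$\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \dots + \left(\frac{1}{2^{n-1}+1} + \frac{1}{2^{n-1}+2} + \dots + \frac{1}{2^n}\right) + \dots$$

The last term in each group is smaller than all the others, therefore
$$\sum_{n=1}^{\infty} \frac{1}{n} \ge 1 + \frac{1}{2} + 2\cdot\frac{1}{4} + 4\cdot\frac{1}{8} + \dots + 2^{n-1}\cdot\frac{1}{2^n} + \dots = 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \dots + \frac{1}{2} + \dots = +\infty.$$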
Since we are not allowed to insert parentheses any way we like in infinite sums, we need to find a
rigorous way to use Oresme’s idea.

Example 3.3.1. The harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.

Proof. Define
$$S_n = 1 + \frac{1}{2} + \dots + \frac{1}{n}, \quad n = 1, 2, \dots$$
We wish to show that the sequence $(S_n)_{n\in N}$ is not convergent. It is enough to show that it isn’t bounded.
Each $S_n$ is a sum with finitely many terms, therefore we are allowed to insert parentheses as in (3.3) below.
Consider the subsequence $(S_{2^n})_{n\in N}$. We have
$$S_2 = 1 + \frac{1}{2},$$
and, for n ≥ 2,
$$\begin{aligned}
S_{2^n} &= S_{2^{n-1}} + \left(\frac{1}{2^{n-1}+1} + \frac{1}{2^{n-1}+2} + \dots + \frac{1}{2^n}\right) && (3.3)\\
&\ge S_{2^{n-1}} + 2^{n-1}\cdot\frac{1}{2^n} && (3.4)\\
&= S_{2^{n-1}} + \frac{1}{2}. && (3.5)
\end{aligned}$$


An easy induction argument now gives
$$S_{2^n} \ge 1 + \frac{n}{2}, \quad n = 1, 2, 3, \dots \qquad (3.6)$$
It follows that $S_{2^n} \to +\infty$ as n → +∞, therefore $(S_n)$ is not bounded.
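The lower bound obtained by induction can be verified exactly for small n. This is a minimal sketch using rational arithmetic; the helper names are hypothetical, and the check covers only finitely many n, so it illustrates the bound rather than proving it.

```python
from fractions import Fraction

def harmonic(m):
    """Exact partial sum S_m = 1 + 1/2 + ... + 1/m of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, m + 1))

def oresme_bound_holds(n):
    """Check the bound S_{2^n} >= 1 + n/2 for one value of n."""
    return harmonic(2 ** n) >= 1 + Fraction(n, 2)

# Print S_{2^n} next to the lower bound 1 + n/2 for the first few n.
for n in range(1, 8):
    print(n, float(harmonic(2 ** n)), 1 + n / 2)
```

The bound is attained with equality at n = 1 and becomes strict afterwards, which matches the fact that each group in Oresme's argument is estimated by its smallest term.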

Exercise 3.3.2. Prove that we actually have

Sn → +∞ as n → +∞. (3.7)

Exercise 3.3.3. Prove that, if an increasing sequence $(a_n)_{n\in N}$ has a subsequence $(a_{n_k})_{k\in N}$ such that
$$a_{n_k} \to +\infty \quad \text{as } k \to +\infty,$$
then
$$a_n \to +\infty \quad \text{as } n \to +\infty.$$

Theorem 3.3.4 (Cauchy’s Condensation Test). Let $(a_n)_{n\in N}$ be a decreasing sequence with non-negative terms. Then the following are equivalent:

1. The series $\sum_{n=1}^{\infty} a_n$ converges,

2. The series $\sum_{n=0}^{\infty} 2^n a_{2^n}$ converges.

Proof. Define the partial sums
$$S_n = a_1 + a_2 + a_3 + a_4 + \dots + a_n, \quad n = 1, 2, 3, \dots$$
$$T_n = a_1 + 2a_2 + 4a_4 + 8a_8 + \dots + 2^{n-1} a_{2^{n-1}}, \quad n = 1, 2, 3, \dots$$

As in Oresme’s argument we look at the subsequence $(S_{2^n})_{n\in N}$. Since $(a_n)_{n\in N}$ is decreasing, we have
$$\begin{aligned}
S_{2^n} &= a_1 + a_2 + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + \dots + (a_{2^{n-1}+1} + a_{2^{n-1}+2} + \dots + a_{2^n})\\
&\ge a_1 + a_2 + 2a_4 + 4a_8 + \dots + 2^{n-1} a_{2^n}\\
&= \frac{1}{2} a_1 + \frac{1}{2}\left(a_1 + 2a_2 + 4a_4 + 8a_8 + \dots + 2^n a_{2^n}\right)\\
&= \frac{1}{2} a_1 + \frac{1}{2} T_{n+1},
\end{aligned}$$


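therefore
$$T_{n+1} \le 2\left(S_{2^n} - \frac{a_1}{2}\right), \quad n = 1, 2, 3, \dots.$$
Changing n to n − 1 for the sake of convenience,
$$T_n \le 2\left(S_{2^{n-1}} - \frac{1}{2} a_1\right), \quad n = 2, 3, \dots. \qquad (3.8)$$
If $\sum_{n=1}^{\infty} a_n$ converges, then $(S_n)_{n\in N}$ is bounded. By (3.8), $(T_n)_{n\in N}$ is bounded and, since it is increasing, it converges; therefore $\sum_{n=0}^{\infty} 2^n a_{2^n}$ converges.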
Keeping the same grouping as above (see footnote 3) and using
$$a_3 + a_4 \le a_2 + a_2 = 2a_2, \quad a_5 + a_6 + a_7 + a_8 \le a_4 + a_4 + a_4 + a_4 = 4a_4, \quad \dots$$
we see that
$$\begin{aligned}
S_{2^n} &= a_1 + a_2 + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + \dots + (a_{2^{n-1}+1} + a_{2^{n-1}+2} + \dots + a_{2^n})\\
&\le a_1 + a_2 + 2a_2 + 4a_4 + \dots + 2^{n-1} a_{2^{n-1}}\\
&= a_2 + T_n.
\end{aligned}$$
If $\sum_{n=0}^{\infty} 2^n a_{2^n}$ converges, then $(T_n)_{n\in N}$ is bounded, therefore $(S_{2^n})_{n\in N}$ is bounded, therefore $(S_n)_{n\in N}$ is bounded (because $S_n \le S_{2^n}$), therefore $\sum_{n=1}^{\infty} a_n$ converges.
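The two estimates in this proof sandwich $S_{2^n}$ between expressions in the condensed partial sums: $\frac{1}{2}a_1 + \frac{1}{2}T_{n+1} \le S_{2^n} \le a_2 + T_n$. The sketch below checks this exactly for the illustrative decreasing sequence $a_n = 1/n^2$ (an assumption made for the example; any decreasing non-negative sequence could be substituted). All names are hypothetical.

```python
from fractions import Fraction

def a(n):
    """Illustrative decreasing, non-negative sequence a_n = 1/n^2."""
    return Fraction(1, n * n)

def S(m):
    """Partial sum S_m = a_1 + ... + a_m."""
    return sum(a(k) for k in range(1, m + 1))

def T(n):
    """Condensed partial sum T_n = a_1 + 2 a_2 + 4 a_4 + ... + 2^(n-1) a_{2^(n-1)}."""
    return sum(2 ** j * a(2 ** j) for j in range(n))

def bounds_hold(n):
    """Check a_1/2 + T_{n+1}/2 <= S_{2^n} <= a_2 + T_n."""
    s = S(2 ** n)
    lower = a(1) / 2 + T(n + 1) / 2
    upper = a(2) + T(n)
    return lower <= s <= upper

print([bounds_hold(n) for n in range(1, 9)])
```

For n = 1 both inequalities are equalities, since no grouping has taken place yet; the gap opens up as more groups are estimated by their first or last term.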
Example 3.3.5 (The p-series). The series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges iff p > 1.

Proof. If p ≤ 0 then $\frac{1}{n^p} = n^{-p} \not\to 0$, therefore the series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges.
Assume from now on that p > 0. Then $\frac{1}{n^p}$ is decreasing. By Cauchy’s Condensation Test, the series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges iff the series $\sum_{n=0}^{\infty} 2^n \frac{1}{(2^n)^p}$ converges. Observe,
$$2^n \frac{1}{(2^n)^p} = \frac{1}{(2^n)^{p-1}} = \left(\frac{1}{2^{p-1}}\right)^n.$$
The geometric series $\sum_{n=0}^{\infty} \left(\frac{1}{2^{p-1}}\right)^n$ converges iff $\frac{1}{2^{p-1}} < 1$, which is equivalent to p > 1.

3 There are various ways to group the terms in this proof.


3.4 Absolute convergence



Definition 3.4.1. Let $S = \sum_{k=1}^{\infty} a_k$ be an infinite series.

We say that the series S converges absolutely if $\sum_{k=1}^{\infty} |a_k| < \infty$.

We say that the series S converges conditionally if S converges but $\sum_{k=1}^{\infty} |a_k|$ diverges.
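Theorem 3.4.2. If $\sum_{k=1}^{\infty} a_k$ converges absolutely, then $\sum_{k=1}^{\infty} a_k$ converges, but not conversely.
Proof. We define
$$a_k^+ = \begin{cases} a_k, & a_k \ge 0,\\ 0, & a_k < 0, \end{cases} \qquad a_k^- = \begin{cases} 0, & a_k \ge 0,\\ -a_k, & a_k < 0. \end{cases}$$
Note that $0 \le a_k^+ \le |a_k|$ and $0 \le a_k^- \le |a_k|$ for all k ∈ N. So by the comparison test, both $\sum_{k=1}^{\infty} a_k^+$ and $\sum_{k=1}^{\infty} a_k^-$ converge. Since $a_k = a_k^+ - a_k^-$, we can apply Theorem 3.1.12 to get that $\sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} a_k^+ - \sum_{k=1}^{\infty} a_k^-$ converges.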
An example of a series that converges conditionally is $\sum_{k=1}^{\infty} \frac{(-1)^k}{k}$, but note that we have yet to prove that this series converges; we will do so in the next section.

Note that to test convergence of $\sum_{k=1}^{\infty} |a_k|$ we may use every test of the previous section (integral, comparison, limit comparison).
We finish by introducing two additional tests for convergence.
Theorem 3.4.3 (Root test, limit form). Let $a_k \in R$ and assume that $r = \lim_{k\to\infty} |a_k|^{1/k}$ exists. If

• r < 1 then the series $\sum_{k=1}^{\infty} a_k$ converges absolutely.

• r > 1 then the series $\sum_{k=1}^{\infty} a_k$ diverges.

Proof. Both of these results follow from the comparison test. In the case r < 1 one can prove that for all k large
$$|a_k| \le \left(\frac{1+r}{2}\right)^k.$$


It follows that $\sum_{k=1}^{\infty} |a_k|$ converges as the geometric series $\sum_{k=1}^{\infty} \left(\frac{1+r}{2}\right)^k$ is convergent.
We’ll show this inequality in more detail; it is an important skill. We are given $r = \lim |a_k|^{1/k}$. Since 0 ≤ r < 1 there exists ε > 0 such that (r − ε, r + ε) ⊂ (−1, 1). (What we want is that x ∈ (r − ε, r + ε) implies |x| < a < 1 for some a.) For r + ε < 1 we need any ε < 1 − r, and $\varepsilon = \frac{1-r}{2}$ will do fine. In this case $r + \varepsilon = \frac{1+r}{2}$. Now back to the sequence $(|a_k|^{1/k})$: with the value $\varepsilon = \frac{1-r}{2}$ we take any N such that k ≥ N implies $\left||a_k|^{1/k} - r\right| < \varepsilon$. Such an N exists by the convergence of the sequence. This in turn shows that $r - \varepsilon < |a_k|^{1/k} < r + \varepsilon$ for all k ≥ N, and that implies $|a_k|^{1/k} < \frac{1+r}{2} < 1$, so that $|a_k| < \left(\frac{1+r}{2}\right)^k$ and we can use the comparison test to show that the series is absolutely convergent.
For r > 1 one can show that for all k large
$$|a_k| \ge \left(\frac{1+r}{2}\right)^k.$$
Thus again one can use the comparison test to show that $\sum |a_k|$ diverges; in fact, since $\frac{1+r}{2} > 1$, the terms $a_k$ do not converge to 0, so $\sum a_k$ itself diverges.
Exercise 3.4.4. In the context of the previous proof, prove that if r > 1 then there exists N such that k > N implies
$$|a_k| \ge \left(\frac{1+r}{2}\right)^k.$$
Additionally, if r > 1 show that $a_n$ does not converge to 0 (this should be a simple observation if you use the first part).
It is usually easier to check the following test.
Theorem 3.4.5 (Ratio test, limit form). Let $a_k \in R$ and assume that $r = \lim_{k\to\infty} \frac{|a_{k+1}|}{|a_k|}$ exists as an extended real number. If

• r < 1 then the series $\sum_{k=1}^{\infty} a_k$ converges absolutely.

• r > 1 then the series $\sum_{k=1}^{\infty} a_k$ diverges.
It should be noted that we could strengthen the statement. For r > 1 not only does the series diverge; the sequence of terms $(a_n)$ does not even converge to 0, which is why the series diverges (by the divergence test).
The key step in the proof is that we can keep $|a_{n+1}/a_n|$ close to r for n > N. If r < 1 then we can ensure the ratio is also less than 1 for n > N. We then relate $a_m$ to $a_N$ by observing:
$$a_m = \frac{a_m}{a_{m-1}} \cdot \frac{a_{m-1}}{a_{m-2}} \cdots \frac{a_{N+1}}{a_N} \cdot a_N.$$
Each ratio is bounded, and $a_N$ is a single particular value, so we get a useful bound on $a_m$ and can prove the series converges by comparing to a geometric series.

Proof. We give the proof in the case r < 1 and leave the case r > 1 as a very different exercise.
Let $r_n = \left|\frac{a_{n+1}}{a_n}\right|$, then $\lim r_n = r < 1$. Take $\varepsilon = \frac{1-r}{2} > 0$; then there exists N ∈ N such that $r_n < r + \varepsilon = \frac{1+r}{2} < 1$ for all n ≥ N. Now we note that for any m > N
$$|a_m| = \left|\frac{a_m}{a_{m-1}}\right| \left|\frac{a_{m-1}}{a_{m-2}}\right| \cdots \left|\frac{a_{N+1}}{a_N}\right| |a_N| < \left(\frac{1+r}{2}\right)^{m-N} |a_N|.$$
Therefore we can conclude that $\sum_{k=N}^{\infty} a_k$ is absolutely convergent by using the comparison test. Hence $\sum_{k=1}^{\infty} a_k$ is absolutely convergent.
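To see the geometric bound from this proof in action, the sketch below uses the hypothetical series with terms $a_k = k/2^k$ (an assumption for illustration only). Here the ratio $a_{n+1}/a_n = (n+1)/(2n)$ tends to r = 1/2, so $\frac{1+r}{2} = \frac{3}{4}$, and one can check by hand that N = 3 works. The code confirms both the ratio bound and the resulting estimate on $a_m$ exactly.

```python
from fractions import Fraction

def a(k):
    """Hypothetical example term a_k = k / 2^k (all terms positive)."""
    return Fraction(k, 2 ** k)

R = Fraction(1, 2)   # limit of a_{k+1} / a_k for this example
Q = (1 + R) / 2      # the comparison ratio (1 + r) / 2 = 3/4
N = 3                # from n = 3 onwards the ratio (n + 1) / (2n) is below 3/4

def ratio(n):
    """The quotient a_{n+1} / a_n."""
    return a(n + 1) / a(n)

def bound_holds(m):
    """Check a_m < Q^(m - N) * a_N for one m > N."""
    return a(m) < Q ** (m - N) * a(N)

print(all(ratio(n) < Q for n in range(N, 100)))
print(all(bound_holds(m) for m in range(N + 1, 100)))
```

Once the bound holds, the tail of the series is dominated by a geometric series with ratio 3/4, which is the comparison the proof relies on.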

Additional exercises 9. Exercises 14.1 - 14.4 give a wealth of series to work on; in many cases the terms of the series are the ratio of two types of expressions: polynomial, exponential, factorial etc. Try some problems with different types of ratios.
Exercise 14.8 is a nice general statement. Exercise 14.9 makes an important point.
Exercise 15.6 asked you to find some examples of series with a particular behaviour; this is an important skill!

3.5 Series with alternating signs


It has been stated that the series $\sum_{k=1}^{\infty} (-1)^k \frac{1}{k}$ is conditionally convergent. This series is an example of a series with alternating signs. We now establish a general theorem showing convergence for this and other similar series.

Theorem 3.5.1 (Alternating sign series). Let $(a_k)$ be a decreasing sequence of nonnegative numbers such that $a_k \to 0$ as k → ∞. Then the series
$$\sum_{k=1}^{\infty} (-1)^k a_k$$
is convergent.

Proof. As before, let $(s_n)$ be the sequence of partial sums, i.e.,
$$s_n = \sum_{k=1}^{n} (-1)^k a_k.$$


We claim the following: The subsequence of odd terms $s_1 \le s_3 \le s_5 \le \dots$ is increasing and bounded above (by $s_2$). The subsequence of even terms $s_2 \ge s_4 \ge s_6 \ge \dots$ is decreasing and bounded below (by $s_1$).
Indeed, let n be odd. Then
$$s_{n+2} = s_n + a_{n+1} - a_{n+2} \ge s_n,$$
since $a_{n+1} \ge a_{n+2}$. Similarly, if n is even then
$$s_{n+2} = s_n - a_{n+1} + a_{n+2} \le s_n.$$
Also, for n odd we have $s_n \le s_n + a_{n+1} = s_{n+1} \le s_{n-1} \le \dots \le s_2$, so the subsequence of odd terms is bounded above. Similarly, for n even we have $s_n \ge s_n - a_{n+1} = s_{n+1} \ge s_{n-1} \ge \dots \ge s_1$, so the subsequence of even terms is bounded below.
By the monotone convergence theorem it follows that these two limits exist:
$$e = \lim_{k\to\infty} s_{2k}, \quad \text{and} \quad o = \lim_{k\to\infty} s_{2k+1}.$$
Moreover, we claim that e = o because $a_k \to 0$, hence the limit of the full sequence $\lim_{k\to\infty} s_k = e = o$ exists. To see this consider
$$o - e = \lim_{k\to\infty} s_{2k+1} - \lim_{k\to\infty} s_{2k} = \lim_{k\to\infty} \left[s_{2k+1} - s_{2k}\right] = \lim_{k\to\infty} \left(-a_{2k+1}\right) = 0.$$

Corollary 3.5.2. The series
$$\sum_{k=1}^{\infty} (-1)^k \frac{1}{k}, \qquad \sum_{k=2}^{\infty} (-1)^k \frac{1}{\log k}, \qquad \sum_{k=2}^{\infty} (-1)^k \frac{1}{k \log k}$$
are all convergent.
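The structure of the proof of the alternating sign theorem can be observed directly on the first of these series. The sketch below computes the partial sums of $\sum (-1)^k / k$ exactly, splits them into odd-indexed and even-indexed subsequences, and compares them with −log 2, which is the known value of this series (a standard fact, not derived in these notes). The names are hypothetical.

```python
from fractions import Fraction
import math

def partial_sums(n_max):
    """Return [s_1, ..., s_{n_max}] for s_n = sum_{k=1}^n (-1)^k / k."""
    sums, s = [], Fraction(0)
    for k in range(1, n_max + 1):
        s += Fraction((-1) ** k, k)
        sums.append(s)
    return sums

sums = partial_sums(200)
odd = sums[0::2]    # s_1, s_3, s_5, ... (increasing)
even = sums[1::2]   # s_2, s_4, s_6, ... (decreasing)

# The limit is squeezed between the last odd and the last even partial sum.
print(float(odd[-1]), -math.log(2), float(even[-1]))
```

The width of the squeeze after n terms is $a_{n+1} = 1/(n+1)$, so convergence here is slow; the theorem guarantees convergence but says nothing about speed.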


4 Continuity
4.1 Introduction
A real function can be considered as a means of assigning a real number to each real number in the domain of the function, i.e., some subset of R. We tend to look at well behaved functions such as $x^2$. More formally we define this function by f : R → R, $x \mapsto x^2$. Note that the domain of this f is R. We might have reason to consider another function $g : R_{\ge 0} \to R$, $x \mapsto x^2$. Both f and g do the same thing, but the domains are different. The domain of f is all of R, but the domain of g is just the set of non-negative real numbers. This distinction is more important than it may appear at first. One minor point is that g(−1) is not defined as −1 is not in the domain of g. More significantly the definition of continuity incorporates the domain of the function and changing the function by changing only the domain may change whether or not it is continuous. Throughout we shall require x ∈ dom( f ), i.e., x is in the domain of f .

Definition 4.1.1 ([2, Definition 17.1]). Let f be a function f : dom( f ) → R where dom( f ) ⊆ R. We say that f is continuous at some a ∈ dom( f ) if for any sequence $(x_n)$ whose terms lie in dom( f ) and which converges to a, we have $\lim_{n\to\infty} f(x_n) = f(a)$. If f is continuous at each a ∈ S ⊆ dom( f ) then we say f is continuous on S. If f is continuous on dom( f ) then we say that f is continuous.

Example 4.1.2. The function p : R → R, $x \mapsto \sum_{i=0}^{m} a_i x^i$, where $a_i \in R$, is continuous. That is to say, polynomials are continuous.
To verify continuity we take some a ∈ R and let $(x_n)$ be a sequence of real numbers converging to a. We have seen that if $(x_n)$ and $(y_n)$ are convergent sequences then so is $(x_n y_n)$, and its limit is the product of the limits. Inductively this shows that $(a_i x_n^i)$ is convergent for any $i \in N_0$, with limit $a_i a^i$. Similarly the sum of any two convergent sequences is convergent, with limit the sum of the limits, so that the sequence $(p(x_n))$ converges to p(a).

The approach we used in this example is important and deserves to be stated more generally. Since
continuity of a function depends on only the behaviour of sequences we can carry over any results we
have already established in that context.

Theorem 4.1.3. Let f , g : D → R be continuous on D, and let α ∈ R then the following functions are
continuous on D.

1. α f ;

2. f + g;

3. f g.


Definition 4.1.4. Let A, B ⊆ R be nonempty, let f : A → R, g : B → R and f (A) ⊆ B. The composition


of g with f is the function g ◦ f : A → R defined by

(g ◦ f )(x) = g( f (x)), for all x ∈ A.

Theorem 4.1.5 ([2, Theorem 17.5]). If f is continuous at a ∈ R and g is continuous at f (a) then the
composition g ◦ f , when well-defined, is continuous at a.

Proof. See [2, Theorem 17.5] for details. Essentially, since f is continuous at a we have, for any suitable sequence $(x_n)$ converging to a, that $(f(x_n))$ converges to f (a), and further the continuity of g at f (a) ensures that $(g(f(x_n)))$ converges to g( f (a)).

There is another way to define continuity known as the ‘ε-δ ’ definition. It is a bad idea to define a
concept in more than one way however, so instead, we show that this is equivalent to our ‘sequential’
definition.

Theorem 4.1.6 ([2, Theorem 17.2]). Let f be a function f : dom( f ) → R where dom( f ) ⊆ R. Then f is
continuous at a ∈ dom( f ) if and only if for any ε > 0 there exists δ > 0 such that whenever x ∈ dom( f )
and |x − a| < δ we have | f (x) − f (a)| < ε.

Proof. See [2, Theorem 17.2].

Example 4.1.7. In Figure 1 we see the function $f(x) = x^3 + \frac{1}{2}x^2$. Here with the point a = 1 and ε = 0.5 we can see that for δ = 0.1 the image of any x such that |x − 1| < 0.1 lies between f (1) − ε = 1 and f (1) + ε = 2; this is the red curve.
We always start with ε and we must then find a δ. In this case we considered a fixed ε and the δ was then found by inspection.
Note that the red curve is not as large as it could be while still fitting inside the horizontal dashed lines. We could solve | f (x) − 1.5| < ε to find, approximately, x ∈ (0.858, 1.113) but we only need to find one value of δ and we can make it as small as we like. Indeed we could take δ = 0.01; we only need that a suitable δ exists and indeed if a suitable δ exists then any smaller positive value is also suitable.
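The choice of δ in this example can be sanity-checked by sampling. The sketch below evaluates $f(x) = x^3 + \frac{1}{2}x^2$ on a grid of points inside (a − δ, a + δ) and tests whether every sampled value stays within ε of f (a). Sampling can only refute a candidate δ, never prove it works, so this is an aid to intuition rather than a substitute for the argument. The helper name `delta_works` and the grid size are assumptions.

```python
def f(x):
    """The function from the example: f(x) = x^3 + x^2 / 2."""
    return x ** 3 + 0.5 * x ** 2

def delta_works(a, eps, delta, samples=10001):
    """Sample x in (a - delta, a + delta) and test |f(x) - f(a)| < eps at each sample."""
    for i in range(1, samples):
        x = a - delta + 2 * delta * i / samples
        if abs(f(x) - f(a)) >= eps:
            return False
    return True

print(delta_works(1.0, 0.5, 0.1))    # the delta found by inspection
print(delta_works(1.0, 0.5, 0.01))   # any smaller delta is also suitable
print(delta_works(1.0, 0.5, 0.2))    # too large: f(1.2) is already above 2
```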

Additional exercises 10. Try exercises 17.9 and 17.10.

Note that our definition allows us to consider continuity of functions with domains like D = Z or
D = Q.

[Figure 1: Continuity of $f(x) = x^3 + \frac{1}{2}x^2$ at a = 1. The point (1, f (1)) is marked on the graph.]

4.2 The Extreme and Intermediate Value Theorems


Definition 4.2.1 ([2, Section 18]). Let E ⊆ R be nonempty. A function f : E → R is said to be bounded
on E if
| f (x)| ≤ M, for all x ∈ E,
where M is some (large) real number.

Continuous functions on closed bounded intervals are always bounded. And in fact, more is true.
The function attains both its supremum and infimum on the interval, and so we often refer to these as the
maximum and minimum.

Theorem 4.2.2 (Extreme value theorem, [2, Theorem 18.1]). Let I ⊆ R be a closed and bounded interval.
Let f : I → R be continuous on I. Then f is bounded on the interval I.
Denote by
m = inf{ f (x) : x ∈ I}, M = sup{ f (x) : x ∈ I}.
Then there exist points xm , xM ∈ I such that

f (xm ) = m and f (xM ) = M.

Proof. We first establish that f is bounded. Suppose that f were not bounded. Then for any M ∈ R one
can find a cM ∈ I such that | f (cM )| > M. In particular we could construct a sequence (xn ) ⊆ I such that


| f (xn )| > n for each n ∈ N. The sequence (xn ) is bounded as each term lies in the closed and bounded
interval I.
We define sk = sup{xn | n ≥ k}. Each sk is real since I is bounded and each sk ∈ I since I is closed.
Thus f (sk ) ∈ R. Furthermore (sk ) is decreasing and bounded below and hence converges to some s ∈ I.
Now for each k there exists a sequence (ym ) with ym ∈ {xn | n ≥ k} such that ym → sk and by construction
of the sequence (xn ) we have | f (ym )| ≥ k for all m. Then by the continuity of f we have | f (sk )| =
lim | f (ym )| ≥ k. We apply continuity once more to see f (sk ) → f (s) ∈ R which implies lim | f (sk )| =
| f (s)| so that ( f (sk )) is bounded, giving a contradiction. Hence the sequence (xn ) cannot exist and f is
bounded.
Recall that this means the set { f (x) | x ∈ I} is bounded. Since I is non-empty we have that this set is
non-empty and by the completeness axiom, it has both an infimum and supremum.
We now show the existence of a minimum. By the properties of inf there exists a sequence (xn ) ⊆ I
(different from above) such that f (xn ) → m, the infimum. In a similar manner to above we construct a
convergent sequence (sk ). It is left as an exercise to show f (sk ) → m. Once established we can then use
continuity to show f (s) = m for some s ∈ I.
The proof for the maximum is similar and omitted.

This theorem does not hold if either “closed” or “bounded” is omitted from the hypotheses on the
interval I.

Exercise 4.2.3. Find all the places where the closed hypothesis is used in the previous proof.

Lemma 4.2.4. Let f : I → R where I is an open nonempty interval. If f is continuous at a point a ∈ I


and f (a) > 0 then for some δ , ε > 0 we have that

f (x) > ε, for all x ∈ (a − δ , a + δ ).

Theorem 4.2.5 (Intermediate Value Theorem, [2, Theorem 18.2]). Let I be a non-degenerate interval
and let f : I → R be a continuous function. If a, b ∈ I, a < b then f attains, on the interval (a, b), all
values between f (a) and f (b). That is to say given y0 between f (a) and f (b) there exists x0 ∈ (a, b)
such that
f (x0 ) = y0 .

Proof. To simplify the notation we assume that f (a) < y0 < f (b). The case where f (b) < f (a) is similar.
Let
x0 = sup E, where: E = {x ∈ [a, b] : f (x) < y0 }.
Clearly, a ∈ E, and E ⊆ [a, b) as b ∉ E, so that x0 ∈ [a, b].

[Figure 2: Intermediate value theorem. The points (a, f (a)), (sup E, y0 ) and (b, f (b)) are marked on the graph.]

In Figure 2 we depict a function f (x) with domain [a, b] in red and blue. The red curves are the image
of the set E defined here. Observe, that in this case the set E is not a single interval and indeed in the
figure we have f (x) = y0 for three different x ∈ (a, b).
Since x0 is the supremum of E there exists a sequence (xn ) such that xn ∈ E for all n and xn → x0 as
n → ∞. Since f is continuous we know that f (xn ) → f (x0 ) and given that f (xn ) < y0 (as xn ∈ E) we see
that f (x0 ) ≤ y0 .
Suppose that f (x0 ) < y0 , in which case x0 ≠ b. Setting ε = y0 − f (x0 ) > 0 we know by the continuity of f that there exists δ > 0 such that | f (x) − f (x0 )| < y0 − f (x0 ) for all x ∈ I with |x − x0 | < δ . In particular there exists x1 ∈ (x0 , b) such that f (x1 ) − f (x0 ) < y0 − f (x0 ), i.e., f (x1 ) < y0 . However now x1 ∈ E and x1 > x0 , which contradicts the fact that x0 was the supremum of E.
By the trichotomy property we must then have f (x0 ) = y0 .

The proof above uses the following. If f is continuous on some interval [a, b] and f (c) ≠ 0 for some c ∈ (a, b) then there exists δ > 0 such that f (x) ≠ 0 for all x ∈ (c − δ , c + δ ).
The intermediate value theorem is equivalent to the following.

Theorem 4.2.6 (Bolzano’s theorem, [2, Exercise 18.8]). Let f (x) be continuous on [a, b] such that
f (a) f (b) < 0, then there exists c ∈ (a, b) such that f (c) = 0.

Proof. If f (a) f (b) < 0 then exactly one of f (a), f (b) is negative and the other is positive. In particular
0 lies between f (a) and f (b) so that the intermediate value theorem can be applied.


In this proof we used the intuitive idea that if f (x0 ) is greater than y0 then by continuity f (x) would
be greater than y0 for x ‘close’ to x0 , so x0 is greater than the supremum. If y0 is greater than f (x0 ) then
by continuity y0 would be greater than f (x) for x ‘close’ to x0 , so x0 is less than the supremum.

Example 4.2.7 ([2, Section 18, Example 2]). Let f : X → X where X ⊆ R. We are often interested in when the function f has a fixed point, i.e., when there exists x ∈ X such that f (x) = x. For instance if f : R → R is given by $x \mapsto x^2$ then a fixed point of f is a solution to $x^2 = x$, i.e., f has fixed points 0 and 1.
To find the fixed points of a function we solve f (x) = x, or in other words solve f (x) − x = 0. If f is
continuous then we may be able to apply Bolzano’s theorem to show the existence of a zero of f (x) − x,
and hence a fixed point of f .
Let f : [0, 1] → [0, 1] be continuous. In this case define g(x) = f (x) − x so that g : [0, 1] → [−1, 1].
Observe that g(0) = f (0) − 0 ≥ 0 and g(1) = f (1) − 1 ≤ 0 (since 0 ≤ f (1) ≤ 1). Now if either of
these inequalities were actually equality then we would have a fixed point of f , otherwise g(0) > 0 and
g(1) < 0 and Bolzano’s theorem immediately applies proving the existence of some c ∈ (0, 1) such that
g(c) = 0, or equivalently f (c) = c.
It is a good exercise to visualise this. Try to plot a continuous function f : [0, 1] → [0, 1]; you’ll find
that you can’t do this without crossing the line y = x.
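The argument in this example is constructive enough to turn into a search procedure. The sketch below applies bisection to g(x) = f (x) − x, which is one common way to locate the zero whose existence Bolzano's theorem guarantees; it is not the method of the notes, which only prove existence. The function name, the tolerance, and the choice f = cos (which maps [0, 1] into itself) are assumptions for illustration.

```python
import math

def fixed_point_by_bisection(f, lo=0.0, hi=1.0, tol=1e-12):
    """Approximate c in [lo, hi] with f(c) = c, assuming f is continuous and maps [lo, hi] into itself."""
    def g(x):
        return f(x) - x
    # If either endpoint is already a fixed point we are done.
    if g(lo) == 0:
        return lo
    if g(hi) == 0:
        return hi
    # Otherwise g(lo) > 0 > g(hi); keep that sign pattern while halving the interval.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c = fixed_point_by_bisection(math.cos)
print(c, math.cos(c))
```

Each step keeps a sign change of g inside the current interval, so the same reasoning as in the example applies at every stage; the interval simply shrinks around a fixed point.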

Example 4.2.8 ([2, Exercise 17.14]). The Riemann function f : R → R defined by
$$f(x) = \begin{cases} 0, & \text{if } x \text{ is irrational or } x = 0,\\ 1/q, & \text{if } x = p/q \ne 0 \text{ where } q \in N,\ p \in Z \text{ and } (p, q) = 1, \end{cases}$$
is continuous at every irrational number and discontinuous at every non-zero rational number.

Proof. First consider a non-zero rational number x ∈ Q. If f were continuous at x then for any sequence
xn → x we would have f (xn ) → f (x) as n → ∞. If we choose xn ∈ Qc then f (xn ) = 0 → 0 but f (x) > 0
as x is a non-zero rational. Hence, f is not continuous at x.
Consider now an irrational number x. To prove continuity at x we have to show that for any sequence
(yn ), yn → x we have f (yn ) → 0 = f (x).
It suffices to consider yn ∈ Q since for irrational numbers f (yn ) = 0 hence there is nothing left to
prove in such a case.
Let $y_n = p_n/q_n$ where the quotient is in reduced form ($q_n \in N$, $p_n \in Z$ and $(p_n, q_n) = 1$). We have to prove that
$$f(y_n) = \frac{1}{q_n} \to 0, \quad \text{as } n \to \infty.$$
From the definition of convergence for sequences this says

for all ε > 0 there exists N ∈ N such that $\frac{1}{q_n} < \varepsilon$ for all n ≥ N.

Assume for the sake of contradiction that this is false. Then there exists ε > 0 such that for all N ∈ N we have some $n_0 > N$ with $\frac{1}{q_{n_0}} \ge \varepsilon$. This allows us to construct a subsequence $n_1 < n_2 < n_3 < \dots$ such that
$$\frac{1}{q_{n_k}} \ge \varepsilon > 0 \quad \text{or} \quad q_{n_k} \le \frac{1}{\varepsilon} = M, \quad k = 1, 2, 3, \dots.$$
Consider a set E of rational numbers such that

E = {p/q : p ∈ Z and q ∈ N, q ≤ M}.

We claim that convergent sequences of elements from E have the following property. If $(y_n) \subseteq E$ is a convergent sequence, then $(y_n)$ is eventually constant (that is, there exists N such that $y_n = y_{n+1}$ for all n ≥ N). This follows from the claim that if x, y ∈ E and x ≠ y then
$$|x - y| \ge \frac{1}{M^2}.$$
Observe also that if yn → x then for any ε > 0 there exists N ∈ N such that for all n, m ≥ N we have

|yn − ym | ≤ |yn − x| + |x − ym | < ε.

It is left as an exercise to verify the claim that any convergent sequence in E is eventually constant. Having this, since $p_{n_k}/q_{n_k} \to x$ as k → ∞ and $p_{n_k}/q_{n_k} \in E$ it follows that $(p_{n_k}/q_{n_k})$ is eventually constant (take $\varepsilon < \frac{1}{M^2}$) and hence its limit must belong to E. So x ∈ E ⊆ Q. This is a contradiction as we have assumed that $x \in Q^c$. Hence we have shown that $f(y_n) \to 0$ as desired and so f is continuous at x.
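The separation claim for the set E, namely that two distinct fractions with denominators at most M differ by at least $1/M^2$, can be tested by brute force. The sketch below enumerates all such fractions in [0, 1] for small integer M and computes the smallest gap. Restricting to [0, 1] and to integer M are simplifying assumptions made for the illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def bounded_denominator_rationals(M, lo=0, hi=1):
    """All p/q in [lo, hi] with q in {1, ..., M}, sorted and without repeats."""
    values = set()
    for q in range(1, M + 1):
        for p in range(lo * q, hi * q + 1):
            values.add(Fraction(p, q))
    return sorted(values)

def min_gap(M):
    """Smallest distance between two distinct fractions with denominator at most M."""
    pts = bounded_denominator_rationals(M)
    return min(y - x for x, y in zip(pts, pts[1:]))

for M in (2, 5, 10):
    print(M, min_gap(M), Fraction(1, M * M))
```

The reason behind the bound is visible in the arithmetic: for $p/q \ne p'/q'$ the difference is $|pq' - p'q|/(qq')$ with a non-zero integer numerator and a denominator of at most $M^2$.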

Theorem 4.2.9 ([2, Corollary 18.3]). Let f : [a, b] → R, a, b ∈ R be a continuous function. Then the
image of f is an interval (possibly a degenerate interval, i.e. a point).

Proof. We know that the domain of f is closed and bounded therefore the extreme value theorem shows
there exists c, d ∈ [a, b] such that f (c) = inf{ f (x) | x ∈ [a, b]} and f (d) = sup{ f (x) | x ∈ [a, b]}. This
shows that the image of f is a subset of the closed interval between f (c) and f (d).
Now taking any value y between f (c) and f (d) the intermediate value theorem states that there exists a c′ between c and d such that f (c′ ) = y, so that the interval with f (c) and f (d) as endpoints is contained in the image of f . This shows that the closed interval between f (c) and f (d) is a subset of the image of f . Together with the previous paragraph we have that the image of f is a closed interval.



Theorem 4.2.10 ([2, Theorem 18.5]). Let f : [a, b] → R be a strictly increasing function such that the
image of f is an interval, then f is continuous on [a, b].

Proof. First note that since f is increasing the image of f is the interval [ f (a), f (b)]. In particular for any a′ ∈ [ f (a), f (b)] there exists some d ∈ [a, b] with f (d) = a′ .
Let c ∈ (a, b), then f (c) is not an endpoint of the image of f . Take an ε > 0 such that ( f (c) −
ε, f (c) + ε) ⊆ im( f ). Since f is strictly increasing and maps onto an interval there exists x1 , x2 ∈ [a, b]
such that x1 < c < x2 (as f is strictly increasing) and f (x1 ) = f (c) − ε and f (x2 ) = f (c) + ε (as f maps
onto the interval). But then we see that it is enough to take 0 < δ ≤ min{c − x1 , x2 − c} and we have that
f is continuous at c.
The cases where c = a or c = b need only minor adjustment from above and are left as an exercise.
Note that we assumed our ε was small enough for the interval ( f (c) − ε, f (c) + ε) to be a subset of
the image. If it were not so small then we could take a smaller ε ′ and the proof would work in the same
way since we would then have |x − c| < δ implies | f (x) − f (c)| < ε ′ < ε.

If a function f is strictly increasing then it must be injective since a < b implies f (a) < f (b). A
consequence of this is that if f is strictly increasing on its domain then there is a well-defined inverse.

Theorem 4.2.11 ([2, Theorem 18.4]). Let f : [a, b] → R be a continuous strictly increasing function.
Then f −1 : [ f (a), f (b)] → R is a continuous, strictly increasing function.

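Proof. The domain of $f^{-1}$ is the image of f , which is an interval by Theorem 4.2.9; since f is strictly increasing we have f (a) < f (b), and this interval is [ f (a), f (b)]. The inverse of a strictly increasing function is strictly increasing, and its image [a, b] is an interval, so that $f^{-1}$ is continuous follows from Theorem 4.2.10.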

4.3 Limits of functions


The notion of continuity is closely related to that of a limit of a function. Our approach (and that of Ross [2]) has been to define continuity of a function f at a point a in terms of convergence of a sequence $(f(x_n))$ where $(x_n)$ is itself a convergent sequence, converging to a. Informally, no matter how we approach a the function approaches f (a). The limit will be similar but we do not need the function to be defined at a; examples are limits as x goes to ∞, or the limit of functions such as $\frac{\sin x}{x}$ as x goes to 0. Even if the function is defined at a there is no need for the limit to agree with f (a), or even for the limit to exist.


Definition 4.3.1 ([2, Definition 20.1]). Let f : dom( f ) → R and a an element of the extended real numbers. We say that $\lim_{x \to a^{\mathrm{dom}(f)}} f(x) = L$, for some extended real number L, if for every sequence $(x_n)$ whose terms lie in dom( f ) \ {a} and which converges to a we have $\lim_{n\to\infty} f(x_n) = L$.
Although this definition considers any domain, we will usually consider the case where f is defined
on an interval around a. The next definition gives the special case where the domain is an interval and
when a is a (finite) real number. After that we give the case that the domain is an interval and a = ±∞.
Definition 4.3.2 ([2, Definition 20.3(a)-(c)]). Let a ∈ R and let f be a function defined on some interval I which contains a.
• We define the two-sided limit $\lim_{x\to a} f(x)$ to be $\lim_{x\to a^{I\setminus\{a\}}} f(x)$.

• We define the right-hand limit $\lim_{x\to a^+} f(x)$ to be $\lim_{x\to a^{I\cap(a,\infty)}} f(x)$.

• We define the left-hand limit $\lim_{x\to a^-} f(x)$ to be $\lim_{x\to a^{(-\infty,a)\cap I}} f(x)$.

Definition 4.3.3 ([2, Definition 20.3(d)]). Let b ∈ R.
• Let f be a function defined on some interval I = (b, ∞). We define $\lim_{x\to\infty} f(x)$ to be $\lim_{x\to\infty^{I}} f(x)$.

• Let f be a function defined on some interval I = (−∞, b). We define $\lim_{x\to-\infty} f(x)$ to be $\lim_{x\to-\infty^{I}} f(x)$.
Just as when we defined continuous functions for each of the above cases we can give an equivalent
ε-δ condition. We will state only for the case limx→a+ f (x) and leave the other cases, and their proofs,
as an exercise. The proofs will be similar to that which connects the two equivalent conditions for
continuity.
Theorem 4.3.4 ([2, Theorem 20.6]). Let a ∈ R, let I be an open interval with left end-point a, and let f be a function defined on I. Then $\lim_{x\to a^+} f(x) = L$ if and only if for every ε > 0 there is δ > 0 such that

if a < x < a + δ and x ∈ I, then | f (x) − L| < ε.

The value L of the limit shall be denoted as $f(a^+)$.


The connection between limits and one-sided limits is straightforward.
Theorem 4.3.5 ([2, Theorem 20.10]). Let f be a real function. The limit
$$\lim_{x\to a} f(x)$$
exists and is equal to L if and only if
$$L = \lim_{x\to a^+} f(x) = \lim_{x\to a^-} f(x).$$


Exercise 4.3.6. Let I be a non degenerate interval and a ∈ I. Prove that f : I → R is continuous at a if
and only if limx→a f (x) = f (a).

Example 4.3.7. If $f(x) = x^2 + 1$ show that f (x) → 5 as x → 2.

Let ε > 0 and L = 5. Observe that
$$f(x) - L = x^2 + 1 - 5 = (x - 2)(x + 2).$$
If 0 < δ ≤ 1 then |x − 2| < δ implies that 1 < x < 3 and therefore for such x: |x + 2| < 5. It follows that
$$|f(x) - L| = |x - 2||x + 2| < \delta \cdot 5 = 5\delta.$$
Hence if 5δ ≤ ε then the desired bound | f (x) − L| < ε will hold. It follows that by setting δ = min{ε/5, 1} we obtain the required properties and that $\lim_{x\to 2} (x^2 + 1) = 5$.
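The recipe δ = min{ε/5, 1} can be exercised for a range of ε. The sketch below samples points x with 0 < |x − 2| < δ and tests the bound | f (x) − 5| < ε at each one. As with any sampling check this can expose a wrong δ but cannot prove a right one, and the grid size and function names are assumptions.

```python
def f(x):
    """The function from the example: f(x) = x^2 + 1."""
    return x * x + 1

def delta_for(eps):
    """The choice of delta made in the example."""
    return min(eps / 5, 1.0)

def check(eps, samples=2001):
    """Test |f(x) - 5| < eps on a grid of x with |x - 2| < delta_for(eps)."""
    d = delta_for(eps)
    xs = (2 - d + 2 * d * i / samples for i in range(1, samples))
    return all(abs(f(x) - 5) < eps for x in xs)

for eps in (10, 1, 0.1, 1e-3):
    print(eps, delta_for(eps), check(eps))
```

The cap δ ≤ 1 is what makes the factor |x + 2| < 5 available; for large ε the cap is active, and for small ε the term ε/5 takes over.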

It is important to realize that the function f does not have to be defined at the point a. For example,
take the function $f(x) = \frac{\sin x}{x}$. This can be defined in an obvious way on the set R \ {0}. Yet the limit
$\lim_{x\to 0} f(x)$ exists and can be calculated.
Moreover, the value of a limit at a point a never depends on the value f(a), even if f is defined at the
point a. Consider the function g(x) = 1 for x ∈ (−3, −1) ∪ (−1, 0) and g(−1) = 5. The domain of this
function is the interval (−3, 0). Clearly
\[ \lim_{x\to -1} g(x) = 1 \neq g(-1) = 5. \]

This observation can be generalized as follows. Let a ∈ R and let I be an open interval containing
a. If f and g are two functions defined everywhere on I, except possibly at a, such that f(x) = g(x) for
all x ∈ I \ {a}, and if $\lim_{x\to a} f(x) = L$, then
\[ \lim_{x\to a} g(x) = \lim_{x\to a} f(x) = L. \]

Example 4.3.8. Let $g(x) = \frac{x^3 - x^2 + x - 1}{x - 1}$. Find the limit of g at the point 1.
Solution: Observe that
\[ g(x) = \frac{x^3 - x^2 + x - 1}{x - 1} = \frac{(x-1)(x^2+1)}{x-1} = x^2 + 1 \quad \text{for all } x \neq 1. \]
We define $f(x) = x^2 + 1$ on R and we see that the functions f and g have the same limit at the point
a = 1. It follows that
\[ \lim_{x\to a} g(x) = \lim_{x\to a} f(x) = \lim_{x\to a} (x^2 + 1) = 2. \]


More generally, we claim that if f(x) = g(x) for all x ∈ (a, c) ∪ (c, b) = D (so for all x in the interval (a, b)
except, possibly, for c) and $\lim_{x\to c} f(x) = L$, then $\lim_{x\to c} g(x) = L$.
Using the sequence definition of the limit, for any sequence $(x_n)$ with $x_n \in D$ and $x_n \to c$
we have $f(x_n) \to L$. Now take any such sequence $(x_n)$ and consider the sequence $(g(x_n))$. Since
$x_n \in D$ for all n, we have $g(x_n) = f(x_n)$ for all n, i.e. $(f(x_n))$ and $(g(x_n))$ are the same sequence. By our
hypothesis $\lim_{x\to c} f(x) = L$ we know that $f(x_n) \to L$, so $g(x_n) \to L$. This verifies the criterion in the definition,
so $\lim_{x\to c} g(x) = L$.

Example 4.3.9. Show that the function f (x) = sin(1/x), defined on R \ {0}, has no limit at a = 0.

Proof. The simplest approach is to examine the behaviour of this function near zero. We can see that
this function oscillates infinitely many times between the values −1 and +1, though this is informal. To
establish the result properly we consider
\[ a_n = \frac{2}{(4n+1)\pi}, \qquad b_n = \frac{2}{(4n+3)\pi}, \qquad \text{for } n \in \mathbb{N}. \]
Clearly $a_n \to 0$ and $b_n \to 0$ as n → ∞, but $f(a_n) = 1$ and $f(b_n) = -1$ for all n ∈ N. It follows that
$f(a_n) \to 1$ and $f(b_n) \to -1$ as n → ∞. If $\lim_{x\to 0} f(x)$ were to exist then the limits of $(f(a_n))_{n\in\mathbb{N}}$ and
$(f(b_n))_{n\in\mathbb{N}}$ would have to be equal, but they are not.
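
The two sequences used in Example 4.3.9 can be evaluated in floating point. The sketch below is only an illustration under the usual rounding caveats (the function names `a_n`, `b_n` are labels chosen here): the computed values of sin(1/x) along the first sequence stay at 1 and along the second at −1, up to rounding error.

```python
import math


def f(x: float) -> float:
    """f(x) = sin(1/x), defined for x != 0."""
    return math.sin(1.0 / x)


def a_n(n: int) -> float:
    """a_n = 2 / ((4n + 1) pi), so 1/a_n = (4n + 1) pi / 2."""
    return 2.0 / ((4 * n + 1) * math.pi)


def b_n(n: int) -> float:
    """b_n = 2 / ((4n + 3) pi), so 1/b_n = (4n + 3) pi / 2."""
    return 2.0 / ((4 * n + 3) * math.pi)


if __name__ == "__main__":
    for n in (1, 10, 100):
        # Both inputs shrink towards 0 while the outputs stay near +1 and -1.
        print(n, a_n(n), f(a_n(n)), b_n(n), f(b_n(n)))
```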

The results we had about sums, products, and ratios of limits of sequences also apply to limits
of functions. For example, if both
\[ \lim_{x\to a} f(x) \quad \text{and} \quad \lim_{x\to a} g(x) \]
exist, then so does $\lim_{x\to a} (f+g)(x)$ and indeed
\[ \lim_{x\to a} (f+g)(x) = \lim_{x\to a} f(x) + \lim_{x\to a} g(x). \]

There is also a Squeeze theorem for functions that can be formulated in a similar way as we did for
sequences.

Theorem 4.3.10 (Comparison Theorem for functions, [2, Exercise 20.16]). Let a ∈ R, let I be an open
interval containing a and let f, g be real functions defined everywhere on I except possibly at a. If both
f and g have limits as x approaches a and
\[ f(x) \le g(x), \quad \text{for all } x \in I \setminus \{a\}, \]
then
\[ \lim_{x\to a} f(x) \le \lim_{x\to a} g(x). \]


Remark 4.3.11. If
\[ f(x) < g(x), \quad \text{for all } x \in I \setminus \{a\}, \]
then we can still only conclude that
\[ \lim_{x\to a} f(x) \le \lim_{x\to a} g(x). \]
That is, the strict inequality
\[ \lim_{x\to a} f(x) < \lim_{x\to a} g(x) \]
might be false. See for example the functions f(x) = 0 and $g(x) = x^2$ with a = 0.


5 Differentiability on R
5.1 Introduction
Definition 5.1.1 ([2, Definition 28.1]). A real function f is said to be differentiable at a point a ∈ R if f
is defined on some open interval containing a and
\[ f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} \]
exists. The number f′(a) is called the derivative of f at the point a.

This definition is often given in a different form which is the result of writing x = a + h.

Definition 5.1.2. A real function f is said to be differentiable at a point a ∈ R if f is defined on some
open interval containing a and
\[ f'(a) = \lim_{h\to 0} \frac{f(a+h) - f(a)}{h} \]
exists. The number f′(a) is called the derivative of f at the point a.

If we plot the function f as a graph (x, f(x)) then f′(a) is the slope of the tangent line to the graph at
the point (a, f(a)). To see this we notice that $\frac{f(a+h) - f(a)}{h}$ is the slope of the chord passing through the points
(a, f(a)) and (a + h, f(a + h)) on the graph of f. If we let h → 0 the slope of the chord will approximate
the slope of the tangent line.
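
As a numerical illustration of the chord picture, the following minimal Python sketch computes chord slopes for the sample function f(x) = x² at a = 1. Both the sample function and the helper name `chord_slope` are choices made here for illustration. The slopes approach the tangent slope 2 as h shrinks, from either side.

```python
from typing import Callable


def chord_slope(f: Callable[[float], float], a: float, h: float) -> float:
    """Slope of the chord through (a, f(a)) and (a + h, f(a + h)), for h != 0."""
    return (f(a + h) - f(a)) / h


def square(x: float) -> float:
    # Sample function chosen for illustration; its derivative at 1 is 2.
    return x * x


if __name__ == "__main__":
    for k in range(1, 7):
        h = 10.0 ** (-k)
        print(h, chord_slope(square, 1.0, h), chord_slope(square, 1.0, -h))
```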

If f is differentiable at every point of a set E then f′ is a function with domain E. This function may
be denoted in several ways (you will encounter all of these):
\[ f' = f^{(1)} = \frac{df}{dx} = D_x f. \]
When y = f(x) the notation dy/dx or y′ is also used.
Depending on the function, differentiation can be performed more than once. The higher derivatives
are defined by induction: if n ∈ N then
\[ f^{(n+1)}(a) = (f^{(n)})'(a), \]
provided the derivative exists. Again, various notations are used for the higher derivatives of the function f(x),
such as
\[ f^{(n)} = \frac{d^n f}{dx^n} = D_x^n f. \]


The second derivative, for example, can be written as $f^{(2)}$ or f″. We say that f is twice differentiable at
a point a if f″(a) exists.

Differentiability and continuity are related. Every differentiable function is continuous, but not every
continuous function is differentiable. There are continuous functions f : R → R that are NOT
differentiable at any point; however, these can be quite complicated to define. It is far easier to find
a function that is continuous everywhere but not differentiable at a single point. One such function is
f(x) = |x|, which fails to be differentiable only at x = 0.
Indeed, since x → 0 implies that |x| → 0, f is continuous at 0. Considering the two limits
\[ \lim_{h\to 0^+} \frac{f(h) - f(0)}{h} = \lim_{h\to 0^+} \frac{h}{h} = 1, \qquad \lim_{h\to 0^-} \frac{f(h) - f(0)}{h} = \lim_{h\to 0^-} \frac{-h}{h} = -1, \]
we see that they are not equal and hence $\lim_{h\to 0} \frac{f(h) - f(0)}{h}$ does not exist. Hence f is NOT differentiable
at 0.
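
The two one-sided difference quotients of f(x) = |x| at 0 can also be tabulated directly. This is a small illustrative sketch (the name `quotient` is chosen here); for this particular function the quotients are exactly 1 for h > 0 and exactly −1 for h < 0.

```python
def quotient(h: float) -> float:
    """Difference quotient (|h| - |0|) / h of f(x) = |x| at 0, for h != 0."""
    return (abs(h) - abs(0.0)) / h


# Quotients for h -> 0+ and for h -> 0-.
right = [quotient(10.0 ** (-k)) for k in range(1, 8)]
left = [quotient(-(10.0 ** (-k))) for k in range(1, 8)]

if __name__ == "__main__":
    print(right)
    print(left)
```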

Theorem 5.1.3 ([2, Theorem 28.2]). If f is differentiable at a, then f is continuous at a.

Proof. Suppose that f is differentiable at a and note that for x ≠ a we have
\[ f(x) = \frac{f(x) - f(a)}{x - a}\,(x - a) + f(a). \]
We will use this fact often in this chapter. Here f(a) is constant and the limits, as x → a, of both (x − a)
and $\frac{f(x) - f(a)}{x - a}$ exist. We get
\begin{align*}
\lim_{x\to a} f(x) &= \lim_{x\to a} (x - a) \cdot \lim_{x\to a} \frac{f(x) - f(a)}{x - a} + f(a) \\
&= 0 \cdot f'(a) + f(a) \\
&= f(a).
\end{align*}
That is to say, f is continuous at a.

As with continuity, it is convenient to define “one-sided” derivatives to deal with functions whose
domains are closed (or half-open) intervals. We will briefly discuss what it means for a function to be
differentiable on an interval I. The important distinction here is that we once again consider one-sided
limits.

Definition 5.1.4. Let I be a non-degenerate interval.


(i) A function f : I → R is said to be differentiable on I if
\[ f_I'(a) := \lim_{x\to a^{I}} \frac{f(x) - f(a)}{x - a} \]
exists and is finite at every a ∈ I.

(ii) f is said to be continuously differentiable on I if $f_I'$ exists and is continuous on I.

Notice that if a ∈ I is not an endpoint of I then $f_I'(a)$ is just f′(a). The difference is at the endpoints.
If I = [a, b] then the limits taken at the endpoints are
\[ \lim_{h\to 0^+} \frac{f(a+h) - f(a)}{h} \quad \text{and} \quad \lim_{h\to 0^-} \frac{f(b+h) - f(b)}{h}. \]
In what follows we usually drop the subscript I from $f_I'$ and just write f′ (this is slightly “sloppy” at
the endpoints).
Example 5.1.5. Show that $f(x) = x^{3/2}$ is differentiable on [0, ∞).
For x > 0 the usual power rule implies that $f'(x) = \frac{3}{2} x^{\frac{3}{2} - 1} = \frac{3}{2}\sqrt{x}$. At a = 0, by the definition:
\[ f'(0) = \lim_{h\to 0^+} \frac{h^{3/2} - 0}{h} = \lim_{h\to 0^+} \sqrt{h} = 0. \]

Let I be a non-degenerate interval. For each n ∈ N ∪ {0}, we write $C^n(I)$ to denote the collection
of real functions whose n-th derivative exists and is continuous on I. Hence $C^1(I)$ is the collection of
all real functions that are continuously differentiable on I. We also abbreviate $C^0(I)$ to $C(I)$; this is the
collection of all real functions that are continuous on I (using the convention that the 0th derivative is the
function itself).
Furthermore $C^\infty(I)$ means the intersection $\bigcap_{n\in\mathbb{N}} C^n(I)$, that is to say, all functions whose n-th
derivative exists and is continuous on I for all natural numbers n. It is hopefully clear that we have
$C^m(I) \subset C^n(I)$ if m > n.

Note that not every function that is differentiable on R belongs to $C^1(\mathbb{R})$. It is necessary that the
function is continuously differentiable, i.e., f′ is continuous.

Example 5.1.6. The function
\[ f(x) = \begin{cases} x^2 \sin(1/x), & x \neq 0, \\ 0, & x = 0 \end{cases} \]
is differentiable on any interval I but is not continuously differentiable on any interval containing 0.

Solution: Using the definition, and that |sin(1/x)| ≤ 1, we see
\[ f'(0) = \lim_{h\to 0} \frac{h^2 \sin(1/h) - 0}{h} = \lim_{h\to 0} h \sin(1/h) = 0. \]
We know from earlier calculus courses that
\[ f'(x) = 2x \sin(1/x) - \cos(1/x), \quad x \neq 0. \]
This means that f is differentiable on R, but the function f′ is not continuous at 0 since the limit $\lim_{x\to 0} f'(x)$
does not exist (this is due to the oscillation of cos(1/x) between −1 and 1 as x → 0).
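
Both claims of Example 5.1.6 can be observed numerically. The sketch below is an illustration only: it evaluates the difference quotient at 0, which is bounded by |h|, and the derivative formula at the sample points 1/(kπ), where the cosine term equals (−1)^k. The sample points and all function names are choices made here, not part of the notes.

```python
import math


def f(x: float) -> float:
    """f(x) = x^2 sin(1/x) for x != 0 and f(0) = 0."""
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0


def quotient_at_zero(h: float) -> float:
    """Difference quotient (f(h) - f(0)) / h, which equals h sin(1/h)."""
    return (f(h) - f(0.0)) / h


def fprime(x: float) -> float:
    """Derivative formula 2x sin(1/x) - cos(1/x), valid for x != 0."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


def sample_point(k: int) -> float:
    """x_k = 1/(k pi): here sin(1/x_k) = 0 and cos(1/x_k) = (-1)^k."""
    return 1.0 / (k * math.pi)


if __name__ == "__main__":
    for k in range(1, 7):
        h = 10.0 ** (-k)
        # The quotient shrinks with h, while f' keeps jumping between -1 and +1.
        print(h, quotient_at_zero(h), fprime(sample_point(k)))
```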
Also, if f is differentiable on two sets we cannot conclude that f is differentiable on their union.
Consider the example f(x) = |x| on [−1, 0] and [0, 1] and then on [−1, 1] = [−1, 0] ∪ [0, 1].

5.2 Differentiability Theorems


We shall establish several familiar theorems (from calculus) about derivatives.
Theorem 5.2.1 ([2, Theorem 28.3]). Let f, g be real functions and α ∈ R. If f and g are differentiable
at a, then f + g, f · g, αf and (if g(a) ≠ 0) also f/g are all differentiable at a. We have
\begin{align*}
(\alpha f)'(a) &= \alpha f'(a) && \text{(linear)}, \\
(f + g)'(a) &= f'(a) + g'(a) && \text{(sum rule)}, \\
(f \cdot g)'(a) &= f'(a) g(a) + f(a) g'(a) && \text{(product rule)}, \\
\left( \frac{f}{g} \right)'(a) &= \frac{f'(a) g(a) - f(a) g'(a)}{(g(a))^2} && \text{(quotient rule)}.
\end{align*}
Proof. We prove the product rule here. By Theorem 5.1.3 there exist an open interval I ∋ a and functions
F : I → R, G : I → R continuous at a (the difference quotients of f and g, extended by F(a) = f′(a) and
G(a) = g′(a)) such that
\[ f(x) = F(x)(x - a) + f(a), \qquad g(x) = G(x)(x - a) + g(a), \qquad x \in I. \]
It follows that
\[ f(x) g(x) = f(a) g(a) + [F(x) g(a) + G(x) f(a) + F(x) G(x)(x - a)](x - a). \]
Let H(x) = F(x)g(a) + G(x)f(a) + F(x)G(x)(x − a) for x ∈ I. It follows that H is continuous at a, as
sums and products of functions continuous at a point are also continuous there. Hence, since
\[ f(x) g(x) = f(a) g(a) + H(x)(x - a), \]
by Theorem 5.1.3 we have that
\begin{align*}
(fg)'(a) &= \lim_{x\to a} \frac{f(x) g(x) - f(a) g(a)}{x - a} = \lim_{x\to a} H(x) = H(a) = F(a) g(a) + G(a) f(a) + F(a) G(a)(a - a) \\
&= f'(a) g(a) + g'(a) f(a).
\end{align*}

Theorem 5.2.2 (Chain Rule [2, Theorem 28.4]). Let f , g be real functions. If f is differentiable at a and
g is differentiable at f (a), then g ◦ f is differentiable at a and

(g ◦ f )′ (a) = g′ ( f (a)) f ′ (a).

Proof. By Theorem 5.1.3 there exist open intervals I ∋ a and J ∋ f (a) and functions F : I → R continuous
at a, G : J → R continuous at f (a) such that

f (x) = F(x)(x − a) + f (a), x ∈ I, g(y) = G(y)(y − f (a)) + g( f (a)), y ∈ J.

By making I smaller (if necessary) we may assume that f (x) ∈ J for all x ∈ I (due to continuity of f at
a). Fix x ∈ I. We can write:

(g ◦ f )(x) = g( f (x)) = G( f (x))( f (x) − f (a)) + g( f (a)) = G( f (x))F(x)(x − a) + (g ◦ f )(a).

We see that if we set H(x) = G( f (x))F(x) for x ∈ I, then

(g ◦ f )(x) = H(x)(x − a) + (g ◦ f )(a).

Moreover, since F is continuous at a, f is continuous at a and G is continuous at f(a), the function H is
continuous at a. It follows that
\[ (g \circ f)'(a) = \lim_{x\to a} \frac{(g \circ f)(x) - (g \circ f)(a)}{x - a} = H(a) = G(f(a)) F(a) = g'(f(a)) f'(a). \]

Theorem 5.2.3 (Power Rule). (i) If n ∈ N then $(x^n)' = n x^{n-1}$ for all x ∈ R. [2, Section 28, Example 3]

(ii) If q ∈ Q then $(x^q)' = q x^{q-1}$ for all x > 0.


Proof. (i) Recall the algebraic formula
\[ a^n - b^n = (a - b)(a^{n-1} + a^{n-2} b + \cdots + a b^{n-2} + b^{n-1}), \quad \text{for } n \in \mathbb{N}. \]
Let $f(x) = x^n$. Using the definition of the derivative and the formula above we have
\[ f'(a) = \lim_{x\to a} \frac{x^n - a^n}{x - a} = \lim_{x\to a} [x^{n-1} + x^{n-2} a + \cdots + x a^{n-2} + a^{n-1}] = n a^{n-1}, \]
since the brackets [. . . ] contain exactly n terms and each of them has limit $a^{n-1}$ as x → a.
(ii) First consider q = 1/n where n ∈ N. Then again from the definition of the derivative and the formula above
we have for $f(x) = x^{1/n}$, x > 0, a > 0:
\begin{align*}
f'(a) &= \lim_{x\to a} \frac{x^{1/n} - a^{1/n}}{x - a} \\
&= \lim_{x\to a} \frac{x^{1/n} - a^{1/n}}{(x^{1/n} - a^{1/n})[x^{(n-1)/n} + x^{(n-2)/n} a^{1/n} + \cdots + x^{1/n} a^{(n-2)/n} + a^{(n-1)/n}]} = \frac{1}{n a^{(n-1)/n}}.
\end{align*}
This holds since the term $x^{1/n} - a^{1/n}$ cancels out and what remains in the denominator are n terms, each
of which has limit $a^{(n-1)/n}$ as x → a.
Similarly, we can verify the formula for q = −1/n, n ∈ N, since $f(x) = x^{-1/n}$ can be written as a
composition g ◦ h where g(z) = 1/z and $h(x) = x^{1/n}$. We can differentiate both of the functions (g using
the quotient rule; h we have already done above).
Finally consider any q = m/n where m ∈ N and n ∈ Z \ {0}. We claim that $f(x) = x^{m/n}$ is
differentiable for every x > 0. This follows from the fact that we can write f = g ◦ h where $g(z) = z^m$ and
$h(x) = x^{1/n}$. From (i) we have that g is differentiable and from (ii) (the parts just above) we have that h is
also differentiable for x > 0. Hence by the chain rule f must be differentiable. Applying the chain rule
we have:
\[ f'(a) = g'(h(a))\, h'(a) = m \left( a^{1/n} \right)^{m-1} \cdot \frac{1}{n a^{(n-1)/n}} = \frac{m}{n}\, a^{\frac{m-1}{n} - \frac{n-1}{n}} = \frac{m}{n}\, a^{\frac{m}{n} - 1} = q a^{q-1}. \]

5.3 Mean Value Theorem


The mean value theorem formalises the relationship between the derivative of a function and the slope
of one of its chords. We begin with a special case of this theorem.

Theorem 5.3.1 (Rolle’s Theorem, [2, Theorems 29.1, 29.2]). Suppose a, b ∈ R with a < b. If f is
continuous on [a, b], differentiable on (a, b) and f (a) = f (b) then f ′ (c) = 0 for some c ∈ (a, b).


Proof. By the extreme value theorem f attains a minimum m and a maximum M somewhere on
[a, b]. If m = M then f is constant, hence f′ = 0 everywhere on (a, b) and the claim follows.
Otherwise m < M and, recalling f(a) = f(b) from our hypothesis, we have either f(a) = f(b) > m or
f(a) = f(b) < M or both; hence at least one of m or M is attained at a point c ∈ (a, b). Let us assume
that f(c) = M; the other case is analogous. Since M is the largest value the function attains, we have for
all h such that c + h ∈ [a, b]:
\[ f(c + h) - f(c) = f(c + h) - M \le 0. \]
It follows that
\[ f'(c) = \lim_{h\to 0^+} \frac{f(c+h) - f(c)}{h} \le 0, \quad \text{and} \quad f'(c) = \lim_{h\to 0^-} \frac{f(c+h) - f(c)}{h} \ge 0, \]
hence f′(c) = 0 from these two inequalities.

Remark 5.3.2. Both the continuity and the differentiability hypotheses in this theorem are required.

If f (a) ̸= f (b) then Rolle’s theorem cannot be applied directly, instead we have the following mean
value theorem:

Theorem 5.3.3 (Mean Value Theorem [2, Theorem 29.3]). Suppose a, b ∈ R with a < b.

(i) If f is continuous on [a, b] and differentiable on (a, b) then there is c ∈ (a, b) such that

f (b) − f (a) = f ′ (c)(b − a).

(ii) If f , g are continuous on [a, b] and differentiable on (a, b) then there is c ∈ (a, b) such that

f ′ (c)(g(b) − g(a)) = g′ (c)( f (b) − f (a)).

Proof. (i) is a special case of (ii) for g(x) = x. Hence it suffices to prove (ii). Consider a function

h(x) = f (x)(g(b) − g(a)) − g(x)( f (b) − f (a)).

Due to our assumptions on continuity and differentiability of f , g we see that h must satisfy the same
assumptions. Also h(a) = h(b) and hence Rolle’s Theorem applies. It follows that there is a point
c ∈ (a, b) where h′ (c) = 0, i.e.,

h′ (c) = f ′ (c)(g(b) − g(a)) − g′ (c)( f (b) − f (a)) = 0.
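
Part (i) of the mean value theorem only asserts that a suitable point c exists. For a concrete function one can search for such a point numerically. The sketch below is one possible approach, not the method of the notes: it applies bisection to f′(c) minus the chord slope, under the extra assumptions (made here) that f′ is continuous, can be evaluated on all of [a, b], and that this difference changes sign between a and b. The example f(x) = x³ on [0, 2] and the name `mvt_point` are also choices made for illustration.

```python
from typing import Callable


def mvt_point(
    f: Callable[[float], float],
    fprime: Callable[[float], float],
    a: float,
    b: float,
    tol: float = 1e-12,
) -> float:
    """Approximate some c in (a, b) with f'(c) (b - a) = f(b) - f(a) by bisection."""
    slope = (f(b) - f(a)) / (b - a)

    def g(c: float) -> float:
        # g vanishes exactly at the points c promised by the theorem.
        return fprime(c) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("bisection needs a sign change of f' - slope on [a, b]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def cube(x: float) -> float:
    return x ** 3


def cube_prime(x: float) -> float:
    return 3 * x * x


if __name__ == "__main__":
    # For x^3 on [0, 2] the chord slope is 4, so 3 c^2 = 4 and c = 2 / sqrt(3).
    print(mvt_point(cube, cube_prime, 0.0, 2.0))
```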


The mean value theorem has many uses and applications in analysis. It can be used to prove certain
inequalities.

Example 5.3.4. Show that $e^x > 1 + x$ for all x > 0.

Solution. Consider $f(x) = e^x - 1 - x$ for x ≥ 0 and fix x > 0. By the mean value theorem there exists c ∈ (0, x) such
that
\[ f(x) = f(x) - f(0) = f'(c)(x - 0) = x f'(c). \]
Since $f'(c) = e^c - 1 > 0$ for c > 0, and x > 0, we see that f(x) > 0, which gives us the desired inequality.

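Example 5.3.5 (Bernoulli’s inequality). Let α > 0, δ ≥ −1. Then
\begin{align*}
(1 + \delta)^\alpha &\le 1 + \alpha\delta, && \text{if } \alpha \in (0, 1], \\
(1 + \delta)^\alpha &\ge 1 + \alpha\delta, && \text{if } \alpha \in [1, \infty).
\end{align*}

Solution. Here we will consider only the case α ∈ (0, 1]. The other case is left as an exercise. If δ = 0
both sides equal 1, so assume δ ≠ 0. Let $f(x) = x^\alpha$. By the mean value theorem
\[ f(1 + \delta) = f(1) + \alpha\delta c^{\alpha - 1} \]
for some c between 1 and 1 + δ. There are two cases to consider. If δ > 0 then c > 1 and hence
$c^{\alpha-1} \le 1$ as α − 1 ≤ 0. This gives
\[ \alpha\delta c^{\alpha-1} \le \alpha\delta. \]
On the other hand, if −1 ≤ δ < 0 then 0 < c < 1 and $c^{\alpha-1} \ge 1$. Multiplying by the negative number αδ
reverses the inequality and yields again
\[ \alpha\delta c^{\alpha-1} \le \alpha\delta. \]
In both cases $(1 + \delta)^\alpha = 1 + \alpha\delta c^{\alpha-1} \le 1 + \alpha\delta$.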
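
Bernoulli’s inequality is easy to spot-check on a grid of values. The following Python sketch is only a numerical sanity check of the statement (in both ranges of α), with a small tolerance added here to absorb floating-point rounding in the equality cases; the function name `bernoulli_holds` and the chosen grid are illustrative assumptions.

```python
def bernoulli_holds(alpha: float, delta: float, tol: float = 1e-12) -> bool:
    """Check (1 + delta)^alpha against 1 + alpha * delta for alpha > 0, delta >= -1."""
    lhs = (1 + delta) ** alpha
    rhs = 1 + alpha * delta
    if alpha <= 1:
        # Case alpha in (0, 1]: the power lies below the linear bound.
        return lhs <= rhs + tol
    # Case alpha in [1, infinity): the power lies above the linear bound.
    return lhs >= rhs - tol


ALPHAS = [0.1, 0.5, 1.0, 2.0, 3.5]
DELTAS = [-1.0, -0.5, 0.0, 0.5, 3.0]

if __name__ == "__main__":
    for alpha in ALPHAS:
        print(alpha, [bernoulli_holds(alpha, delta) for delta in DELTAS])
```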

5.4 Monotone functions and the Inverse function theorem


Definition 5.4.1 ([2, Definition 29.6]). Let E ⊂ R, E ≠ ∅ and f : E → R.

(i) f is called increasing (strictly increasing) on E if for all $x_1, x_2 \in E$, $x_1 < x_2$ we have $f(x_1) \le f(x_2)$
($f(x_1) < f(x_2)$ in the strictly increasing case).

(ii) f is called decreasing (strictly decreasing) on E if for all $x_1, x_2 \in E$, $x_1 < x_2$ we have $f(x_1) \ge f(x_2)$
($f(x_1) > f(x_2)$ in the strictly decreasing case).

(iii) f is called monotone (strictly monotone) on E if f is either increasing or decreasing (respectively,
either strictly increasing or strictly decreasing) on E.

For example, $f(x) = x^2$ is strictly monotone on [−1, 0] and on [0, 1] but it is not monotone on [−1, 1].

Theorem 5.4.2 ([2, Corollary 29.7]). Let a < b be real and f be continuous on [a, b] and differentiable
on (a, b).

(i) If f ′ (x) > 0 for all x ∈ (a, b) then f is strictly increasing on [a, b].

(ii) If f ′ (x) < 0 for all x ∈ (a, b) then f is strictly decreasing on [a, b].

(iii) If f′(x) = 0 for all x ∈ (a, b) then f is constant on [a, b].

Proof. In each case use the mean value theorem. It is a good exercise to write down the details.

Exercise 5.4.3. Consider the converse statement of the previous theorem. That is, let a < b and let f be a
function continuous on [a, b] and differentiable on (a, b).

• If f is strictly increasing on [a, b] then is f ′ (x) > 0 for all x ∈ (a, b)?

• If f is strictly decreasing on [a, b] then is f ′ (x) < 0 for all x ∈ (a, b)?

• If f is constant on [a, b] then is f ′ (x) = 0 for all x ∈ (a, b)?

In each case either prove the statement is true, or give a counterexample.

Theorem 5.4.4. Let f be 1-1 and continuous on an interval I. Then f is strictly monotone on I and the
inverse function $f^{-1}$ is continuous and strictly monotone on f(I).

Proof. We may assume that I contains at least two points, otherwise the claim is trivial. Let a, b ∈ I,
a < b. Since f is 1-1 this implies that f(a) < f(b) or f(a) > f(b). We assume f(a) < f(b), with the
other case treated similarly (multiply by −1 to bring it back to this case). In search of a contradiction
we suppose that f is not strictly monotone on [a, b], so that there exists a point c ∈ (a, b) such that
f(c) ∉ (f(a), f(b)); hence one of the following occurs:
\[ f(c) \le f(a) < f(b), \qquad f(a) < f(b) \le f(c). \]
In the first case, the intermediate value theorem (for continuous functions) shows that there is $x_1 \in [c, b)$
such that $f(x_1) = f(a)$. In the second case, the same argument gives some $x_2 \in (a, c]$ such that
$f(x_2) = f(b)$. Since $x_1 \neq a$ and $x_2 \neq b$ we arrive at a contradiction to the fact that f is 1-1. Hence f is strictly
monotone.
The function f is 1-1, hence f : I → f(I) is bijective and the inverse $f^{-1} : f(I) \to I$ is well defined.
The composition $f^{-1} \circ f$ is the identity function on I, that is, $f^{-1} \circ f(x) = x$ for all x ∈ I.
We take a different approach to Wade. Let f(a) ∈ f(I); we will show that $f^{-1}$ is continuous at f(a).
To that end, let ε > 0. Since f is continuous and strictly monotone, f will map the interval (a − ε, a + ε) to an open interval
J containing f(a). Let δ > 0 be such that (f(a) − δ, f(a) + δ) ⊂ J. Then for any y ∈ (f(a) − δ, f(a) + δ)
we have y ∈ J and so $f^{-1}(y) \in (a - \varepsilon, a + \varepsilon)$, using the fact that $f^{-1}$ is the inverse to f. That is to say,
$f^{-1}$ is continuous at f(a).
To see that $f^{-1}$ is strictly monotone, we note that $f^{-1}$ is 1-1 and continuous and so the first part of
the theorem applies.

Exercise 5.4.5. In the proof that $f^{-1}$ is continuous we used the continuity of f and $f^{-1} \circ f$. Will this
work more generally? Suppose f and g ◦ f are continuous; must the function g be continuous? Note we
saw earlier that if f and g are continuous then g ◦ f is continuous whenever it is defined.

Theorem 5.4.6 (Inverse function theorem [2, Theorem 29.9]). Let f be 1-1 and continuous on an open
interval I. If a ∈ f(I) and f′ at the point $f^{-1}(a)$ exists and is nonzero, then $f^{-1}$ is differentiable at a and
\[ (f^{-1})'(a) = \frac{1}{f'(f^{-1}(a))}. \]

Proof. By the previous theorem we already know that $f^{-1}$ exists, is continuous and strictly monotone.
Let $x_0 = f^{-1}(a) \in I$. Since we assume that I is open it follows that there are c, d ∈ R such that
$x_0 \in (c, d) \subset I$. Hence a is a point between f(c), f(d) and therefore we can choose h ≠ 0 sufficiently small
such that a + h is still between f(c), f(d) and hence $f^{-1}(a + h)$ is well defined and belongs to the interval
(c, d).
Let $x = f^{-1}(a + h)$. Then $f(x) - f(x_0) = a + h - a = h$. Since $f^{-1}$ is continuous, $x \to x_0$ if and only
if h → 0. Hence
\[ \lim_{h\to 0} \frac{f^{-1}(a+h) - f^{-1}(a)}{h} = \lim_{x\to x_0} \frac{x - x_0}{f(x) - f(x_0)} = \frac{1}{f'(x_0)}. \]
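
The formula of the inverse function theorem can be checked numerically on an example. The sketch below uses the sample function f(x) = x³ + x, which is strictly increasing on R, and computes its inverse by bisection; the function, the bracketing interval [−10, 10] and all names are assumptions made here for illustration. The difference quotient of the inverse at a = f(1) = 2 should be close to 1/f′(1) = 1/4.

```python
def f(x: float) -> float:
    # Sample strictly increasing function chosen for illustration.
    return x ** 3 + x


def fprime(x: float) -> float:
    return 3 * x * x + 1


def f_inv(y: float, lo: float = -10.0, hi: float = 10.0, tol: float = 1e-13) -> float:
    """Inverse of f by bisection; assumes f(lo) <= y <= f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_quotient(a: float, h: float) -> float:
    """Difference quotient of the inverse function at a, for h != 0."""
    return (f_inv(a + h) - f_inv(a)) / h


if __name__ == "__main__":
    a = f(1.0)  # a = 2, so the inverse maps a back to x0 = 1
    for k in range(1, 6):
        h = 10.0 ** (-k)
        print(h, inverse_quotient(a, h), 1.0 / fprime(1.0))
```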

Theorem 5.4.7 (L’Hopital’s Rule [2, Theorem 30.2]). Let a be an extended real number and I an interval
that either contains a or has a as an endpoint.
Let f, g be differentiable on I \ {a} and g(x) ≠ 0 ≠ g′(x) for all x ∈ I \ {a}. Suppose further that
\[ A = \lim_{x\to a,\ x\in I} f(x) = \lim_{x\to a,\ x\in I} g(x) \]
is either 0 or ∞.
If $B = \lim_{x\to a,\ x\in I} \frac{f'(x)}{g'(x)}$ exists as an extended real number, then
\[ \lim_{x\to a,\ x\in I} \frac{f(x)}{g(x)} = \lim_{x\to a,\ x\in I} \frac{f'(x)}{g'(x)}. \]

Proof. There are many cases to consider; we only outline the case when B ∈ R, A = 0 and a ∈ R.
Consider an arbitrary sequence $x_k \to a$ as k → ∞ such that $x_k \in I \setminus \{a\}$. By the sequential
characterization of limits it suffices to show that
\[ \frac{f(x_k)}{g(x_k)} \to B, \quad \text{as } k \to \infty. \]
In general, the functions f, g might not be defined at the point x = a, but since
\[ 0 = \lim_{x\to a,\ x\in I} f(x) = \lim_{x\to a,\ x\in I} g(x), \]
if we set f(a) = g(a) = 0 then f, g are defined on I ∪ {a} and both are continuous at the point a. It
follows from part (ii) of the mean value theorem that there is c between a and $x_k$ such that
\[ \frac{f(x_k)}{g(x_k)} = \frac{f(x_k) - f(a)}{g(x_k) - g(a)} = \frac{f'(c)}{g'(c)} \]
(notice that c depends on k here). Because c lies between $x_k$ and a, we have c → a as k → ∞. Therefore,
\[ \lim_{k\to\infty} \frac{f(x_k)}{g(x_k)} = \lim_{c\to a,\ c\in I} \frac{f'(c)}{g'(c)} = B. \]

Example 5.4.8. Find $\lim_{x\to 0^+} x \log x$.
Solution. To use L’Hopital’s Rule we write x log x as (log x)/(1/x). It becomes a limit of type ∞/∞.
\[ \lim_{x\to 0^+} x \log x = \lim_{x\to 0^+} \frac{\log x}{1/x} = \lim_{x\to 0^+} \frac{1/x}{-1/x^2} = \lim_{x\to 0^+} (-x) = 0. \]
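
The limit in Example 5.4.8 can be observed numerically. In the following illustrative sketch (function names are chosen here), the product x log x and the ratio of derivatives (1/x)/(−1/x²) = −x are tabulated for a few small positive x; both tend to 0, although the product does so more slowly.

```python
import math


def x_log_x(x: float) -> float:
    """The product x * log(x) for x > 0 (natural logarithm)."""
    return x * math.log(x)


def derivative_ratio(x: float) -> float:
    """Ratio (1/x) / (-1/x^2) of the derivatives of log x and 1/x; equals -x."""
    return (1.0 / x) / (-1.0 / (x * x))


if __name__ == "__main__":
    for k in (1, 2, 4, 8, 12):
        x = 10.0 ** (-k)
        print(x, x_log_x(x), derivative_ratio(x))
```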

5.5 Taylor’s theorem


Definition 5.5.1. Let n ∈ N and a < b be extended real numbers. If f : (a, b) → R is a function
differentiable n times at a point $x_0 \in (a, b)$, we call the polynomial
\[ P_n^{f,x_0}(x) = f(x_0) + \sum_{k=1}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \]
Taylor’s polynomial of degree n at $x_0$.

Note that we could also write
\[ P_n^{f,x_0}(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \]
if we adopt the conventions $f^{(0)} = f$, 0! = 1 and $(x - x_0)^0 = 1$ even for $x = x_0$.


The idea is that near the point $x_0$ this polynomial gives a good approximation of the function f. The
following theorem allows us to understand how good the approximation may be.

Theorem 5.5.2 (Taylor’s formula [2, Theorem 31.3]). Let n ∈ N and a < b be extended real numbers. If
f : (a, b) → R and if $f^{(n+1)}$ exists on (a, b), then for each $x, x_0 \in (a, b)$ there exists a number c between
x and $x_0$, which depends on n, x and $x_0$, such that
\begin{align*}
f(x) &= f(x_0) + \sum_{k=1}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + \frac{f^{(n+1)}(c)}{(n+1)!} (x - x_0)^{n+1} \\
&= P_n^{f,x_0}(x) + \frac{f^{(n+1)}(c)}{(n+1)!} (x - x_0)^{n+1}.
\end{align*}
Proof. Assume that $x < x_0$ (the other case is similar). Consider two functions
\[ F(t) = \frac{(x - t)^{n+1}}{(n+1)!}, \qquad G(t) = f(x) - f(t) - \sum_{k=1}^{n} \frac{f^{(k)}(t)}{k!} (x - t)^k. \]
We claim that F, G are differentiable on (a, b) and that
\[ F'(t) = -\frac{(x - t)^n}{n!}, \qquad G'(t) = -\frac{f^{(n+1)}(t)}{n!} (x - t)^n. \]
By the generalized mean value theorem there exists $c \in (x, x_0)$ such that
\[ (F(x) - F(x_0))\, G'(c) = (G(x) - G(x_0))\, F'(c). \]
Since F(x) = G(x) = 0 this implies
\[ -F(x_0)\, G'(c) = -G(x_0)\, F'(c) \quad \text{or} \quad G(x_0) = F(x_0) \cdot \frac{G'(c)}{F'(c)}. \]
Hence
\[ f(x) - P_n^{f,x_0}(x) = f(x) - f(x_0) - \sum_{k=1}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k = \frac{(x - x_0)^{n+1}}{(n+1)!} f^{(n+1)}(c). \]


Example 5.5.3. The Taylor polynomial of the function $e^x$ of degree n at 0 is
\[ 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}. \]
The “error term” equals
\[ e^x - \left( 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \right) = \frac{e^c}{(n+1)!}\, x^{n+1}, \]
for some c between 0 and x. Recall that c depends on both n and x.

Calculate the error we make if we estimate the value of the number e using Taylor’s polynomial of
degree 10.
Solution: The error satisfies
\[ e - \sum_{k=0}^{10} \frac{1}{k!} = \frac{e^c}{(10+1)!}\, 1^{10+1} \le \frac{3}{11!} \approx 7.5 \times 10^{-8}, \]
since c ∈ (0, 1) and we estimate $e^c \le e^1 = e < 3$.
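
The estimate can be compared with the actual error in floating point. The sketch below is illustrative only: it takes `math.e` as the reference value of e, and the helper name `taylor_exp` is chosen here. It evaluates the degree-10 Taylor polynomial at x = 1 and compares the resulting error with the bound 3/11!.

```python
import math


def taylor_exp(x: float, n: int) -> float:
    """Taylor polynomial of e^x of degree n at 0, evaluated at x."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


# Actual error of the degree-10 approximation of e, and the bound from the example.
error = math.e - taylor_exp(1.0, 10)
bound = 3 / math.factorial(11)

if __name__ == "__main__":
    print(taylor_exp(1.0, 10))
    print(error, bound)
```

The actual error is smaller than the bound, as expected, since the estimate replaces the unknown factor $e^c$ by 3.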


References
[1] Martin Liebeck. A Concise Introduction to Pure Mathematics. Chapman & Hall / CRC, 2011.

[2] Kenneth A. Ross. Elementary Analysis. Springer, 2013.
