MAT2122 Multivariable Calculus
MAT 2122
Fall 2021
Alistair Savage
University of Ottawa
Preface 4
1 Differentiation 6
1.1 Open sets and boundaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Affine functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4 Affine approximations in one and two variables . . . . . . . . . . . . . . . . . 14
1.5 The derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.6 Differentiation rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.7 Paths, curves, and surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.8 Directional derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.9 Higher order derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.10 Taylor’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.11 The implicit function theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.12 The inverse function theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2 Extrema 44
2.1 Maximum and minimum values . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.2 The second derivative test . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.3 Constrained extrema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.4 Lagrange multipliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.5 Multiple constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.6 Finding global extrema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4 Change of variables 90
4.1 Two variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.2 Polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.3 Three variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.4 Cylindrical coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.5 Spherical coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Preface
These are notes for the course Multivariable Calculus (MAT 2122) at the University of Ottawa. This is a course on calculus in multiple dimensions, aimed at students majoring in mathematics or doing a joint degree with another subject. We will focus mainly on real-valued functions of several real variables. Many concepts will be discussed using the language of vectors and linear algebra, since this is the most natural setting for multivariable calculus. We'll see how much of the calculus you learned in previous courses generalizes to multiple dimensions. This will allow us to explore interesting new mathematics, such as integration over surfaces and three-dimensional regions.
Acknowledgements: I would like to thank Aaron Tikuisis and Tanya Schmah for sharing with
me their lecture notes for this course, upon which portions of these lecture notes are based.
The official textbooks for this course are the open access books [FRYc, FRYe]. Portions of
these notes follow those textbooks.
Alistair Savage
Conventions
We will use the following conventions in these notes:
We write elements of Rn either as tuples
    x = (x1, . . . , xn)    (1)
or as column matrices
    x = [ x1 ]
        [ ⋮  ]
        [ xn ],    (2)
interchangeably. Note that (1) is not the same as the row matrix [x1 · · · xn]. Rather, (1) is just a horizontally written notation (to save space) for the column matrix (2).
We write
    ∥x∥ = √(x1² + · · · + xn²)
for the norm of a vector. Note that some textbooks, including [FRYe], use the notation |x| for this norm.
Exercises
After most sections, there is a list of exercises. Some exercises are written directly in the
notes, but most refer to the problems book of the course textbooks. Sometimes the list of
exercises is quite long, but many of the problems are similar. In this case, you should attempt
a selection of them—enough so that you feel comfortable doing that type of question. Doing
exercises is the best way to learn the material!
Chapter 1
Differentiation
In this first chapter, we discuss differentiation in multiple dimensions. Some results will be
review from previous courses, while other material will be new.
Definition 1.1.1 (Open ball). Let a ∈ Rn and ε > 0. The open ball of radius ε centered at a is
    Bε(a) := {x ∈ Rn : ∥x − a∥ < ε}.
Definition 1.1.2 (Open set). A subset A ⊆ Rn is open if, for every a ∈ A, there exists ε > 0 such that Bε(a) ⊆ A. In other words, A is open if every point of A is the center of an open ball contained in A.
Example 1.1.3. Every open ball is an open set. Let's prove it! Choose b ∈ Rn and δ > 0. We will prove that A = Bδ(b) is open. Let a ∈ A. We must find ε > 0 such that Bε(a) ⊆ A. Since ∥a − b∥ < δ, we can take
    ε = δ − ∥a − b∥ > 0.
Then, for any x ∈ Bε(a),
    ∥x − b∥ ≤ ∥x − a∥ + ∥a − b∥    (triangle inequality)
            < ε + ∥a − b∥
            = δ − ∥a − b∥ + ∥a − b∥
            = δ.
Hence x ∈ Bδ(b) = A. Thus Bε(a) ⊆ A, and so A is open. △
Example 1.1.4. The set A = (0, ∞) × (0, ∞) ⊆ R2 is open. Indeed, if (a, b) ∈ A, then
a, b > 0. If we take ε = min{a, b}, then Bε (a, b) ⊆ A.
△
Example 1.1.5. The set A = [0, ∞) × (0, ∞) is not open. To see this, note that for any b > 0, we have (0, b) ∈ A. However, for all ε > 0, we have
    (−ε/2, b) ∈ Bε((0, b)) but (−ε/2, b) ∉ A.
Hence Bε((0, b)) ⊈ A. △
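The witness point in Example 1.1.5 is easy to check numerically. A minimal Python sketch (the helper names are ours, not from the notes):

```python
import math

# Hypothetical helpers: membership tests for an open ball and for the
# set A = [0, oo) x (0, oo) of Example 1.1.5.
def in_ball(x, center, eps):
    """Is x in the open ball B_eps(center)?"""
    return math.dist(x, center) < eps

def in_A(x):
    """Is x in A = [0, oo) x (0, oo)?"""
    return x[0] >= 0 and x[1] > 0

# For any eps > 0, the witness (-eps/2, b) lies in B_eps((0, b)) but not in A.
b = 1.0
for eps in (1.0, 0.1, 0.001):
    witness = (-eps / 2, b)
    assert in_ball(witness, (0.0, b), eps)  # inside the open ball
    assert not in_A(witness)                # but outside A
```

Since no ε works, no open ball centered at (0, b) fits inside A, which is exactly why A fails to be open.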
Recall the notation for the set difference of two sets:
    A \ B := {x : x ∈ A, x ∉ B}.
Definition 1.1.6 (Boundary). Let A ⊆ Rn. The boundary of A is the set of all points a ∈ Rn such that
    ∀ε > 0, Bε(a) ∩ A ≠ ∅ and Bε(a) \ A ≠ ∅.
We denote the boundary of A by ∂A.
Remark 1.1.7. In Section 8.4 we will give a different definition of boundary for surfaces in
R3 .
Example 1.1.8. Let b ∈ Rn and δ > 0. The boundary of the open ball Bδ(b) is the sphere
    ∂Bδ(b) = {x ∈ Rn : ∥x − b∥ = δ}.
In this case, none of the boundary points of Bδ(b) are in Bδ(b). The closed ball has the same boundary:
    ∂{x ∈ Rn : ∥x − b∥ ≤ δ} = {x ∈ Rn : ∥x − b∥ = δ}.
In this case, all of the boundary points are in the set itself. △
Example 1.1.9. If
A = [0, ∞) × (0, ∞)
then
∂A = {(x, 0) : x ≥ 0} ∪ {(0, y) : y ≥ 0}.
That is, the boundary consists of the non-negative x-axis together with the non-negative y-axis. △
Exercises.
1.1.1. For each of the following sets, state whether or not the set is open and describe its
boundary.
(a) [0, 1] × (0, 1)
(b) (0, 1) × (0, 1)
(c) [0, 1] × [0, 1]
(d) {(x, y) : y ≥ x2 }
(e) {(x, y) : y > x2 }
(f) R2
(g) ∅
1.2 Limits
The notion of a limit underpins all of calculus. In this section we give a precise definition of
a limit in the multivariable setting, and discuss several important properties of limits.
Definition 1.2.1 (Limit). Suppose A ⊆ Rn is an open set, f : A → Rm , a ∈ A ∪ ∂A, and
L ∈ Rm . Then we write
lim f (x) = L
x→a
if
∀ε > 0, ∃δ > 0 such that (x ∈ A, 0 < ∥x − a∥ < δ) =⇒ ∥f (x) − L∥ < ε.
(Note that the condition 0 < ∥x − a∥ is equivalent to the condition x ̸= a.)
Remarks 1.2.2. (a) As in single variable calculus, limits may not exist. However, if they
exist, they are unique.
(b) Note that Definition 1.2.1 allows a to lie in the boundary of A, and does not require
a ∈ A. Thus, in some situations, we can discuss the limit of a function at a point that
does not lie in the domain of that function.
Given f : A → Rm with A ⊆ Rn, we can write
    f(x) = (f1(x), . . . , fm(x)), where fi : A → R, 1 ≤ i ≤ m.
The fi are the component functions of f, and we write f = (f1, . . . , fm). The following proposition states that the coordinates of the limit of f are the limits of its component functions (if these all exist).
Proposition 1.2.3. Suppose A ⊆ Rn is an open set, f : A → Rm , a ∈ A ∪ ∂A, and
L = (L1 , . . . , Lm ) ∈ Rm . Then
lim f (x) = L
x→a
if and only if
lim fi (x) = Li for i ∈ {1, 2, . . . , m}.
x→a
In particular limx→a f (x) exists if and only if limx→a fi (x) exists for each i.
Proof. Suppose limx→a f(x) = L. Let ε > 0. Then there exists δ > 0 such that
    (x ∈ A, 0 < ∥x − a∥ < δ) =⇒ ∥f(x) − L∥ < ε.
Since |fi(x) − Li| ≤ ∥f(x) − L∥ for each i, we have
    (x ∈ A, 0 < ∥x − a∥ < δ) =⇒ |fi(x) − Li| ≤ ∥f(x) − L∥ < ε.
Hence limx→a fi(x) = Li for each i. The converse implication is proved similarly.
Remark 1.2.6. In light of Proposition 1.2.3 and Corollary 1.2.5, we will often restrict our
attention to real-valued functions f : A → R, A ⊆ Rn , when discussing limits and continuity.
We will see later that we can also make this simplification when studying derivatives. How-
ever, we cannot make analogous simplifications in the domain of a function. For example,
we will see that all the partial derivatives of a function may exist, even though the function
is not differentiable.
Given a ∈ Rn and a nonzero v ∈ Rn, consider the line ℓ(t) = a + tv, so that a = ℓ(0).
Proposition 1.2.8. Suppose limx→a f(x) = L. Then, for every line ℓ(t) = a + tv through a, we have limt→0 f(ℓ(t)) = L.
Warning 1.2.9. The converse of Proposition 1.2.8 is false! There exist functions f such that
limt→0 f (ℓ(t)) = L for all lines ℓ, but limx→a f (x) does not exist. See [FRYc, §2.1.1] for an
example.
Example 1.2.10. Consider the function
    f : R² \ {(0, 0)} → R, f(x, y) = (x² + 5xy + 3y²)/(|x| + |y|).
Along the line ℓ(t) = (t, 0) we have
    f(ℓ(t)) = f(t, 0) = (t² + 5t·0 + 3·0²)/(|t| + |0|) = t²/|t| = |t|.
Thus,
    limt→0 f(ℓ(t)) = limt→0 |t| = 0.
Now
    x²/(|x| + |y|) = |x| · |x|/(|x| + |y|) ≤ |x|(|x| + |y|)/(|x| + |y|) = |x|,
    5|xy|/(|x| + |y|) ≤ 5|x|(|x| + |y|)/(|x| + |y|) = 5|x|,
    3|y²|/(|x| + |y|) ≤ 3|y|(|x| + |y|)/(|x| + |y|) = 3|y|.
Therefore
    |f(x, y) − 0| ≤ |x| + 5|x| + 3|y| = 6|x| + 3|y|.
Since lim(x,y)→(0,0) (6|x| + 3|y|) = 0, we have
    lim(x,y)→(0,0) f(x, y) = 0. △
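The squeeze bound above is easy to probe numerically. A small Python sketch (illustrative only; the sample points are ours):

```python
def f(x, y):
    """f(x, y) = (x^2 + 5xy + 3y^2) / (|x| + |y|), for (x, y) != (0, 0)."""
    return (x**2 + 5*x*y + 3*y**2) / (abs(x) + abs(y))

def bound(x, y):
    """The squeeze bound 6|x| + 3|y| derived above."""
    return 6*abs(x) + 3*abs(y)

# sample points approaching the origin from several directions
pts = [(0.1, 0.2), (-0.01, 0.03), (0.001, -0.002), (-1e-4, -1e-4)]
for (x, y) in pts:
    # the bound holds at each point, and both sides shrink to 0
    assert abs(f(x, y)) <= bound(x, y)
```

As the points shrink toward the origin, both |f| and the bound tend to 0, consistent with the limit being 0.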
Example 1.2.11. Consider the function
    f : R² \ {(0, 0)} → R, f(x, y) = (x² − y²)/(x² + y²).
Does lim(x,y)→(0,0) f(x, y) exist? If so, what is it? Let's try approaching the origin along the two coordinate axes. First consider ℓ1(t) = (t, 0). Then
    f(ℓ1(t)) = f(t, 0) = (t² − 0²)/(t² + 0²) = 1.
Next consider ℓ2(t) = (0, t). Then
    f(ℓ2(t)) = f(0, t) = (0² − t²)/(0² + t²) = −1.
Since the limits along these two lines are different, Proposition 1.2.8 implies that lim(x,y)→(0,0) f(x, y) does not exist. △
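We can watch the two limits disagree numerically; a short Python sketch:

```python
def f(x, y):
    """f(x, y) = (x^2 - y^2) / (x^2 + y^2), for (x, y) != (0, 0)."""
    return (x**2 - y**2) / (x**2 + y**2)

# approach the origin along the two coordinate axes
along_x = [f(t, 0.0) for t in (0.1, 0.01, 0.001)]
along_y = [f(0.0, t) for t in (0.1, 0.01, 0.001)]

assert all(v == 1.0 for v in along_x)    # limit 1 along ell_1(t) = (t, 0)
assert all(v == -1.0 for v in along_y)   # limit -1 along ell_2(t) = (0, t)
```

Two different limiting values along two paths is exactly what rules out the existence of the two-variable limit.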
We conclude this section with some properties of limits. The proofs of these properties
are analogous to the proofs in the single-variable case, so we omit them. For the remainder
of this section, A ⊆ Rn is open, f, g : A → R, and a ∈ A ∪ ∂A.
Proposition 1.2.12 (Uniqueness of limits). If
    limx→a f(x) = K and limx→a f(x) = L,
then K = L.
Proposition 1.2.13 (Arithmetic of limits). Suppose limx→a f(x) and limx→a g(x) both exist. Then
    limx→a (f + g)(x) = limx→a f(x) + limx→a g(x),
    limx→a (fg)(x) = (limx→a f(x))(limx→a g(x)),
and, if limx→a f(x) ≠ 0,
    limx→a 1/f(x) = 1/(limx→a f(x)).
Corollary 1.2.14. If a ∈ A and f, g are both continuous at a, then so are f + g, fg, and 1/f (if defined).
Exercises.
1.2.1. Prove Proposition 1.2.8.
1.3 Affine functions
In this section we consider functions
    f : Rn → Rm, n, m ∈ N.
Recall that such a function is linear if f(x + y) = f(x) + f(y) and f(cx) = c f(x) for all x, y ∈ Rn and c ∈ R; equivalently, f(x) = Ax for some m × n matrix A. We say f is affine if f(x) = Ax + b for some m × n matrix A and some b ∈ Rm.
Intuitively, both linear and affine functions are those functions whose graphs are "flat". However, the graphs of linear functions are required to pass through the origin. Note that if x0 ∈ Rn is some fixed vector, then the function x ↦ A(x − x0) + b is also affine, since we have
    A(x − x0) + b = Ax + (b − Ax0),
where x ↦ Ax is linear and adding b − Ax0 is a translation. Many of the affine maps we are about to see will be of this form.
For example, consider the linear function
    f : R → R, f(x) = ½x.
Note that its graph passes through the origin. On the other hand, the function
    g : R → R, g(x) = ½x + 1,
is affine but not linear: its graph is a line that does not pass through the origin. △
Exercises.
1.3.1. Which of the following functions are linear? Which are affine?
(a) f : R2 → R, f (x, y) = 2x − 7y + 10
(b) f : R2 → R, f (x, y) = 2(x − 1) + 5(y − 3) + 17
(c) f : R2 → R, f (x, y) = x2 + y
(d) f : R2 → R3 , f (x, y) = (x + 4y, 3x + y, 1)
(e) f : R3 → R2 , f (x, y, z) = (z, x)
1.4 Affine approximations in one and two variables
Let us first recall the single-variable setting. Suppose A ⊆ R is open, f : A → R, and a ∈ A. The derivative of f at a is
    f′(a) = df/dx (a) := limx→a (f(x) − f(a))/(x − a),
if this limit exists. If this limit exists, we say f is differentiable at a. We say f is differentiable if it is differentiable at all a ∈ A.
Lemma 1.4.1. The function f is differentiable at a with derivative f′(a) if and only if
    limx→a (f(x) − f(a) − f′(a)(x − a))/(x − a) = 0.    (1.2)
Since the two one-sided limits exist and are equal, we have proved (1.2). The proof of the reverse implication is similar, and is left as Exercise 1.5.1.
Recall that the tangent line to a differentiable function f : R → R at the point (a, f (a))
is given by
y = f ′ (a)(x − a) + f (a). (1.3)
In other words, it is the graph of the affine function
    x ↦ f′(a)(x − a) + f(a).    (1.4)
Lemma 1.4.1 says that this function is the best affine approximation to the function f(x) near the point a. Precisely, it says that the function (1.4) is the unique affine function such that the difference between it and f(x) (the "error") goes to zero faster than |x − a|.
Example 1.4.2. To refresh our memory, let's find the tangent line to the function
    f : R → R, f(x) = x³ + x² − 2x,
at the point (−1, f(−1)) = (−1, 2). We first compute the derivative:
    f′(x) = 3x² + 2x − 2, so f′(−1) = 3 − 2 − 2 = −1.
The tangent line to the graph of f at the point (−1, 2) is therefore given by
    y = f′(−1)(x + 1) + f(−1) = −(x + 1) + 2 = −x + 1.
In particular, the tangent line is the graph of the affine function
    g(x) = −x + 1. △
Now let’s move to two variables. Suppose A ⊆ R2 is open and we have a differentiable
function f : A → R. (We’ll return to the precise definition of a differentiable function in
multiple variables soon.) Remember that the tangent plane to the graph of f at the point
(a1, a2, f(a1, a2)) is given by
    z = f(a) + fx(a)(x − a1) + fy(a)(y − a2) = f(a) + ∇f(a) · (x − a1, y − a2),    (1.5)
where a = (a1, a2),
    fx = ∂f/∂x and fy = ∂f/∂y    (1.6)
are the partial derivatives of f,
    ∇f = (fx, fy)    (1.7)
is the gradient of f, and · denotes the dot product.
Example 1.4.3. Consider the function
    f : R² → R, f(x, y) = x³ + xy − y² + 4.
Suppose we want to find the equation of the tangent plane to the graph of f (that is, the surface z = f(x, y)) at the point (0, 1, f(0, 1)) = (0, 1, 3). We compute
    fx = 3x² + y, fy = x − 2y,
so that fx(0, 1) = 1 and fy(0, 1) = −2. By (1.5), the tangent plane is given by
    z = 3 + 1·(x − 0) − 2(y − 1), that is, z = x − 2y + 5. △
Exercises.
Exercises from [FRYb, §2.5] : Q5–Q7, Q10.
1.5 The derivative
Let's revisit the single-variable tangent line with some new notation. Suppose f is differentiable at a, write b = f(a), and set
    y = f(x).
Define
    ∆x = x − a, ∆y = y − b.
Note that a (and thus b) are fixed points, x is our variable, and y changes with x, according to the equation y = f(x).
In our new notation, (1.3) becomes
    ∆y = f′(a)∆x,    (1.8)
and, in two variables, the tangent plane equation (1.5) becomes
    ∆y = fx(a)(x1 − a1) + fy(a)(x2 − a2) = [fx(a) fy(a)] ∆x,    (1.9)
where ∆x = (x1 − a1, x2 − a2) is viewed as a column matrix,
where the righthand side involves matrix multiplication. These, of course, are the cases
where m = 1 and n = 1 or n = 2, respectively. You might now be able to guess the general
form.
Definition 1.5.1 (Differentiable function). Let A ⊆ Rn be open, f : A → Rm , and a ∈ A.
We say f is differentiable at a if there exists a linear map Df(a) : Rn → Rm such that
    limx→a ∥f(x) − f(a) − Df(a)(x − a)∥ / ∥x − a∥ = 0.    (1.10)
In this case, Df(a) is called the total derivative of f at a.
If all the partial derivatives of the component functions fi exist, then the Jacobian matrix, or matrix of partial derivatives, of f at a point a ∈ Rn is the m × n matrix
    [∂fi/∂xj (a)] = [ ∂f1/∂x1(a) · · · ∂f1/∂xn(a) ]
                    [     ⋮       ⋱        ⋮      ]
                    [ ∂fm/∂x1(a) · · · ∂fm/∂xn(a) ].
Theorem 1.5.2. If f is differentiable at a, then its total derivative is the linear map cor-
responding to its Jacobian matrix (that is, the linear map given by left multiplication by the
Jacobian matrix).
We will not prove Theorem 1.5.2 in class, since the proof is a bit long and technical.
As in linear algebra, we often identify a matrix and the corresponding linear map. Thus,
Theorem 1.5.2 says that, if f is differentiable at a, then
    Df(a) = [∂fi/∂xj (a)] = [ ∂f1/∂x1(a) · · · ∂f1/∂xn(a) ]
                            [     ⋮       ⋱        ⋮      ]
                            [ ∂fm/∂x1(a) · · · ∂fm/∂xn(a) ].
As we will see in Warning 1.5.13, it is possible for the Jacobian matrix to exist (that is, for the partial derivatives of all the component functions of f to exist) while the function is not differentiable. However, we will still use the notation Df(a) for the Jacobian matrix in this case.
Definition 1.5.1 states that, if f : A → Rm , A ⊆ Rn , is a differentiable function, then the
best affine approximation to f at the point a ∈ A is given by
    y = f(a) + Df(a)(x − a).    (1.11)
Note that, when n = m = 1, then (1.11) becomes (1.8), describing a tangent line. On the
other hand, when n = 2, m = 1, equation (1.11) becomes (1.9), describing a tangent plane.
In higher dimensions, we refer to the affine approximation as a hyperplane.
Remark 1.5.3. Some references use the term linear approximation instead of affine approxi-
mation. However, the term affine approximation is more correct.
In particular, for a real-valued function f : A → R, A ⊆ Rn, the Jacobian matrix is the row matrix ∇⊺f(a), the transpose of the gradient.
Example 1.5.4. Let's use Definition 1.5.1 to show that the function
    f : R² → R, f(x, y) = 3xy²
is differentiable at the point (2, 1).
We have
    ∇f = (∂f/∂x, ∂f/∂y) = (3y², 6xy), ∇f(2, 1) = (3, 12).
Thus
    lim(x,y)→(2,1) [f(x, y) − f(2, 1) − ∇f(2, 1) · (x − 2, y − 1)] / ∥(x, y) − (2, 1)∥
        = lim(x,y)→(2,1) [3xy² − 6 − 3(x − 2) − 12(y − 1)] / ∥(x − 2, y − 1)∥
        = lim(x,y)→(2,1) [3xy² − 3x − 12y + 12] / ∥(x − 2, y − 1)∥.
Setting w = x − 2, z = y − 1, this limit is equal to
    lim(w,z)→(0,0) [3(w + 2)(z + 1)² − 3(w + 2) − 12(z + 1) + 12] / ∥(w, z)∥
        = lim(w,z)→(0,0) (3wz² + 6z² + 6wz) / √(w² + z²).
Now, since |z| = √(z²) ≤ √(w² + z²), we have
    |3wz² + 6z² + 6wz| / √(w² + z²) = (|z| / √(w² + z²)) · |3wz + 6z + 6w| ≤ |3wz + 6z + 6w| → 0
as (w, z) → (0, 0). Thus, f is differentiable at (2, 1) with total derivative
    Df(2, 1) = ∇⊺f(2, 1) = [3 12],
and the tangent plane to the graph of f at the point (2, 1, f(2, 1)) = (2, 1, 6) is given by
    z = 6 + 3(x − 2) + 12(y − 1), that is, z = 3x + 12y − 12.    △
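The limit defining differentiability can be probed numerically: the error ratio in Definition 1.5.1 should shrink as (x, y) → (2, 1). A Python sketch (the helper names are ours):

```python
import math

def f(x, y):
    """f(x, y) = 3xy^2 from Example 1.5.4."""
    return 3 * x * y**2

grad = (3.0, 12.0)  # nabla f(2, 1), computed in the example

def error_ratio(h1, h2):
    """|f(a+h) - f(a) - grad . h| / ||h|| at a = (2, 1)."""
    num = abs(f(2 + h1, 1 + h2) - f(2, 1) - (grad[0]*h1 + grad[1]*h2))
    return num / math.hypot(h1, h2)

ratios = [error_ratio(t, t) for t in (0.1, 0.01, 0.001)]
# the ratio tends to 0, consistent with differentiability at (2, 1)
assert ratios[0] > ratios[1] > ratios[2]
assert ratios[2] < 1e-2
```

Of course, a decreasing sample is not a proof; the ε–δ argument above is what actually establishes the limit.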
Notice that, in Example 1.5.4, it was quite easy to compute the gradient ∇f , but then
it required some effort to prove that the function f was differentiable. It would be nice if
there were an easier way to see that a function is differentiable, at least for some large class
of functions. In fact, there is, as we now explain.
First, as we saw for limits, we can actually always reduce our attention to this m = 1
case, as the following proposition states.
Proposition 1.5.5. Let A ⊆ Rn be open, f = (f1 , . . . , fm ) : A → Rm , and a ∈ A. Then f
is differentiable at a if and only if each component function fi , 1 ≤ i ≤ m, is differentiable
at a. In this case, we have
    Df(a) = [ ∇⊺f1(a) ]
            [    ⋮    ]
            [ ∇⊺fm(a) ],
where ∇⊺fi(a) is the transpose of the gradient ∇fi(a).
Theorem 1.5.6. Let A ⊆ Rn be open and f : A → R. If all the partial derivatives of f exist and are continuous on A, then f is differentiable on A.
Combining this with Proposition 1.5.5 gives Corollary 1.5.7: a function f : A → Rm whose partial derivatives all exist and are continuous is differentiable.
Example 1.5.8. Suppose f : R³ → R² has continuous partial derivatives everywhere. Since all the partial derivatives are continuous, the function f is differentiable at every point of R³, by Theorem 1.5.6. Suppose that, at the point a = (2, 0, π), the Jacobian matrix is
    Df(a) = [ 1  2 0 ]
            [ 12 1 0 ].
Since f(a) = (2, 8), the best affine approximation to f at a is given by
    y − (2, 8) = [ 1  2 0 ] [ x1 − 2 ]
                 [ 12 1 0 ] [ x2     ]
                            [ x3 − π ],
that is,
    y1 = x1 + 2x2 and y2 = 12x1 + x2 − 16.
Since the graph of f lives in five dimensions, we are not able to visualize this as we can do
for tangent lines and tangent planes. Nevertheless, we were still able to give an algebraic
description of the approximation. △
Warning 1.5.9. The converse to Theorem 1.5.6 is false (even for single-variable functions!),
as the next example shows.
Example 1.5.10. Consider the function f : R → R defined by f(x) = x² sin(1/x) for x ≠ 0 and f(0) = 0. We have
    f′(0) = limx→0 (f(x) − 0)/x = limx→0 x sin(1/x) = 0,
and, for x ≠ 0, f′(x) = 2x sin(1/x) − cos(1/x).
Therefore, f is differentiable everywhere. However, limx→0 f ′ (x) does not exist, and so f ′ (x)
is not continuous at x = 0. △
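Both phenomena are visible numerically: the difference quotients at 0 converge to 0, while f′ keeps oscillating near 0. A Python sketch:

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x) for x != 0, f(0) = 0 (Example 1.5.10)."""
    return x**2 * math.sin(1/x) if x != 0 else 0.0

def fprime(x):
    """For x != 0: f'(x) = 2x sin(1/x) - cos(1/x)."""
    return 2*x*math.sin(1/x) - math.cos(1/x)

# the difference quotients at 0 are bounded by |h|, so f'(0) = 0
for h in (0.1, 0.01, 0.001):
    assert abs((f(h) - f(0)) / h) <= h

# ... yet f' oscillates: near x = 1/(k*pi) the cos(1/x) term is close
# to +/-1, so f' takes values of opposite sign arbitrarily close to 0
vals = [fprime(1/(k*math.pi)) for k in (100, 101)]
assert vals[0] * vals[1] < 0
```

So f′(0) exists even though limx→0 f′(x) does not, matching the discussion above.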
Remark 1.5.11. A function whose partial derivatives exist and are continuous is said to be of
class C 1 . Thus, Theorem 1.5.6 and Corollary 1.5.7 say that any C 1 function is differentiable.
However, Example 1.5.10 shows that there are differentiable functions that are not of class
C 1.
You learned in earlier classes that, for single-variable functions, differentiability implies
continuity. We have the same result in higher dimensions.
Theorem 1.5.12. Let A ⊆ Rn be open, f : A → Rm, and a ∈ A. If f is differentiable at a, then f is continuous at a.
Proof. The proof is very similar to the single-variable case, and so we will omit it.
Warning 1.5.13. The existence of all partial derivatives is not enough to ensure that a
function is differentiable, as the next example illustrates.
Example 1.5.14. Define f : R² → R by f(x, y) = x²/(x − y) for x ≠ y and f(x, y) = 0 for x = y. Then
    ∂f/∂x (x, y) = { (x² − 2xy)/(x − y)²  if x ≠ y,      ∂f/∂y (x, y) = { x²/(x − y)²  if x ≠ y,
                   { 1                    if x = y,                     { 0            if x = y.
However, f is not continuous at (0, 0). (See [FRYc, Example 2.2.9] for details.) Thus, the partial derivatives exist everywhere (and are continuous everywhere except at (0, 0)), but f is not differentiable at (0, 0), since it is not continuous there. △
Exercises.
1.5.1. Complete the proof of Lemma 1.4.1.
1.6 Differentiation rules
Recall the single-variable chain rule: if f and g are differentiable, then (g ∘ f)′(x) = g′(f(x)) f′(x). This result can be generalized to multivariable functions, using tools of linear algebra.
Theorem (Chain rule). Suppose
    f : Rn → Rm and g : Rm → Rk
are differentiable. Then g ∘ f is differentiable and
    D(g ∘ f)(a) = Dg(f(a)) Df(a),
where the product on the right-hand side is matrix multiplication (equivalently, composition of linear maps).
Example. Consider the functions
    f : R² → R², f(u, v) = (u² + 5v, 3uv), and g : R² → R, g(x, y) = xy².
Then we have
    Df = [ 2u 5  ]      Dg = [y² 2xy].
         [ 3v 3u ],
Thus, using the chain rule, we have
    D(g ∘ f)(u, v) = Dg(f(u, v)) Df(u, v)
        = [(3uv)²  2(u² + 5v)(3uv)] [ 2u 5  ]
                                    [ 3v 3u ]
        = [36u³v² + 90uv³   18u⁴v + 135u²v²].
In particular, we have
    ∂(g ∘ f)/∂u = 36u³v² + 90uv³ and ∂(g ∘ f)/∂v = 18u⁴v + 135u²v².
Of course, we could check this by computing
    (g ∘ f)(u, v) = (u² + 5v)(3uv)² = 9u⁴v² + 45u²v³
and differentiating directly.
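Assuming f(u, v) = (u² + 5v, 3uv) and g(x, y) = xy², which are the functions consistent with the Jacobians Df and Dg shown above, the chain-rule formulas can also be checked against finite differences of the composite. A Python sketch:

```python
def gf(u, v):
    """The composite (g o f)(u, v), assuming f(u,v) = (u^2 + 5v, 3uv)
    and g(x, y) = x*y^2."""
    x, y = u**2 + 5*v, 3*u*v
    return x * y**2

def partials_chain(u, v):
    """Partials of g o f as obtained from D(g o f) = Dg(f(u,v)) Df(u,v)."""
    return (36*u**3*v**2 + 90*u*v**3, 18*u**4*v + 135*u**2*v**2)

def partials_numeric(u, v, h=1e-6):
    """Central finite differences of the composite."""
    du = (gf(u + h, v) - gf(u - h, v)) / (2*h)
    dv = (gf(u, v + h) - gf(u, v - h)) / (2*h)
    return (du, dv)

u, v = 1.3, -0.7
exact = partials_chain(u, v)
approx = partials_numeric(u, v)
assert all(abs(e - a) < 1e-4 * (1 + abs(e)) for e, a in zip(exact, approx))
```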
Exercises.
Exercises from [FRYb, §2.4] : Q1, Q5, Q6, Q12–Q14, Q18, Q22.
1.7 Paths, curves, and surfaces
Definition 1.7.1 (Path). A path in Rn is a continuous function c : [a, b] → Rn. A curve is the image of a path:
    C = {c(t) : t ∈ [a, b]}.
Example 1.7.2. Consider the path
    c : [0, 2π] → R², c(t) = (cos t, sin t).
Its image is the unit circle; for example, c(π/2) = (0, 1). △
Given a path c : [a, b] → Rn , we will usually denote its derivative by c′ (t) instead of Dc(t).
Thus, if
c(t) = (c1 (t), . . . , cn (t)) then c′ (t) = (c′1 (t), . . . , c′n (t)).
For t0 ∈ [a, b], the derivative c′ (t0 ) is the velocity at time t0 . (We will often use this termi-
nology, even if t does not really denote time in our particular situation.)
Now suppose A ⊆ R² is open, f : A → R is differentiable (so that the graph z = f(x, y) is a surface in R³), and c = (c1, c2) : [a, b] → A is a differentiable path. Composing gives a single-variable function f ∘ c, and by the chain rule
    (f ∘ c)′(t) = Df(c(t)) Dc(t) = [∂f/∂x(c(t))  ∂f/∂y(c(t))] [ c′1(t) ] = ∇f(c(t)) · c′(t).
                                                              [ c′2(t) ]
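The formula (f ∘ c)′(t) = ∇f(c(t)) · c′(t) is easy to test on sample data; here f(x, y) = x²y and c(t) = (cos t, sin t) are our own illustrative choices, not from the notes. A Python sketch:

```python
import math

def f(x, y):          # sample function: f(x, y) = x^2 * y
    return x**2 * y

def grad_f(x, y):     # its gradient (2xy, x^2)
    return (2*x*y, x**2)

def c(t):             # sample path: the unit circle
    return (math.cos(t), math.sin(t))

def c_prime(t):       # its velocity
    return (-math.sin(t), math.cos(t))

def lhs(t, h=1e-6):
    """Derivative of f o c by a central finite difference."""
    return (f(*c(t + h)) - f(*c(t - h))) / (2*h)

def rhs(t):
    """The gradient formula: grad f(c(t)) . c'(t)."""
    gx, gy = grad_f(*c(t))
    cx, cy = c_prime(t)
    return gx*cx + gy*cy

for t in (0.0, 0.5, 2.0):
    assert abs(lhs(t) - rhs(t)) < 1e-6
```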
(a) We can describe a surface parametrically, that is, as the image of a function from a region of R² into R³. For example, such a parametrization can describe a cone.
(b) We can describe the surface as the set of solutions to an equation. For example,
S = {(x, y, z) : x2 + y 2 ≤ 1, z 2 = x2 + y 2 }
(c) We can describe a surface as the graph of a function f : A → R. For example, the
graph of the function
    f : A → R, f(x, y) = √(x² + y²), A = {(x, y) : x² + y² ≤ 1},
describes the upper half of the cone.
When n = 2, level sets are sometimes called contour lines. A contour plot is an image
showing multiple contour lines for the same function.
For example, consider the function
    f : R² → R, f(x, y) = x²y.
The level set of f at k is the set of all points (x, y) such that k = f(x, y) = x²y or, equivalently (for k ≠ 0), y = k/x². A contour plot of f shows these curves for several values of k (for example, k = ±1, ±2). △
Exercises.
Exercises from [FRYb, §1.6] : Q1–Q4, Q6, Q10, Q21.
Exercises from [FRYb, §1.7] : Q2–Q11.
1.8 Directional derivatives
Definition 1.8.1 (Directional derivative). Let A ⊆ Rn be open, f : A → R, a ∈ A, and v ∈ Rn. The directional derivative of f at a in the direction v is
    Dv f(a) := d/dt f(a + tv) |t=0,
if this derivative exists.
Remark 1.8.2. Some references require the vector v in Definition 1.8.1 to be a unit vector, that is, ∥v∥ = 1. We will not require this.
Example 1.8.3. Consider the function
    f : R³ → R, f(x, y, z) = xy² − 3z,
and let a = (q, r, s) and v = (2, 1, −1). Then
    f(a + tv) = f(q + 2t, r + t, s − t)
              = (q + 2t)(r + t)² − 3(s − t)
              = qr² + 2tr² + 2qtr + 4t²r + qt² + 2t³ − 3s + 3t.
Thus
    d/dt f(a + tv) = 2r² + 2qr + 8tr + 2qt + 6t² + 3,
and so
    Dv f(q, r, s) = d/dt f(a + tv) |t=0 = 2r² + 2qr + 3. △
Proposition 1.8.4. Let A ⊆ Rn be open, f : A → R, a ∈ A, and v ∈ Rn. If f is differentiable at a, then
    Dv f(a) = ∇f(a) · v.
Proof. Let
    ℓ(t) = a + tv, t ∈ R.
Then
    Dv f(a) = d/dt (f ∘ ℓ)(t) |t=0
            = (f ∘ ℓ)′(0)
            = Df(ℓ(0)) Dℓ(0)    (chain rule)
            = ∇⊺f(a) ℓ′(0)
            = ∇f(a) · v.
Example 1.8.5. Let's redo Example 1.8.3 using Proposition 1.8.4. For f(x, y, z) = xy² − 3z we have ∇f = (y², 2xy, −3). Thus
    D(2,1,−1) f(x, y, z) = (y², 2xy, −3) · (2, 1, −1) = 2y² + 2xy + 3.
Taking (x, y, z) = (q, r, s), this gives the same directional derivative that we computed in Example 1.8.3. △
Note that there is a significant computational advantage that comes from Proposi-
tion 1.8.4. Namely, we can compute the gradient ∇f once, and then find any directional
derivative by computing a simple dot product.
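Proposition 1.8.4 is easy to test numerically for the function f(x, y, z) = xy² − 3z of Example 1.8.3 (the sample points below are ours):

```python
def f(x, y, z):
    """f(x, y, z) = x*y^2 - 3z from Example 1.8.3."""
    return x * y**2 - 3*z

def grad_f(x, y, z):
    """Its gradient (y^2, 2xy, -3)."""
    return (y**2, 2*x*y, -3.0)

def directional(a, v, h=1e-6):
    """Central difference of t |-> f(a + t v) at t = 0."""
    hi = [ai + h*vi for ai, vi in zip(a, v)]
    lo = [ai - h*vi for ai, vi in zip(a, v)]
    return (f(*hi) - f(*lo)) / (2*h)

a, v = (1.5, -2.0, 0.3), (2.0, 1.0, -1.0)
dot = sum(g*vi for g, vi in zip(grad_f(*a), v))
# Proposition 1.8.4: D_v f(a) = grad f(a) . v, which here is 2y^2 + 2xy + 3
assert abs(dot - (2*a[1]**2 + 2*a[0]*a[1] + 3)) < 1e-12
assert abs(directional(a, v) - dot) < 1e-6
```

The gradient is computed once; every directional derivative is then a single dot product.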
The following result gives two important interpretations of the gradient.
Theorem 1.8.6. Suppose A ⊆ Rn is open, f : A → R is differentiable at a ∈ A, and ∇f(a) ≠ 0.
(a) The gradient ∇f(a) is a vector pointing in the direction in which f increases most rapidly.
(b) If S is the level set of f at k (hence a ∈ S), then ∇f(a) is orthogonal to the tangent line/plane/hyperplane to S at a.
Proof. (a) For any unit vector v, Dv f (a) tells us how fast f changes along the direction
v. We have
Dv f (a) = ∇f (a) · v = ∥v∥ ∥∇f (a)∥ cos θ = ∥∇f (a)∥ cos θ,
where θ is the angle between ∇f (a) and v. Now, cos θ attains its maximal value when θ = 0,
that is, when ∇f (a) and v are parallel.
(b) Let c(t) be a path in S such that c(t0) = a. Then c′(t0) is a tangent vector contained in the tangent plane to S at a. So we want to show that ∇f(a) ⊥ c′(t0) for any such path.
For all t we have f(c(t)) = k, since c(t) lies in the level set S. Differentiating both sides with respect to t, we have
    0 = (f ∘ c)′(t) = ∇f(c(t)) · c′(t)    (chain rule)
for all t. Taking t = t0 gives ∇f(a) · c′(t0) = 0, as desired.
Exercises.
Exercises from [FRYb, §2.7] : Q1–Q8, Q10–Q20.
1.9 Higher order derivatives
Suppose A ⊆ Rn is open and f : A → R. If the partial derivatives of f themselves have partial derivatives, we obtain the second order partial derivatives
    ∂²f/∂xi ∂xj := ∂/∂xi (∂f/∂xj), 1 ≤ i, j ≤ n.
Of course, if the partial derivatives of these second order derivatives exist, we can differentiate
again, to get the third order derivatives
    ∂³f/∂xi ∂xj ∂xk := ∂/∂xi (∂²f/∂xj ∂xk), 1 ≤ i, j, k ≤ n.
We can continue as long as we obtain functions whose partial derivatives exist.
We may also use the subscript notation of (1.6), especially when our variables are denoted
x, y, z. Then, for example, we have
    fxx := (fx)x = ∂²f/∂x ∂x,  fxy := (fx)y = ∂²f/∂y ∂x,
    fyx := (fy)x = ∂²f/∂x ∂y,  fyy := (fy)y = ∂²f/∂y ∂y.
Pay careful attention to the order of the subscripts!
We also have
    ∂⁴f/∂x1 ∂x3 ∂x4 ∂x2 = ∂³/∂x1 ∂x3 ∂x4 (x1³ cos(x3x4))
                        = ∂²/∂x1 ∂x3 (−x1³ x3 sin(x3x4))
                        = ∂/∂x1 (−x1³ sin(x3x4) − x1³ x3x4 cos(x3x4))
                        = −3x1² sin(x3x4) − 3x1² x3x4 cos(x3x4). △
Example 1.9.5. The functions sin x, cos x, eˣ, 1/x, and log x are all of class C∞. (Note that we are implicitly assuming that the domain of each function is the subset of R where it is defined.) Similarly, all polynomials are of class C∞. △
We saw in Example 1.9.1 that fxy = fyx , and we saw in Example 1.9.2 that
    ∂⁴f/∂x3 ∂x1 ∂x2 ∂x4 = ∂⁴f/∂x1 ∂x3 ∂x4 ∂x2.
In both examples, it did not matter in what order we computed the partial derivatives. The
following theorem explains this phenomenon.
Theorem 1.9.6 (Clairaut's theorem or Schwarz's theorem). Suppose A ⊆ Rn is open and f : A → R. If the partial derivatives ∂²f/∂xi ∂xj and ∂²f/∂xj ∂xi exist and are continuous at a, then
    ∂²f/∂xi ∂xj (a) = ∂²f/∂xj ∂xi (a).
In particular, if f is of class C², then
    ∂²f/∂xi ∂xj = ∂²f/∂xj ∂xi for all 1 ≤ i, j ≤ n.
Proof. A proof of this theorem can be found in [FRYc, §2.3.1].
Remark 1.9.7. Using Theorem 1.9.6 and induction, we get analogues for higher order deriva-
tives. For example, if f : R3 → R is of class C 4 , then
    ∂⁴f/∂x ∂y ∂x ∂z = ∂⁴f/∂z ∂y ∂x².
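Clairaut's theorem can be illustrated numerically; here f(x, y) = x³y² + sin(xy) is our own sample C² function, not from the notes:

```python
import math

def f(x, y):
    """A sample C^2 function: f(x, y) = x^3 y^2 + sin(xy)."""
    return x**3 * y**2 + math.sin(x * y)

def f_xy(x, y):
    """(f_x)_y computed symbolically: f_x = 3x^2 y^2 + y cos(xy)."""
    return 6*x**2*y + math.cos(x*y) - x*y*math.sin(x*y)

def f_yx(x, y):
    """(f_y)_x computed symbolically: f_y = 2x^3 y + x cos(xy)."""
    return 6*x**2*y + math.cos(x*y) - x*y*math.sin(x*y)

def mixed_numeric(x, y, h=1e-4):
    """Second-order finite-difference stencil for the mixed partial."""
    return (f(x+h, y+h) - f(x+h, y-h) - f(x-h, y+h) + f(x-h, y-h)) / (4*h*h)

for (x, y) in [(0.3, -1.2), (1.0, 1.0)]:
    assert f_xy(x, y) == f_yx(x, y)                  # the two orders agree
    assert abs(mixed_numeric(x, y) - f_xy(x, y)) < 1e-5
```

Differentiating in either order produces the same expression, just as the theorem predicts for C² functions.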
Warning 1.9.8. There exist functions whose mixed partial derivatives exist, but are not
equal!
Similarly, one can show that a composition of Cᵏ functions
    f : A → B, g : B → Rm,
is again of class Cᵏ.
Exercises.
1.9.1. Prove (1.16) using induction.
1.10 Taylor's theorem
Recall that, in single-variable calculus, Taylor's theorem approximates a sufficiently differentiable function f near a point a by polynomials:
    f(a + h) = pk(a + h) + Rk(h), where Rk(h)/hᵏ → 0 as h → 0.
We now generalize this to several variables. For the remainder of this section, A ⊆ Rn is open, f : A → R, and a ∈ A.
Definition 1.10.1 (Taylor polynomial). If the k-th order partial derivatives of f exist at a, then the k-th Taylor polynomial, or degree k Taylor polynomial, of f at a is
    pk(a + h) = f(a) + Σ_{i=1}^{k} (1/i!) Σ_{j1,...,ji=1}^{n} (∂ⁱf / ∂xj1 · · · ∂xji)(a) hj1 · · · hji,    (1.17)
so that
f (a + h) = pk (a + h) + Ra,k (h).
Theorem 1.10.2 (Taylor's theorem). If f is of class Cᵏ on a neighbourhood of a, then
    limh→0 Ra,k(h)/∥h∥ᵏ = 0.
Proof. We will not prove this theorem in class. A proof of the k = 2 case can be found in
[MT11, §3.2, Th. 3].
Definition 1.10.3 (Hessian matrix). If the second partial derivatives of f exist, then the Hessian matrix of f at a is
    Hf(a) := [ ∂²f/∂x1²(a)     ∂²f/∂x1∂x2(a)  · · ·  ∂²f/∂x1∂xn(a) ]
             [ ∂²f/∂x2∂x1(a)   ∂²f/∂x2²(a)    · · ·  ∂²f/∂x2∂xn(a) ]
             [      ⋮               ⋮           ⋱          ⋮       ]
             [ ∂²f/∂xn∂x1(a)   ∂²f/∂xn∂x2(a)  · · ·  ∂²f/∂xn²(a)   ]  ∈ Matn×n(R).    (1.18)
Comparing (1.17) and (1.18), we see that the degree 2 Taylor polynomial is given by
    p2(a + h) = f(a) + ∇f(a) · h + ½ h⊺ (Hf(a)) h.    (1.19)
Corollary 1.10.4. If f is of class C² on a neighbourhood of a, then
    f(a + h) = p2(a + h) + Ra,2(h), where limh→0 Ra,2(h)/∥h∥² = 0.
Example 1.10.5. Let's find the degree 2 Taylor polynomial of the function
    f : R² → R, f(x, y) = e^y cos(x² − 2y),
at the point (√π, π). At this point we have x² − 2y = −π, so cos(x² − 2y) = −1 and sin(x² − 2y) = 0; in particular, f(√π, π) = −e^π. Computing derivatives,
    ∂f/∂x = −2x e^y sin(x² − 2y),                           ∂f/∂x (√π, π) = 0,
    ∂f/∂y = e^y cos(x² − 2y) + 2e^y sin(x² − 2y),           ∂f/∂y (√π, π) = −e^π,
    ∂²f/∂x² = −2e^y sin(x² − 2y) − 4x² e^y cos(x² − 2y),    ∂²f/∂x² (√π, π) = 4πe^π,
    ∂²f/∂y² = −3e^y cos(x² − 2y) + 4e^y sin(x² − 2y),       ∂²f/∂y² (√π, π) = 3e^π,
    ∂²f/∂x∂y = −2x e^y sin(x² − 2y) + 4x e^y cos(x² − 2y),  ∂²f/∂x∂y (√π, π) = −4√π e^π.
Thus, the gradient and Hessian are
    ∇f(√π, π) = (0, −e^π),  Hf(√π, π) = [ 4πe^π      −4√π e^π ]
                                        [ −4√π e^π   3e^π     ],
and the degree 2 Taylor polynomial is
    p2((√π, π) + (h1, h2)) = −e^π + (0, −e^π) · (h1, h2) + ½ [h1 h2] [ 4πe^π    −4√π e^π ] [ h1 ]
                                                                    [ −4√π e^π  3e^π    ] [ h2 ]
        = −e^π − e^π h2 + 2πe^π h1² + (3/2) e^π h2² − 4√π e^π h1h2.
Since f is of class C², Corollary 1.10.4 tells us that the difference between the function and this polynomial goes to zero faster than ∥h∥² as h → 0. △
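Assuming f(x, y) = e^y cos(x² − 2y), which matches the value, gradient, and Hessian computed above, we can check numerically that the error f(a + h) − p2(a + h) decays at cubic order:

```python
import math

# Assumption: f(x, y) = exp(y) * cos(x^2 - 2y), consistent with the value,
# gradient, and Hessian computed in the example above.
def f(x, y):
    return math.exp(y) * math.cos(x**2 - 2*y)

A = math.sqrt(math.pi)   # the base point is (sqrt(pi), pi)
E = math.exp(math.pi)

def p2(h1, h2):
    """Degree 2 Taylor polynomial from the example."""
    return (-E - E*h2 + 2*math.pi*E*h1**2 + 1.5*E*h2**2
            - 4*A*E*h1*h2)

def err(t):
    """Taylor error along the direction h = t*(0.001, 0.002)."""
    h1, h2 = 0.001*t, 0.002*t
    return abs(f(A + h1, math.pi + h2) - p2(h1, h2))

# the error is o(||h||^2): it decays at cubic order, so halving h
# shrinks it by roughly a factor of 8
assert err(1.0) < 1e-5
assert err(0.5) < err(1.0) / 4
```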
Exercises.
1.10.1 ([MT11, §3.2, Exercise 1]). Let f (x, y) = ex+y .
L : R2 → R, L(x, y) = ax + by,
is a linear function.
1.10.3 ([MT11, §3.2, Exercises 3–7]). For each of the following, determine the second-order
Taylor polynomial for the given function about the point (0, 0). Use this polynomial to
approximate f (−1, −1). Compare your approximation to the exact value using a calculator.
1.10.4 ([MT11, §3.2, Exercise 11]). Let g(x, y) = sin(xy) − 3x2 ln y + 1. Find the degree 2
polynomial that best approximates g near the point (π/2, 1).
1.11 The implicit function theorem
Recall that a level set of a function f : A → R is a set of the form
    {x ∈ A : f(x) = k}.
Consider a level set in R², say
    f(x, y) = 0.
When can we solve for one variable in terms of another? If the level set satisfies the vertical line test (every vertical line meets it at most once), then we can write y as a function of x.
Example 1.11.1. For example, consider the level set
    f(x, y) = x³ − 8x² + 16x − y² = 0.
This level set does not satisfy the vertical line test. However, near some points we could
write y as a function of x. For instance, consider the portion of the level set near the point
(1, 3), indicated in red above. In this small region, we can write y as a function of x. On
the other hand, we would not be able to do this near the point (0, 0). The problem is that
the tangent line is vertical there.
To state the implicit function theorem, we use the fact that Rn+1 = Rn × R to write
points of Rn+1 in the form (x, z), where x ∈ Rn , and z ∈ R. In particular, if F : Rn+1 → R,
then ∂F/∂z denotes the partial derivative with respect to the last variable.
Theorem 1.11.2 (Special implicit function theorem). Suppose F : Rⁿ⁺¹ → R is a C¹ function and (x0, z0) ∈ Rⁿ⁺¹ satisfies
    F(x0, z0) = 0 and ∂F/∂z (x0, z0) ≠ 0.
Then there exist neighbourhoods U ⊆ Rn of x0 and V ⊆ R of z0 such that there is a unique function g : U → V satisfying F(x, g(x)) = 0 for all x ∈ U. Furthermore, g is of class C¹.
Example 1.11.3. Consider the points (1, 3), (0, 0), and (4, 0) of the level set
    F(x, y) = x³ − 8x² + 16x − y² = 0.
We have
    ∂F/∂y = −2y.
This is zero if and only if y = 0. Therefore, the special implicit function theorem (The-
orem 1.11.2) implies that we can write y as a function of x near every point on the level
set where y ̸= 0. In particular, we can do so near the point (1, 3). (Note that the z in
Theorem 1.11.2 is our y here.)
We could also interchange the roles of the variables, and try to write x as a function of y. Then the z of Theorem 1.11.2 is our x. Since
    ∂F/∂x = 3x² − 16x + 16 = (x − 4)(3x − 4),
we can write x as a function of y near every point on the level set where x ≠ 4/3 and x ≠ 4. In particular, we can do this near (0, 0), a point where we could not write y as a function of x.
The most problematic point is (4, 0), since both partial derivatives of F vanish here:
∇F (4, 0) = 0.
Here we cannot write any variable as a function of the others. Looking at the picture of the
level set above, we see why this is the case. The level set does not satisfy the vertical line
test or the horizontal line test in any neighbourhood of the point (4, 0). △
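For Example 1.11.3 we can exhibit the implicit function explicitly near (1, 3): solving F(x, y) = 0 for y > 0 gives g(x) = √(x³ − 8x² + 16x). A Python sketch checking this, together with the derivative formula Dg = −(∂F/∂y)⁻¹ ∂F/∂x from the implicit function theorem:

```python
import math

def F(x, y):
    return x**3 - 8*x**2 + 16*x - y**2

def g(x):
    """Solves F(x, y) = 0 for y > 0, valid on a neighbourhood of x = 1."""
    return math.sqrt(x**3 - 8*x**2 + 16*x)

assert g(1.0) == 3.0 and F(1.0, g(1.0)) == 0.0

def g_prime_ift(x):
    """Implicit function theorem: g'(x) = -F_x / F_y at (x, g(x))."""
    Fx = 3*x**2 - 16*x + 16
    Fy = -2*g(x)
    return -Fx / Fy

# compare with a central finite difference of g at x = 1
h = 1e-6
numeric = (g(1 + h) - g(1 - h)) / (2*h)
assert abs(g_prime_ift(1.0) - 0.5) < 1e-12
assert abs(numeric - 0.5) < 1e-6
```

Near (0, 0) or (4, 0) this particular g breaks down, matching the failure of the hypothesis ∂F/∂y ≠ 0 there.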
We now give a more general version of the implicit function theorem. Suppose we have
a C 1 function
F = (F1 , . . . , Fm ) : Rn × Rm → Rm .
We write points of Rn × Rm = Rn+m as (x, z), with x = (x1, . . . , xn) ∈ Rn, z = (z1, . . . , zm) ∈ Rm. Define
    DxF := [ ∂F1/∂x1 · · · ∂F1/∂xn ]
           [    ⋮     ⋱      ⋮    ]
           [ ∂Fm/∂x1 · · · ∂Fm/∂xn ]  ∈ Matm×n(R)
and
    DzF := [ ∂F1/∂z1 · · · ∂F1/∂zm ]
           [    ⋮     ⋱      ⋮    ]
           [ ∂Fm/∂z1 · · · ∂Fm/∂zm ]  ∈ Matm×m(R).
(This conflicts with our notation for the directional derivative, but the context should make
clear what we mean.) We view the equation F (x, z) = 0 as a collection of m equations:
F1 (x, z) = 0, . . . , Fm (x, z) = 0.
We’re interested in trying to solve for z1 , . . . , zm in terms of x1 , . . . , xn .
Theorem 1.11.5 (General implicit function theorem). Suppose F : Rn × Rm → Rm is a C 1
function and (x0 , z0 ) ∈ Rn × Rm , such that
F (x0 , z0 ) = 0 and Dz F (x0 , z0 ) is invertible.
Then there exist neighbourhoods U ⊆ Rn of x0 and V ⊆ Rm of z0 such that there is a unique
function
g : U → V such that F (x, g(x)) = 0 for all x ∈ U.
Furthermore, g is of class C¹ and
    Dg(x) = −(DzF(x, g(x)))⁻¹ DxF(x, g(x)).
Sketch of proof. For a complete proof of this theorem, see [Gou59, p. 45]. We give here only
a sketch of a proof to give some intuition as to why the theorem might hold. The first degree Taylor polynomial of F at (x0, z0) (see (1.17)) gives the approximation
    F(x, z) ≈ F(x0, z0) + DxF(x0, z0)(x − x0) + DzF(x0, z0)(z − z0) = DxF(x0, z0)(x − x0) + DzF(x0, z0)(z − z0).
Setting F(x, z) = 0 and solving this approximation for z (using the invertibility of DzF(x0, z0)) gives
    z ≈ z0 − (DzF(x0, z0))⁻¹ DxF(x0, z0)(x − x0),
which suggests both the existence of g and the formula for Dg.
Remark 1.11.6. Remember that a square matrix is invertible if and only if its determinant
is nonzero. So we can see if Dz F (x0 , z0 ) is invertible or not by computing its determinant.
Example 1.11.7. Consider the set of points (x, y, z) satisfying the following two equations:
    x sin y + e^(x+z) = 1,
    x²y + e^(yz) − z² = 0.
Note that the points (1, 0, −1) and (−1, 0, 1) both satisfy these equations.
Write the system as F(x, y, z) = 0, where F = (F1, F2) collects the two equations. We have
    DF = [ ∂F1/∂x ∂F1/∂y ∂F1/∂z ] = [ sin y + e^(x+z)   x cos y         e^(x+z)        ]
         [ ∂F2/∂x ∂F2/∂y ∂F2/∂z ]   [ 2xy               x² + z e^(yz)   y e^(yz) − 2z  ].
Thus
    D(y,z)F = [ x cos y         e^(x+z)       ],   D(y,z)F(1, 0, −1) = [ 1 1 ]
              [ x² + z e^(yz)   y e^(yz) − 2z ]                        [ 0 2 ].
Since
    det D(y,z)F(1, 0, −1) = 1·2 − 1·0 = 2 ≠ 0,
the matrix D(y,z)F(1, 0, −1) is invertible. Thus, the implicit function theorem applies, and we can solve for (y, z) in terms of x near the point (1, 0, −1).
However,
    D(y,z)F(−1, 0, 1) = [ −1 1  ]
                        [ 2  −2 ],
which is not invertible. Thus, the implicit function theorem does not apply. On the other hand,
    D(x,z)F = [ sin y + e^(x+z)   e^(x+z)       ], and D(x,z)F(−1, 0, 1) = [ 1 1  ]
              [ 2xy               y e^(yz) − 2z ]                          [ 0 −2 ]
is invertible. So the implicit function theorem tells us that we can solve for (x, z) in terms of y near the point (−1, 0, 1). △
Warning 1.11.8. The converses to Theorems 1.11.2 and 1.11.5 are false! For example, if ∂F/∂z (x0, z0) = 0 in the setting of Theorem 1.11.2, this does not imply that it is impossible to write z as a function of x near (x0, z0). See Exercise 1.11.1.
Exercises.
1.11.1. Use the function F (x, z) = x3 − z 3 to show that the converse to Theorem 1.11.2 is
false.
1.11.2 ([MT11, §3.5, Exercise 1]). Show that the equation x + y − z + cos(xyz) = 1 can be solved for z = g(x, y) near the origin. Find ∂g/∂x and ∂g/∂y at (0, 0).
1.11.3 ([MT11, §3.5, Exercise 2]). Show that xy + z + 3xz⁵ = 4 is solvable for z as a function of (x, y) near (1, 0, 1). Compute ∂z/∂x and ∂z/∂y at (x, y) = (1, 0).
(a) Using the implicit function theorem, verify that we can solve for x as a function of y
and z near any point on S. Explicitly write x as a function of y and z.
(b) Show that near (1, 1, −1) we can solve for either y or z, and give explicit expressions
for these variables in terms of the other two.
1.11.5 ([MT11, §3.5, Exercise 7]). Show that x³z² − z³yx = 0 is solvable for z as a function of (x, y) near (1, 1, 1), but not near the origin. Compute ∂z/∂x and ∂z/∂y at (x, y) = (1, 1).
1.11.6 ([MT11, §3.5, Exercise 8]). Discuss the solvability of the system
3x + 2y + z² + u + v² = 0,
4x + 3y + z + u² + v + w + 2 = 0,
x + z + w + u² + 2 = 0.
S = {(x, y, z) : x2 + y 2 + z 2 = 1}.
This sphere intersects the x-axis at two points. Which variables can we solve for at these
points? What about the points of intersection of S with the y-axis and z-axis?
Consider a function f : Rⁿ → Rⁿ. When can we invert this function, at least locally (that is, in a neighbourhood of a point)?
The inverse function theorem says that we can do this when the derivative is invertible at
that point.
F : Rn × Rn → Rn , F (x, y) = f (x) − y.
Thus,
y = f (x) ⇐⇒ F (x, y) = 0.
Let y0 = f (x0 ), so that F (x0 , y0 ) = 0.
Since
Dx F = Df,
our assumption that Df(x0) is invertible implies that Dx F(x0, y0) is invertible. Hence, the implicit function theorem (Theorem 1.11.5) implies that there exist neighbourhoods V of y0 and U of x0 such that there is a unique function
g : V → U such that F(g(y), y) = 0, that is, f(g(y)) = y, for all y ∈ V. (1.21)
Thus g = (f|U)⁻¹. (More precisely, (1.21) shows that g is a right inverse to f|U, implying that f|U is surjective. The uniqueness of g shows that f|U is also injective, so that g is a two-sided inverse to f|U.)
Then we have
Df = [sin φ cos θ, −ρ sin φ sin θ, ρ cos φ cos θ; sin φ sin θ, ρ sin φ cos θ, ρ cos φ sin θ; cos φ, 0, −ρ sin φ].
Using some basic trigonometric formulas (Exercise 1.12.4), we can compute that
det Df = −ρ² sin φ.
Therefore, we can apply the inverse function theorem (Theorem 1.12.1) as long as ρ ≠ 0 and sin φ ≠ 0 (that is, φ is not a multiple of π).
The function f maps (ρ, θ, φ) to the point (x, y, z) ∈ R³ whose spherical coordinates are (ρ, θ, φ). (Figure: the spherical coordinates of a point.)
When ρ = 0, we are at the origin. Then changes in θ and φ do not change the point (x, y, z).
Thus, the function is not locally invertible when ρ = 0. Similarly, when φ is a multiple of
π, we are on the z-axis, and changes in θ do not change the point. We will discuss spherical
coordinates in more detail in Section 4.5. △
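The determinant just used can be checked by finite differences. This sketch (our own helper names; the step size and tolerance are our choices) compares a numerical Jacobian determinant of the spherical-coordinates map with −ρ² sin φ:

```python
import math

def f(rho, theta, phi):
    """Spherical coordinates map of Example 1.12.2: (rho, theta, phi) -> (x, y, z)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def jacobian_det(rho, theta, phi, h=1e-6):
    """Approximate det Df by central finite differences in each variable."""
    cols = []
    for i in range(3):
        p_plus, p_minus = [rho, theta, phi], [rho, theta, phi]
        p_plus[i] += h
        p_minus[i] -= h
        fp, fm = f(*p_plus), f(*p_minus)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # m[i][j] = d f_i / d var_j (rows are components, columns are variables)
    m = [[cols[j][i] for j in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

rho, theta, phi = 2.0, 0.7, 1.1
print(jacobian_det(rho, theta, phi))   # ≈ -rho^2 sin(phi)
print(-rho**2 * math.sin(phi))
```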
Warning 1.12.3. The converse to Theorem 1.12.1 is false! See Exercise 1.12.1.
Exercises.
1.12.1. Show that the converse to Theorem 1.12.1 is false by giving a counterexample. Hint:
Try n = 1 and f (x) = x3 .
Consider the map
f : R² \ {(0, 0)} → R², f(x, y) = ((x² − y²)/(x² + y²), xy/(x² + y²)).
Does this map have a local inverse near (x, y) = (0, 1)?
det DF (r0 , θ0 ) = r0 .
(b) When does there exist a smooth inverse function (r(x, y), θ(x, y))? Check directly and
with the inverse function theorem.
Chapter 2

Extrema
One of the important applications of single-variable calculus that you saw in previous courses
is finding maxima and minima of functions of one variable. We would now like to extend
these techniques to functions of several variables. A good reference for the material in this
chapter is [FRYc, §2.9, §2.10].
Remark 2.1.3. If a ∈ ∂A (that is, a is a boundary point of A), then f is not differentiable
at a. Hence, all boundary points are critical points.
Remark 2.1.4. Some texts (e.g. [FRYc]) use the term critical point only to refer to points a
such that ∇f (a) = 0, and call a ∈ A a singular point if f is not differentiable at a.
You learned in single-variable calculus that every continuous function defined on a closed
interval attains a global maximum and minimum. This result can be generalized to multiple
variables.
Definition 2.1.5 (Bounded, compact). We say that the set A ⊆ Rn is bounded if there
exists some r > 0 such that A ⊆ Br (0). If A is bounded and contains its boundary (that is,
∂A ⊆ A), we say A is compact.
Examples 2.1.6. (a) For r > 0 and a ∈ Rn , the open ball Br (a) is not compact since it
does not contain its boundary.
(b) For r > 0 and a ∈ Rn , the closed ball {x : ∥x − a∥ ≤ r} is compact.
(c) The sphere {(x, y, z) : x2 + y 2 + z 2 = 1} is compact since it is bounded and has no
boundary (hence it contains its boundary, since the empty set is contained in every
set).
(d) The set [0, 1]2 is compact.
(e) The set A = [0, 1] × (0, 1] is not compact, since it does not contain its boundary. For
example, (0.5, 0) is a boundary point of A that is not contained in A.
(f) The set A = R × [0, 1] is not compact, since it is not bounded.
△
Proof. The proof of this theorem uses concepts from real analysis, and is beyond the scope of this course. For an idea of the proof, see http://www.cut-the-knot.org/fta/fta_note.shtml.
Example 2.1.8. Let a, b ∈ R with a < b. The interval [a, b] ⊆ R is bounded and contains its boundary, hence is compact. Then Theorem 2.1.7 reduces to the extreme value theorem you saw in single-variable calculus: any continuous function f : [a, b] → R attains a global maximum and minimum. △
Note that this is equivalent to the image f (A) of f being a bounded set in the sense of
Definition 2.1.5.
Warning 2.1.11. The converse to Theorem 2.1.10 is false, as the following example illustrates.
Consider the function f(x, y) = x² − y², whose graph is the saddle surface z = x² − y².
We have ∇f = (2x, −2y). Hence ∇f(0, 0) = 0, and so (0, 0) is a critical point. However, (0, 0) is not a local extremum. We have f(0, 0) = 0. However, for any r > 0, the ball Br(0, 0) contains the points (r/2, 0) and (0, r/2). Since
f(r/2, 0) = r²/4 > 0 and f(0, r/2) = −r²/4 < 0,
the function f takes values both larger and smaller than f(0, 0) in every ball around (0, 0), and so (0, 0) is neither a local maximum nor a local minimum.
Example 2.1.14. Consider the function
f : R² → R, f(x, y) = x²y + xy² + xy.
We have
∂f/∂x = 2xy + y² + y = y(2x + y + 1),
∂f/∂y = x² + 2xy + x = x(x + 2y + 1).
Thus, ∇f = (0, 0) if and only if
(y = 0 or 2x + y + 1 = 0) and (x = 0 or x + 2y + 1 = 0).
This gives four cases:
• x = 0 and y = 0.
• y = 0 and x + 2y + 1 = 0, which implies that x = −1.
• 2x + y + 1 = 0 and x = 0, which implies that y = −1.
• 2x + y + 1 = 0 and x + 2y + 1 = 0. Solving this linear system (exercise!) gives x = y = −1/3.
Thus, the critical points of f are
(0, 0), (−1, 0), (0, −1), (−1/3, −1/3).
If f has any local extrema, then they must be amongst these points. We will return to this
example in Example 2.2.5. △
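We can sanity-check the four candidate points by evaluating the gradient at each of them (a small sketch; the helper name `grad_f` is ours):

```python
# Gradient of f(x, y) = x^2 y + x y^2 + x y from Example 2.1.14.
def grad_f(x, y):
    return (2*x*y + y**2 + y, x**2 + 2*x*y + x)

critical_points = [(0, 0), (-1, 0), (0, -1), (-1/3, -1/3)]
for p in critical_points:
    print(p, grad_f(*p))   # each gradient is (0, 0) up to rounding
```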
Exercises.
2.1.1. Consider the function f(x, y) = 2(x² + y²)e^{−x²−y²}, whose graph is the surface z = 2(x² + y²)e^{−x²−y²}.
(Note that a matrix can be neither positive definite nor negative definite.)
For a matrix
B = [b₁₁, ⋯, b₁ₙ; ⋮; bₙ₁, ⋯, bₙₙ] ∈ Matₙ×ₙ(R),
Hf(a) = [∂²f/∂x₁∂x₁ (a), ⋯, ∂²f/∂x₁∂xₙ (a); ⋮; ∂²f/∂xₙ∂x₁ (a), ⋯, ∂²f/∂xₙ∂xₙ (a)].
If f is of class C² at a, then it follows from Clairaut's theorem (Theorem 1.9.6) that Hf(a) is a symmetric matrix.
Theorem 2.2.3 (Second derivative test). Suppose A ⊆ Rn is open, x0 ∈ A, f : A → R is of
class C 2 , and ∇f (x0 ) = 0.
(a) If Hf (x0 ) is positive definite, then x0 is a local minimum.
(b) If Hf (x0 ) is negative definite, then x0 is a local maximum.
(c) If det(Hf (x0 )) ̸= 0 and Hf (x0 ) is neither positive definite nor negative definite, then
x0 is a saddle point.
(d) If det(Hf (x0 )) = 0 then we cannot draw any conclusion without additional work.
Proof. For a detailed proof, see [MT11, §3.3, Th. 5]. For a general idea, note that, when
∇f(x0) = 0, the degree 2 Taylor polynomial from (1.19) becomes
p₂(x0 + h) = f(x0) + ½ hᵀ (Hf(x0)) h.
If Hf (x0 ) is positive definite, then p2 (x0 + h) > f (x0 ) for h ̸= 0. Since p2 is a good
approximation to f near x0 , this implies that f (x) ≥ f (x0 ) for x near x0 , and hence x0 is
a local minimum. Similarly, if Hf (x0 ) is negative definite, then f (x) ≤ f (x0 ) for x near x0 ,
and so x0 is a local maximum.
It can be useful for reference purposes to give the n = 2 case of the second derivative test
explicitly.
Corollary 2.2.4 (Two-variable second derivative test). Suppose A ⊆ R2 is open, (a, b) ∈ A,
f : A → R is of class C 2 , and ∇f (a, b) = 0. Then we have
det(Hf(a, b)) = ∂²f/∂x² (a, b) · ∂²f/∂y² (a, b) − (∂²f/∂x∂y (a, b))².
(a) If ∂²f/∂x² (a, b) > 0 and det(Hf(a, b)) > 0,
then (a, b) is a local minimum.
(b) If ∂²f/∂x² (a, b) < 0 and det(Hf(a, b)) > 0,
then (a, b) is a local maximum.
(c) If det(Hf(a, b)) < 0,
then (a, b) is a saddle point.
(d) If det(Hf(a, b)) = 0,
then we cannot draw any conclusion without additional work.
Example 2.2.5. Consider again the function
f : R2 → R, f (x, y) = x2 y + xy 2 + xy.
In Example 2.1.14, we found that the critical points of f are
− 31 , − 31 .
(0, 0), (−1, 0), (0, −1),
The second partial derivatives of f are
∂²f/∂x² = 2y, ∂²f/∂x∂y = 2x + 2y + 1, ∂²f/∂y² = 2x.
Therefore, the Hessian matrix is
Hf = [2y, 2x + 2y + 1; 2x + 2y + 1, 2x],
and so
Hf(0, 0) = [0, 1; 1, 0],   Hf(−1, 0) = [0, −1; −1, −2],
Hf(0, −1) = [−2, −1; −1, 0],   Hf(−1/3, −1/3) = [−2/3, −1/3; −1/3, −2/3].
We can then look at the critical points:
• Since det(Hf (0, 0)) = −1 < 0, the point (0, 0, 0) is a saddle point.
• Since det(Hf (−1, 0)) = −1 < 0, the point (−1, 0, 0) is a saddle point.
• Since det(Hf (0, −1)) = −1 < 0, the point (0, −1, 0) is a saddle point.
• Since
∂²f/∂x² (−1/3, −1/3) = −2/3 < 0 and det Hf(−1/3, −1/3) = (−2/3)² − (−1/3)² = 1/3 > 0,
the point (−1/3, −1/3, 1/27) is a local maximum.
(Graph of z = f(x, y), showing the three saddle points and the local maximum.)
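The classification above can be automated for 2 × 2 Hessians using the two-variable second derivative test (Corollary 2.2.4). A sketch, with our own helper names and the Hessian of Example 2.2.5 hard-coded:

```python
# Hessian of f(x, y) = x^2 y + x y^2 + x y (Example 2.2.5).
def hessian(x, y):
    return [[2*y, 2*x + 2*y + 1], [2*x + 2*y + 1, 2*x]]

def classify(x, y):
    """Apply the two-variable second derivative test at a critical point."""
    h = hessian(x, y)
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    if det < 0:
        return "saddle point"
    if det > 0:
        return "local minimum" if h[0][0] > 0 else "local maximum"
    return "inconclusive"

for p in [(0, 0), (-1, 0), (0, -1), (-1/3, -1/3)]:
    print(p, classify(*p))
```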
Exercises.
Exercises from [FRYb, §2.9] : Q4–Q29
Example 2.3.1. Consider the function f(x, y) = √3 x − y. We want to find the maximum or minimum value of this function, but not on its entire domain R². Rather, we want to find the maximum value on the unit circle
S = {(x, y) : x² + y² = 1}.
One way to solve this problem would be to parameterize the circle, compose f with the
parameterization, then find the extrema as usual. Since
S = {(cos t, sin t) : t ∈ [0, 2π]},
we can find the extrema of
h(t) := f(cos t, sin t) = √3 cos t − sin t.
This is now a single-variable calculus problem. We solve
0 = h′(t) = −√3 sin t − cos t ⟹ tan t = sin t / cos t = −1/√3 ⟹ t = −π/6 or t = 5π/6.
At t = −π/6, we have
cos t = √3/2, sin t = −1/2, f(cos t, sin t) = 3/2 + 1/2 = 2.
At t = 5π/6, we have
cos t = −√3/2, sin t = 1/2, f(cos t, sin t) = −3/2 − 1/2 = −2.
Since the domain of h is [0, 2π], which is a closed and bounded interval (hence compact), we know that h attains a maximum and minimum (by the extreme value theorem). Therefore, the maximum value of f on the unit circle is 2, at the point (√3/2, −1/2), and the minimum value of f on the unit circle is −2, at the point (−√3/2, 1/2). △
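The parameterization method above can be checked by brute force: sample f along the circle and take the largest and smallest values (the sample count and tolerances are our choices):

```python
import math

# f(x, y) = sqrt(3) x - y from Example 2.3.1, sampled on the unit circle
# via the parameterization (cos t, sin t), t in [0, 2*pi).
def f(x, y):
    return math.sqrt(3) * x - y

samples = [2 * math.pi * k / 100_000 for k in range(100_000)]
vmax, tmax = max((f(math.cos(t), math.sin(t)), t) for t in samples)
vmin, tmin = min((f(math.cos(t), math.sin(t)), t) for t in samples)
print(vmax, math.cos(tmax), math.sin(tmax))   # ≈ 2 at (√3/2, −1/2)
print(vmin, math.cos(tmin), math.sin(tmin))   # ≈ −2 at (−√3/2, 1/2)
```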
f |S : S → B, f |S (x) := f (x),
is the restriction of f to S. Finding extrema for f|S is the same as finding extrema for f subject to the constraint that points lie in S. For example, a ∈ S is a local minimum for f|S if there exists r > 0 such that f(a) ≤ f(x) for all x ∈ S ∩ Br(a).
S = {x ∈ U : g(x) = k}.
If x0 ∈ S is a local extremum for f |S , then ∇f (x0 ) and ∇g(x0 ) are linearly dependent, that
is, either
• ∇g(x0 ) = 0, or
• ∇f(x0) = λ∇g(x0) for some λ ∈ R (possibly λ = 0). The number λ is called a Lagrange multiplier.
We call a point x0 where ∇f (x0 ) and ∇g(x0 ) are linearly dependent a critical point.
p : [−1, 1] → S, p(0) = x0 .
(Figure: a path p(t) in S through x0, with tangent vector p′(0) and the normal vector ∇g(x0).)
the function f ◦ p has a local extremum at 0. Since it is of class C¹, this implies that its derivative at 0 is zero. Using the chain rule (Theorem 1.6.2), we have
0 = (f ◦ p)′(0) = ∇f(p(0)) · p′(0) = ∇f(x0) · p′(0).
Since this is true for all paths in S that pass through x0 at time 0, we see that ∇f (x0 ) is
orthogonal to all tangent vectors to S at x0 . Thus, ∇f (x0 ) is orthogonal to S at x0 .
But we already know, by Proposition 1.8.6, that ∇g(x0 ) is orthogonal to S at x0 . There-
fore, ∇f (x0 ) and ∇g(x0 ) are parallel vectors. In other words, they are linearly depen-
dent.
Example 2.4.2. Let’s return to Example 2.3.1 and find the extrema using the method of
Lagrange multipliers. We define
g(x, y) = x2 + y 2 ,
and we want to find the extrema of
f : R² → R, f(x, y) = √3 x − y,
subject to the constraint g(x, y) = 1. Since ∇f = (√3, −1) and ∇g = (2x, 2y), the Lagrange equations are
√3 = 2λx, −1 = 2λy, x² + y² = 1.
It follows from the first two equations that λ ≠ 0. Thus, we can solve these first two equations for x and y, giving
(x, y) = (√3/(2λ), −1/(2λ)) = (1/λ)(√3/2, −1/2). (2.1)
Note that this is the parametric equation of a line through the origin (with the origin removed,
since 1/λ can never be zero). We want to find the points where this line intersects the unit
circle.
To find these points of intersection, we substitute the expression for x and y from (2.1) into the constraint, to get
1 = (√3/(2λ))² + (−1/(2λ))² ⟹ 4λ² = 3 + 1 ⟹ λ² = 1 ⟹ λ = ±1.
Substituting these two values for λ into (2.1) yields the points
(x, y) = (√3/2, −1/2) and (x, y) = (−√3/2, 1/2).
S = {(x, y, z) : x2 + y 2 + z 2 = 6}
(Figure: the sphere S and the point (3, 1, −2).)
Instead of minimizing the distance, we will minimize the square of the distance from
(3, 1, −2), since this avoids some nasty square roots. (Since the square function t 7→ t2
is increasing, the square of the distance attains its minimum precisely where the distance
attains its minimum.) So we want to minimize the function
f(x, y, z) = (x − 3)² + (y − 1)² + (z + 2)²
subject to the constraint x² + y² + z² = 6. The Lagrange equations are
2(x − 3) = 2λx ⟺ x = 3/(1 − λ), (2.2)
2(y − 1) = 2λy ⟺ y = 1/(1 − λ), (2.3)
2(z + 2) = 2λz ⟺ z = −2/(1 − λ), (2.4)
x² + y² + z² = 6. (2.5)
(Note that, in the first equation, λ = 1 yields the contradiction −6 = 0. So we can assume
λ ̸= 1, allowing us to divide by 1 − λ.) Substituting (2.2) to (2.4) into (2.5) gives
14/(1 − λ)² = 6 ⟹ 6(1 − λ)² = 14 ⟹ 3λ² − 6λ − 4 = 0 ⟹ λ = 1 ± √21/3,
where we used the quadratic formula in the last step. Substituting these values of λ into (2.2) to (2.4) gives
(x, y, z) = (∓9/√21, ∓3/√21, ±6/√21).
The point (9/√21, 3/√21, −6/√21) is clearly closer to (3, 1, −2) than the point (−9/√21, −3/√21, 6/√21) is. So the nearest point is (9/√21, 3/√21, −6/√21) and the farthest point is (−9/√21, −3/√21, 6/√21). △
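We can verify that both candidate points lie on the sphere and that the first is indeed the closer one (a sketch; the variable names are ours):

```python
import math

target = (3, 1, -2)
s = math.sqrt(21)
near = (9/s, 3/s, -6/s)     # claimed nearest point
far = (-9/s, -3/s, 6/s)     # claimed farthest point

def dist(p, q):
    return math.sqrt(sum((a - b)**2 for a, b in zip(p, q)))

# Both candidates satisfy the constraint x^2 + y^2 + z^2 = 6.
for p in (near, far):
    print(sum(c * c for c in p))        # 6
print(dist(near, target) < dist(far, target))   # True
```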
Exercises.
Exercises from [FRYb, §2.10] : Q1–Q9, Q12, Q13, Q15–25, Q27, Q28.
f, g1 , . . . , gr : U → R
f : R3 → R, f (x, y, z) = x − y,
contains its boundary. Thus, by the extreme value theorem (Theorem 2.1.7) f |S attains a
maximum and a minimum. So we can use the method of Lagrange multipliers to find these
extrema.
We compute
∇f = (1, −1, 0), ∇g1 = (2x, 2y, 2z), ∇g2 = (1, 1, 1).
Since ∇f and ∇g2 are clearly linearly independent¹, these three vectors are linearly dependent if and only if ∇g1 can be written as a linear combination of ∇f and ∇g2. So we have
have
∇g1 = λ1 ∇f + λ2 ∇g2
for some λ1 , λ2 ∈ R. Therefore, we want to solve the following equations:
2x = λ1 + λ2, (2.6)
2y = −λ1 + λ2, (2.7)
2z = λ2, (2.8)
x² + y² + z² = 11, (2.9)
x + y + z = 3. (2.10)
¹Recall that two vectors are linearly dependent if and only if one is a scalar multiple of the other.
Adding (2.6) and (2.7), and comparing with (2.8), gives
2x + 2y = 2λ2 = 4z ⟹ x + y = 2z.
Substituting into (2.10) gives
2z + z = 3 ⟹ z = 1.
Therefore
x + y = 2 ⟹ x = 2 − y.
Substituting into (2.9) then gives
(2 − y)² + y² + 1² = 11
⟹ y² − 2y − 3 = 0
⟹ (y − 3)(y + 1) = 0.
Thus y = 3 or y = −1, giving the critical points (−1, 3, 1) and (3, −1, 1). Since f(−1, 3, 1) = −4 and f(3, −1, 1) = 4, the maximum value of f|S is 4 and the minimum value is −4.
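As a check, the constraint set of (2.9) and (2.10) is a circle, which we can parameterize and sample. The parameterization below (centre (1, 1, 1), radius √8, with an orthonormal basis of the plane x + y + z = 3) is our own derivation, not from the text:

```python
import math

c = (1.0, 1.0, 1.0)                              # centre of the circle
u = (1/math.sqrt(2), -1/math.sqrt(2), 0.0)       # orthonormal basis of the
v = (1/math.sqrt(6), 1/math.sqrt(6), -2/math.sqrt(6))  # plane x + y + z = 3
r = math.sqrt(8)                                 # radius: 11 - |c|^2 = 8

def point(t):
    """A point on the intersection of the sphere and the plane."""
    return tuple(c[i] + r * (math.cos(t)*u[i] + math.sin(t)*v[i])
                 for i in range(3))

ts = [2 * math.pi * k / 100_000 for k in range(100_000)]
fmax, tmax = max((point(t)[0] - point(t)[1], t) for t in ts)  # f = x - y
fmin, tmin = min((point(t)[0] - point(t)[1], t) for t in ts)
print(fmax, point(tmax))   # ≈ 4 at (3, -1, 1)
print(fmin, point(tmin))   # ≈ -4 at (-1, 3, 1)
```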
Exercises.
Exercises from [FRYb, §2.10] : Q10, Q11, Q14, Q26.
this is usually not so hard for functions of a single variable, but generally much harder for
multivariable functions.
In this course, we will focus mainly on the case where S is compact. Typically, our set S
will be a compact set of the form
S = {x : g(x) ≤ k}
for some C 1 function g : A → R, A ⊆ Rn , and some scalar k ∈ R. Then the boundary will
often (but not always!) be given by
∂S = {x : g(x) = k}.
We then split our search for the local extrema into two parts:
(a) We find the critical points in S \ ∂S. For this we use the first derivative test.
(b) We find the critical points in the boundary ∂S. For this we use the method of Lagrange
multipliers.
Example 2.6.1. Let’s find the global extrema (if they exist) of
f (x, y, z) = x2 + y 3 + z 3
on the set
S = {(x, y, z) ∈ R3 : g(x, y, z) := x2 + 3y 2 + 3z 2 ≤ 100}.
So S is a solid ellipsoid. Since S is closed and bounded, it is compact.
Therefore, by the extreme value theorem (Theorem 2.1.7), f attains a global minimum and
maximum on S.
First we look for critical points for f in the interior, S \ ∂S. So we solve
(0, 0, 0) = ∇f = (2x, 3y 2 , 3z 2 ) =⇒ x = y = z = 0.
Don’t forget to check that the points you find are actually in S! Here we have
02 + 2 · 02 + 3 · 02 = 0 ≤ 100,
and so (0, 0, 0) ∈ S.
Now we consider the boundary of S, which is the ellipsoid
∂S = {(x, y, z) ∈ R3 : g(x, y, z) = 100}.
We have
∇g = (2x, 6y, 6z).
Note that ∇g = (0, 0, 0) would imply that x = y = z = 0. Since (0, 0, 0) ∉ ∂S, we see that ∇g is never zero on ∂S. So ∇f and ∇g are linearly dependent if and only if ∇f = λ∇g for
some λ ∈ R. Therefore the Lagrange equations become:
2x = 2λx (2.11)
3y 2 = 6λy (2.12)
3z 2 = 6λz (2.13)
x2 + 3y 2 + 3z 2 = 100 (2.14)
From (2.11) to (2.13), we have
(λ − 1)x = 0 =⇒ x = 0 or λ = 1,
(y − 2λ)y = 0 =⇒ y = 0 or y = 2λ,
(z − 2λ)z = 0 =⇒ z = 0 or z = 2λ.
Considering all the above possibilities, together with the constraint (2.14), we have the
following critical points (including the one we found in the interior of S), together with the
corresponding value of the function f :
Next, let's find the global extrema (if they exist) of
f(x, y) = x² + 2y
on the set
S = {(x, y) ∈ R² : g(x, y) := 2x + y² ≤ 3}.
The set S is the region of the plane lying to the left of the parabola 2x + y² = 3.
Its boundary is
∂S = {(x, y) : g(x, y) = 3}.
Note that S is not bounded, and so we cannot apply the extreme value theorem (Theo-
rem 2.1.7) to conclude that f has a global maximum or minimum on the set S.
Nevertheless, we begin by finding the critical points in the interior of S. Since ∇f = (2x, 2) is never zero, there are no critical points in the interior. On the boundary ∂S we have ∇g = (2, 2y), which is never zero, so the Lagrange equations are
2x = 2λ, (2.15)
2 = 2λy, (2.16)
2x + y² = 3. (2.17)
The equation (2.15) gives that x = λ, while (2.16) implies that y = 1/λ. (Note that λ = 0 contradicts (2.16), so we can divide by λ.) Substituting into (2.17) gives
2λ + (1/λ)² = 3 ⟹ 2/y + y² = 3.
(Note that y = 0 would contradict (2.16), and so we have y ≠ 0.) Multiplying by y, we have
y³ − 3y + 2 = 0.
Since y = 1 is clearly a solution, we can use polynomial division to find the other factors:
y³ − 3y + 2 = (y − 1)²(y + 2).
Thus y = 1 or y = −2. Since x = λ = 1/y, the corresponding critical points are
(1, 1) and (−1/2, −2).
Now,
f(1, 1) = 1² + 2 · 1 = 3 and f(−1/2, −2) = (−1/2)² + 2(−2) = −15/4.
Are the points we found global extrema? Note that (x, 0) ∈ S for arbitrarily large negative x. Since f(x, 0) = x², this means that f takes arbitrarily large values on S. So f has no global maximum. Since there is no other critical point on S, the function f does attain a minimum value of −15/4 at the point (−1/2, −2). (We use here the fact that f(x, y) → ∞ as ∥(x, y)∥ → ∞ within S.) △
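A quick numerical check of the boundary analysis: the cubic has exactly the claimed roots, and the two critical points give the claimed values of f (a sketch; names are ours):

```python
# The cubic from the boundary of the last example factors as (y-1)^2 (y+2),
# so its roots are y = 1 and y = -2.
for y in (1.0, -2.0):
    print(y**3 - 3*y + 2)   # 0.0

def f(x, y):
    return x**2 + 2*y

print(f(1, 1))         # 3
print(f(-0.5, -2.0))   # -3.75  (i.e., -15/4)
```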
Exercises.
2.6.1. Find the absolute maximum and minimum values of f (x, y) = x2 + 3xy + y 2 + 7 on
D = {(x, y) : x2 + y 2 ≤ 1}.
2.6.2. Find and classify the extreme values (if any) of the following functions defined on R2 :
(a) x² − y⁵
(b) (x + 1)² + (x − y)²
(c) x² + xy² + y⁴
2.6.3. Find the maximum and minimum values of f (x, y) = xy − y + x − 1 subject to the
condition x2 + y 2 ≤ 2.
Chapter 3

Double and triple integrals

(Figure: a region D in the plane lying over an interval [a, b], with bottom boundary y = B(x).)
Now, if we wanted to compute the area of D (instead of its mass), we learned in single
variable calculus that we should approximate the region by rectangles, which we call slices.
We do this by splitting the interval [a, b] into n subintervals and choosing a point x∗i in the
i-th interval. Then, for each interval, we draw a rectangle over that interval, with top edge
at height T (x∗i ) and bottom edge at height B(x∗i ).
(Figure: the i-th rectangle, with bottom edge through (xᵢ*, B(xᵢ*)).)
The total area of these rectangles is
Σᵢ₌₁ⁿ (T(xᵢ*) − B(xᵢ*)) Δx, where Δx = (b − a)/n.
We then take the limit as n → ∞. If this limit exists, it is the area between the two curves:
Area = limₙ→∞ Σᵢ₌₁ⁿ (T(xᵢ*) − B(xᵢ*)) Δx = ∫ₐᵇ (T(x) − B(x)) dx.
We need to modify this procedure in order to calculate the mass of D, instead of its area.
To do this, we need to replace our approximation (T (x∗i ) − B(x∗i ))∆x of the area of the i-th
slice of D by an approximation of the mass of this slice. To do this, we subdivide the slice
into m rectangles of height
Δy = (T(xᵢ*) − B(xᵢ*))/m.
Let
yj = B(x∗i ) + j∆y
be the y-coordinate of the top of the j-th rectangle. For each j = 1, . . . , m, we choose a
number yj∗ between yj−1 and yj . We then approximate the density on rectangle j in slice i
Taking the limit as m → ∞ (that is, the limit as the height of the rectangles goes to zero),
the Riemann sum becomes an integral:
Mass of slice i ≈ Δx ∫_{B(xᵢ*)}^{T(xᵢ*)} f(xᵢ*, y) dy = F(xᵢ*) Δx,
where
F(x) = ∫_{B(x)}^{T(x)} f(x, y) dy.
Notice that we started with a function f (x, y), and we have “integrated out” the variable y,
leaving a function F (x) of one variable.
Now we take the limit as n → ∞ (that is, the limit as the width of the slices goes to zero), giving
Mass = limₙ→∞ Σᵢ₌₁ⁿ Δx ∫_{B(xᵢ*)}^{T(xᵢ*)} f(xᵢ*, y) dy = limₙ→∞ Σᵢ₌₁ⁿ F(xᵢ*) Δx
= ∫ₐᵇ F(x) dx = ∫ₐᵇ ( ∫_{B(x)}^{T(x)} f(x, y) dy ) dx = ∫ₐᵇ ∫_{B(x)}^{T(x)} f(x, y) dy dx.
The last three integrals are called iterated integrals. Note the placement of dx and dy in
these iterated integrals. The integral signs (and their limits of integration) match up with
the dx or dy in the same way that you match parentheses in a mathematical expression.
Example 3.1.2. Let D be the region above the curve x² = 9y and to the right of the curve 3y² = x. Let's compute
∬_D xy dx dy.
First, we find where the curves
x² = 9y and 3y² = x
intersect. We have
y = x²/9 = (3y²)²/9 = y⁴ ⟹ y⁴ − y = 0 ⟹ y(y³ − 1) = 0 ⟹ y = 0 or y = 1.
Therefore, the curves intersect at (0, 0) and (3, 1). (Figure: the region D above x² = 9y and to the right of 3y² = x.)
For 0 ≤ x ≤ 3, the bottom boundary of the region D is given by y = x²/9 and the top boundary is given by y = √(x/3). Thus, we have
∬_D xy dx dy = ∫₀³ ( ∫_{x²/9}^{√(x/3)} xy dy ) dx
= ∫₀³ [½ xy²]_{y=x²/9}^{y=√(x/3)} dx
= ∫₀³ (x²/6 − x⁵/162) dx
= [x³/18 − x⁶/(6 · 162)]₀³
= 3/2 − 3/4
= 3/4.
Note that we first integrated with respect to y, treating x as a constant. Then we integrated
with respect to x. △
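We can confirm the value 3/4, and preview the horizontal-slice method of Section 3.2, with a midpoint-rule approximation in both orders of integration (the grid size N and tolerance are our choices):

```python
import math

N = 400  # grid resolution for the midpoint rule

def vertical():
    """Integrate xy over D: x from 0 to 3, y from x^2/9 to sqrt(x/3)."""
    total, dx = 0.0, 3.0 / N
    for i in range(N):
        x = (i + 0.5) * dx
        lo, hi = x*x/9, math.sqrt(x/3)
        dy = (hi - lo) / N
        total += sum(x * (lo + (j + 0.5)*dy) for j in range(N)) * dy * dx
    return total

def horizontal():
    """Same integral: y from 0 to 1, x from 3y^2 to 3*sqrt(y)."""
    total, dy = 0.0, 1.0 / N
    for j in range(N):
        y = (j + 0.5) * dy
        lo, hi = 3*y*y, 3*math.sqrt(y)
        dx = (hi - lo) / N
        total += sum((lo + (i + 0.5)*dx) * y for i in range(N)) * dx * dy
    return total

print(vertical(), horizontal())   # both ≈ 0.75
```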
This is precisely the formula for the area between two curves that you learned in single-
variable calculus! △
Exercises.
Exercises are deferred until Section 3.6.
(Figure: the region D bounded by the line x = 2y + 1 and the parabola 3y² = x, which intersect at (1/3, −1/3) and (3, 1).)
Using the vertical slice method, we would have to split the region D into two pieces. For 0 ≤ x ≤ 1/3, the region is bounded above by the curve y = √(x/3) and below by the curve y = −√(x/3). Then, for 1/3 ≤ x ≤ 3, the region is bounded above by y = √(x/3) and below by y = (x − 1)/2. On the other hand, the region D is more simply described as the region to the right of the curve 3y² = x and to the left of the line x = 2y + 1.
Let’s consider the general setup. Suppose we want to integrate the function f (x, y) over
the region
c
x
To integrate using horizontal slices, we follow the procedure of Section 3.1, but we draw our
slices horizontally and reverse the roles of x and y everywhere.
(Figure: the region bounded on the left by x = L(y) and on the right by x = R(y), for c ≤ y ≤ d, with a horizontal slice at height yᵢ* subdivided into rectangles.)
Interchanging x and y in our discussion from Section 3.1 then tells us that, in order to
integrate the function f (x, y) over the region D, we compute the double integral
∫_c^d ( ∫_{L(y)}^{R(y)} f(x, y) dx ) dy.
Example 3.2.2. Let’s compute the integral from Example 3.1.2, but using horizontal slices
instead of vertical slices. So we want to compute
∬_D xy dx dy,
where D is the region of Example 3.1.2.
Exercises.
Exercises are deferred until Section 3.6.
R = [a, b] × [c, d] ⊆ R².
A regular partition of R of order n is the collection of points {xⱼ}ⱼ₌₀ⁿ and {yₖ}ₖ₌₀ⁿ satisfying
a = x₀ ≤ x₁ ≤ ⋯ ≤ xₙ = b, c = y₀ ≤ y₁ ≤ ⋯ ≤ yₙ = d,
where
xⱼ₊₁ − xⱼ = (b − a)/n, yₖ₊₁ − yₖ = (d − c)/n
for 0 ≤ j, k ≤ n − 1. Using these equally spaced points, we subdivide the rectangle R.
Let
Rjk = [xj , xj+1 ] × [yk , yk+1 ]
and let cjk be any point in Rjk . Suppose f : R → R is bounded. For example, since R is
compact, it follows from Lemma 2.1.9 that f is bounded if f is continuous. Form the sum
Sₙ = Σⱼ,ₖ₌₀ⁿ⁻¹ f(cⱼₖ) Δx Δy = Σⱼ,ₖ₌₀ⁿ⁻¹ f(cⱼₖ) ΔA, (3.1)
where
Δx = (b − a)/n, Δy = (d − c)/n, ΔA = Δx Δy.
Thus ∆A is the area of each of the subrectangles into which we have divided R. The sum
(3.1) is called a Riemann sum for f .
We can now give the precise definition of the double integral.
Definition 3.3.1 (Double integral). If the sequence (Sₙ)ₙ₌₁^∞ converges to a limit S as n → ∞, and if the limit S is the same for any choice of points cⱼₖ in the rectangles Rⱼₖ, then we say that f is integrable over R, and we write
∬_R f(x, y) dA, ∬_R f(x, y) dx dy, or ∬_R f dx dy
for the limit S.
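Definition 3.3.1 can be illustrated directly: the Riemann sums Sₙ of a simple integrable function settle down as n grows. A sketch using the midpoint of each subrectangle as cⱼₖ (the test function and rectangle are our choices):

```python
# Riemann sums S_n from (3.1) for f(x, y) = x^2 + y on R = [0, 1] x [0, 2],
# taking c_jk to be the midpoint of each subrectangle R_jk.
def riemann_sum(f, a, b, c, d, n):
    dx, dy = (b - a) / n, (d - c) / n
    total = 0.0
    for j in range(n):
        for k in range(n):
            cjk = (a + (j + 0.5) * dx, c + (k + 0.5) * dy)
            total += f(*cjk) * dx * dy
    return total

for n in (4, 16, 64):
    print(riemann_sum(lambda x, y: x*x + y, 0, 1, 0, 2, n))
# The sums converge to the double integral, here 2/3 + 2 = 8/3.
```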
(Figure: the graph z = f(x, y) of a function with a jump discontinuity along a curve.) This function is continuous everywhere except along a continuous curve (in the domain R). It turns out that this type of function is also integrable.
Theorem 3.3.3. Suppose R is a closed rectangle and f : R → R is bounded. Furthermore,
suppose that the set of points where f is discontinuous lies on a finite union of graphs of
continuous functions. Then f is integrable over R.
Proof. We will not prove this theorem in this course. The basic idea behind the proof is the
same as that for single-variable functions, where a function with only finitely many disconti-
nuities is integrable. In the limit defining the integral, the terms involving the discontinuities
become zero.
72 Double and triple integrals
Theorem 3.3.3 will be key in handling integrals over more general regions, and relating Definition 3.3.1 to the vertical and horizontal slice methods of Sections 3.1 and 3.2. Before we do this, we state some properties of integrals that follow from Definition 3.3.1 and properties of limits.
(a) Linearity:
∬_R (f(x, y) + g(x, y)) dA = ∬_R f(x, y) dA + ∬_R g(x, y) dA.
(b) Homogeneity:
∬_R c f(x, y) dA = c ∬_R f(x, y) dA.
|∬_R f dA| ≤ ∬_R |f| dA. (3.2)
Proof. Since
−|f (x, y)| ≤ f (x, y) ≤ |f (x, y)| for all (x, y) ∈ R,
we can use monotonicity and homogeneity of integration (with c = −1) to conclude that
−∬_R |f| dA ≤ ∬_R f dA ≤ ∬_R |f| dA,
which implies (3.2).
Exercises.
Exercises are deferred until Section 3.6.
Fubini’s theorem 73
= ∫₀³ [x ln y]_{y=e}^{y=2e} dx
= ∫₀³ x ln 2 dx
= [½ x² ln 2]_{x=0}^{x=3}
= (9/2) ln 2.
As expected, we obtain the same answer. △
(a) If the integral ∫_c^d f(x, y) dy exists for all x ∈ [a, b], then ∫ₐᵇ ∫_c^d f(x, y) dy dx exists and
∫ₐᵇ ∫_c^d f(x, y) dy dx = ∬_R f(x, y) dA.
(b) If the integral ∫ₐᵇ f(x, y) dx exists for all y ∈ [c, d], then
∫_c^d ∫ₐᵇ f(x, y) dx dy = ∬_R f(x, y) dA.
Then the integral ∬_R f dA does not exist. This is because, if we choose the points cⱼₖ in Definition 3.3.1 to have rational coordinates, the Riemann sums are all equal to 1, whereas if we choose them to have irrational coordinates, the Riemann sums are all equal to 0. △
Remark 3.4.5. If you take more advanced courses in analysis, you will learn about the
Lebesgue integral which does exist for the function f of Example 3.4.4, and is equal to
0, essentially because there are way more irrational numbers than rational numbers.
Exercises.
Exercises from [FRYb, §3.1] : Q1, Q2, Q3a
Further exercises are deferred until Section 3.6.
B, T : [a, b] → R
such that B(x) ≤ T(x) for all x ∈ [a, b]. Then
D = {(x, y) : a ≤ x ≤ b, B(x) ≤ y ≤ T(x)}
is called a y-simple region.
L, R : [c, d] → R
such that L(y) ≤ R(y) for all y ∈ [c, d]. Then
D = {(x, y) : c ≤ y ≤ d, L(y) ≤ x ≤ R(y)}
is called an x-simple region. A region that is both x-simple and y-simple is called simple.
since we have
D = {(x, y) : −1 ≤ x ≤ 1, −√(1 − x²) ≤ y ≤ √(1 − x²)}
and
D = {(x, y) : −1 ≤ y ≤ 1, −√(1 − y²) ≤ x ≤ √(1 − y²)}.
Definition 3.5.1 is related to the vertical and horizontal slice methods of Sections 3.1
and 3.2 by the following result.
(a) If
D = {(x, y) : a ≤ x ≤ b, B(x) ≤ y ≤ T (x)}
is a y-simple region, then
∬_D f(x, y) dA = ∫ₐᵇ ∫_{B(x)}^{T(x)} f(x, y) dy dx.
(b) If
D = {(x, y) : c ≤ y ≤ d, L(y) ≤ x ≤ R(y)}
is an x-simple region, then
∬_D f(x, y) dA = ∫_c^d ∫_{L(y)}^{R(y)} f(x, y) dx dy.
We can now add one more important property of integrals to those listed in Proposi-
tion 3.3.4. Recall that two sets X and Y are disjoint if X ∩ Y = ∅.
Exercises.
Exercises from [FRYb, §3.1] : Q3
Further exercises are deferred until Section 3.6.
D = {(x, y) : 0 ≤ x ≤ 1, x ≤ y ≤ x + 1}
(Figure: the region D between the lines y = x and y = x + 1, for 0 ≤ x ≤ 1.)
and
f : D → R, f (x, y) = 2y − (x + 1)2 .
Then
∬_D f dA = ∫₀¹ ∫_x^{x+1} (2y − (x + 1)²) dy dx
= ∫₀¹ [y² − (x + 1)²y]_{y=x}^{y=x+1} dx
= ∫₀¹ ((x + 1)² − (x + 1)³ − x² + x(x + 1)²) dx
= ∫₀¹ ((x + 1)²(1 − x − 1 + x) − x²) dx
= −∫₀¹ x² dx
= −[x³/3]₀¹
= −1/3.
Note that D is also an x-simple region. So we could compute the integral by first integrating
with respect to x. But this would involve splitting the region into two pieces. △
One important use of the double integral is computing volumes. You learned in single-
variable calculus how to compute the volume of certain types of solids (e.g. solids of rev-
olution). Using the double integral, we can compute more general volumes. In particular,
suppose D ⊆ R² is an elementary region, and suppose we have two functions ϕ₁, ϕ₂ : D → R with ϕ₁(x, y) ≤ ϕ₂(x, y) for all (x, y) ∈ D. Consider the solid S between the graphs z = ϕ₁(x, y) (the bottom) and z = ϕ₂(x, y) (the top), lying over D. The volume of S is given by
Vol S := ∬_D (ϕ₂(x, y) − ϕ₁(x, y)) dA. (3.3)
S = {(x, y, z) : 0 ≤ z ≤ 1 − x2 − y 2 }.
S = {(x, y, z) : (x, y) ∈ D, 0 ≤ z ≤ 1 − x2 − y 2 },
where
D = {(x, y) : x2 + y 2 ≤ 1}
is the unit disk. The region D is simple, so we can choose to describe it as an x-simple region
or as a y-simple region. Let’s describe it as an x-simple region:
p p
D = {(x, y) : −1 ≤ y ≤ 1, − 1 − y 2 ≤ x ≤ 1 − y 2 }.
Then we have
Vol S = ∬_D (1 − x² − y²) dA
= ∫₋₁¹ ∫_{−√(1−y²)}^{√(1−y²)} (1 − x² − y²) dx dy
= ∫₋₁¹ [(1 − y²)x − x³/3]_{x=−√(1−y²)}^{x=√(1−y²)} dy
= 2 ∫₋₁¹ ((1 − y²)√(1 − y²) − (1/3)(1 − y²)^{3/2}) dy
= (4/3) ∫₋₁¹ (1 − y²)^{3/2} dy.
We use the substitution
y = sin t, t ∈ [−π/2, π/2],
so that
dy = cos(t) dt and √(1 − y²) = √(1 − sin²(t)) = cos t.
Thus
Vol S = (4/3) ∫_{−π/2}^{π/2} cos⁴(t) dt.
To simplify cos⁴(t), we use the double angle formula
cos(2t) = 2cos²(t) − 1 ⟹ cos²(t) = (cos(2t) + 1)/2.
So we have
cos⁴(t) = ((cos(2t) + 1)/2)² = (cos²(2t) + 2cos(2t) + 1)/4
= (cos(4t) + 1)/8 + cos(2t)/2 + 1/4 = cos(4t)/8 + cos(2t)/2 + 3/8.
Therefore,
Vol S = (4/3) ∫_{−π/2}^{π/2} (cos(4t)/8 + cos(2t)/2 + 3/8) dt
= (4/3) [sin(4t)/(4 · 8) + sin(2t)/(2 · 2) + 3t/8]_{t=−π/2}^{π/2}
= (4/3)(3/8)(π/2 + π/2) = π/2. △
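A midpoint-rule approximation over the unit disk confirms the value π/2 (the grid size and tolerance are our choices):

```python
import math

# Approximate Vol S = double integral of (1 - x^2 - y^2) over the unit disk
# by summing over midpoints of a square grid, keeping only points in the disk.
n = 600
dx = 2.0 / n
vol = 0.0
for i in range(n):
    x = -1 + (i + 0.5) * dx
    for j in range(n):
        y = -1 + (j + 0.5) * dx
        if x*x + y*y <= 1:
            vol += (1 - x*x - y*y) * dx * dx
print(vol, math.pi / 2)   # ≈ 1.5708
```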
This iterated integral is quite hard to compute as it is. However, it will become easier once
we change the order of integration.
We first note that the above integral is equal to
∬_D (x − 1)√(1 + e^{2y}) dA,
where
D = {(x, y) : 1 ≤ x ≤ 2, 0 ≤ y ≤ ln x}.
(Figure: the region D between y = 0 and y = ln x, for 1 ≤ x ≤ 2.)
Viewed as an x-simple region,
D = {(x, y) : 0 ≤ y ≤ ln 2, e^y ≤ x ≤ 2}.
Integrating first with respect to x and then substituting
u = e^{2y}, du = 2e^{2y} dy and v = e^y, dv = e^y dy
gives
−(1/4) ∫₁⁴ √(1 + u) du + ∫₁² √(1 + v²) dv.
Both of these integrals can be computed using methods from single-variable calculus. (For
instance, the antiderivatives are included in standard lists.) For the first integral, we have
∫₁⁴ √(1 + u) du = [⅔ (1 + u)^{3/2}]₁⁴ = ⅔ (5^{3/2} − 2^{3/2}).
The second integral evaluates (using a standard antiderivative) to ½(2√5 + ln(√5 + 2)) − ½(√2 + ln(√2 + 1)). Therefore, the original integral equals
−(1/4) · (2/3)(5^{3/2} − 2^{3/2}) + ½(2√5 + ln(√5 + 2)) − ½(√2 + ln(√2 + 1))
= ½(2√5 − √2 + ln((√5 + 2)/(√2 + 1))) − (1/6)(5^{3/2} − 2^{3/2}). △
D = {(x, y) : 1 ≤ x² + y² ≤ 2}?
(Figure: the annulus D.)
The region D is neither x-simple nor y-simple. However, we can use additivity of the integral
(Proposition 3.5.3) and split D into two y-simple regions:
(Figure: D split into an upper piece D1 and a lower piece D2.)
Exercises.
Exercises from [FRYb, §3.1] : Q4–Q29.
∆V = ∆x∆y∆z
Definition 3.7.1 (Triple integral). Suppose that B = [a, b] × [c, d] × [p, q] ⊆ R³ and that f : B → R is a bounded function. If limₙ→∞ Sₙ = S exists and is independent of the choice of the points cᵢⱼₖ, then we say f is integrable and we call S the triple integral (or simply the integral) of f over B. We denote it by
∭_B f dV, ∭_B f(x, y, z) dV, or ∭_B f(x, y, z) dx dy dz.
The triple integral satisfies the properties of linearity, homogeneity, monotonicity, and
additivity. (See Propositions 3.3.4 and 3.5.3.) As for the double integral, we compute the
triple integral in practice by reducing it to an iterated integral.
As for double integrals, we often want to integrate over more general regions W , not just
boxes. As we did in Section 3.5, we do this by enclosing the region in a box and extending
our function by zero outside of W .
An elementary region in R3 is one defined by restricting one of the variables to lie between
two functions of the remaining variables, and where the domain of these two functions is an
elementary region in the plane. For example, if D ⊆ R² is an elementary region and γ₁, γ₂ : D → R are continuous functions with γ₁ ≤ γ₂, then
\[
W = \{(x, y, z) : (x, y) \in D,\ \gamma_1(x, y) \le z \le \gamma_2(x, y)\}
\]
is an elementary region in R³.
[Figure: the region W over D, lying between the graphs z = γ₁(x, y) and z = γ₂(x, y).]
\[
W = \{(x, y, z) : x^2 + y^2 + z^2 \le 1\}
\]
as an elementary region.
First note that we can write
\[
W = \Big\{(x, y, z) : (x, y) \in D,\ -\sqrt{1 - x^2 - y^2} \le z \le \sqrt{1 - x^2 - y^2}\Big\},
\]
where
D = {(x, y) : x2 + y 2 ≤ 1}
is the unit disk. We can then describe D as an x-simple or as a y-simple region. As a y-simple region, it is
\[
D = \{(x, y) : -1 \le x \le 1,\ -\sqrt{1 - x^2} \le y \le \sqrt{1 - x^2}\}.
\]
So we have described D by the inequalities
\[
-1 \le x \le 1, \qquad -\sqrt{1 - x^2} \le y \le \sqrt{1 - x^2}, \qquad -\sqrt{1 - x^2 - y^2} \le z \le \sqrt{1 - x^2 - y^2}.
\]
[Figure: the unit ball, showing the bounding surfaces z = ±√(1 − x² − y²) and the curves y = ±√(1 − x²).]
and
W = {(x, y, z) : (x, y) ∈ D, γ1 (x, y) ≤ z ≤ γ2 (x, y)}.
Then
\[
\iiint_W f(x, y, z)\, dV = \int_a^b \int_{\varphi_1(x)}^{\varphi_2(x)} \int_{\gamma_1(x, y)}^{\gamma_2(x, y)} f(x, y, z)\, dz\, dy\, dx
\]
if
D = {(x, y) : a ≤ x ≤ b, ϕ1 (x) ≤ y ≤ ϕ2 (x)}
is a y-simple region, and
\[
\iiint_W f(x, y, z)\, dV = \int_c^d \int_{\psi_1(y)}^{\psi_2(y)} \int_{\gamma_1(x, y)}^{\gamma_2(x, y)} f(x, y, z)\, dz\, dx\, dy
\]
if
D = {(x, y) : c ≤ y ≤ d, ψ1 (y) ≤ x ≤ ψ2 (y)}
is an x-simple region, provided all these integrals exist.
There are analogues of Theorem 3.7.5 for other orders of the variables.
Example 3.7.6. In the setting of Theorem 3.7.5, suppose f (x, y, z) = 1 for all (x, y, z) ∈ W .
Then
\[
\iiint_W dV = \iint_D \big(\gamma_2(x, y) - \gamma_1(x, y)\big)\, dA.
\]
Often the hardest part of computing a triple integral is finding the limits of integration.
Example 3.7.7 (cf. [MT11, Example 5.5.5]). Let W be the region bounded by the planes x = 0, y = 0, and z = 2, and the surface z = x² + y², lying in the quadrant x ≥ 0, y ≥ 0. Let's compute the integral \(\iiint_W y\, dx\, dy\, dz\).
We draw the region W from three different perspectives:
[Figure: the region W drawn from three different perspectives.]
Thus we have
\[
\begin{aligned}
\iiint_W y\, dx\, dy\, dz
&= \int_0^{\sqrt{2}} \int_0^{\sqrt{2 - x^2}} \int_{x^2 + y^2}^{2} y\, dz\, dy\, dx \\
&= \int_0^{\sqrt{2}} \int_0^{\sqrt{2 - x^2}} [yz]_{z = x^2 + y^2}^{2}\, dy\, dx \\
&= \int_0^{\sqrt{2}} \int_0^{\sqrt{2 - x^2}} y(2 - x^2 - y^2)\, dy\, dx \\
&= \int_0^{\sqrt{2}} \left[ y^2 - \frac{x^2 y^2}{2} - \frac{y^4}{4} \right]_{y = 0}^{\sqrt{2 - x^2}} dx \\
&= \int_0^{\sqrt{2}} \left( 2 - x^2 - \frac{1}{2}x^2(2 - x^2) - \frac{1}{4}(2 - x^2)^2 \right) dx \\
&= \int_0^{\sqrt{2}} \left( \frac{1}{4}x^4 - x^2 + 1 \right) dx \\
&= \left[ \frac{x^5}{20} - \frac{x^3}{3} + x \right]_{x = 0}^{\sqrt{2}}
= \frac{\sqrt{2}}{5} - \frac{2\sqrt{2}}{3} + \sqrt{2} = \frac{8\sqrt{2}}{15}. \quad \triangle
\end{aligned}
\]
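The iterated integral above is easy to check numerically. Since the innermost z-integral evaluates exactly to y(2 − x² − y²), a two-dimensional midpoint sum suffices; the grid size and tolerance below are arbitrary choices.

```python
import math

def triple_integral(n=400):
    # ∫∫∫_W y dV with the z-integral done exactly:
    # ∫_0^{√2} ∫_0^{√(2-x²)} y (2 - x² - y²) dy dx, via midpoint sums
    total = 0.0
    hx = math.sqrt(2.0) / n
    for i in range(n):
        x = (i + 0.5) * hx
        ymax = math.sqrt(2.0 - x * x)
        hy = ymax / n
        for j in range(n):
            y = (j + 0.5) * hy
            total += y * (2.0 - x * x - y * y) * hx * hy
    return total
```

The result should agree with 8√2/15 ≈ 0.754 to within the resolution of the grid.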
Exercises.
Exercises from [FRYb, §3.5] : Q1–Q22.
Chapter 4
Change of variables
In this chapter we discuss changing variables in double and triple integrals. This is a very
useful technique and one can sometimes greatly simplify an integral by changing to more
suitable variables.
to remember the extra factor of φ′ that we need to include. We made these types of changes of variables in Example 3.6.3. Sometimes we use the change of variables formula in the other direction:
\[
\int_a^b f(\varphi(x))\,\varphi'(x)\, dx = \int_{\varphi(a)}^{\varphi(b)} f(u)\, du.
\]
For example, to compute \(\int_0^{\sqrt{\pi/2}} x \cos(x^2)\, dx\), we can make the substitution
\[
u = \varphi(x) = x^2, \qquad du = 2x\, dx,
\]
so that
\[
\int_0^{\sqrt{\pi/2}} x \cos(x^2)\, dx = \frac{1}{2}\int_0^{\pi/2} \cos u\, du = \frac{1}{2}\big[\sin u\big]_0^{\pi/2} = \frac{1}{2}.
\]
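A quick numerical check of this substitution, using a midpoint Riemann sum for the left-hand side (the number of subintervals is an arbitrary choice):

```python
import math

def lhs(n=20000):
    # Midpoint Riemann sum for ∫_0^{√(π/2)} x cos(x²) dx
    b = math.sqrt(math.pi / 2.0)
    h = b / n
    return sum((i + 0.5) * h * math.cos(((i + 0.5) * h) ** 2) for i in range(n)) * h
```

The sum should be very close to the exact value 1/2.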
In an iterated integral \(\int_a^b \int_c^d f(x, y)\, dy\, dx\), we could make such a change of variables for
x and y independently. However, we often want to change x and y simultaneously, in such
a way that the new variables u and v are functions of both x and y:
Since our definition of the double integral (Definition 3.3.1) involved dividing a region
into rectangles, the key to developing the change of variables formula in two dimensions is
seeing how the change of variables affects the area of a rectangle. In fact, it is enough see
how it affects the area of the unit square.
Recall that any linear map R2 → R2 corresponds to left multiplication by a 2 × 2 matrix.
Consider the matrix
\[
M = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in \operatorname{Mat}_{2 \times 2}(\mathbb{R}).
\]
This sends the unit square [0, 1] × [0, 1], which has vertices (0, 0), (1, 0), (0, 1), (1, 1), to a parallelogram.

[Figure: M maps the unit square to the parallelogram with sides labelled (a, b) and (c, d).]

Up to sign, the area of this parallelogram is
\[
ad - bc = \det M,
\]
the determinant of M.
In general, our change of variables may not be given by a linear map. However, we can
approximate it by a linear map. Suppose our change of variables is given by
We learned in Section 1.5 that the best linear approximation to the function T is given by
its Jacobian matrix
\[
DT = \begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[1ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{bmatrix}.
\]
The determinant of this matrix is called the Jacobian determinant, or simply the Jacobian, of T. It is denoted
\[
\frac{\partial(x, y)}{\partial(u, v)} := \det \begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[1ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{bmatrix}.
\]
In the limit defining the two-variable integral, this approximation becomes better and better.
So when we change variables in two dimensions, we need to adjust by a factor equal to the
absolute value of the Jacobian, to correct for the change in area induced by the change
of variables. Recall that a function is bijective if it is injective (one-to-one) and surjective
(onto).
Theorem 4.1.1 (Change of variables for double integrals). Suppose that D and D∗ are
elementary regions in the plane, and that
\[
dA = dx\, dy = \left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\, dv. \qquad (4.2)
\]
Example 4.1.2. Let P be the parallelogram bounded by
y = 3x, y = x, y = x + 2, y = 3x − 2.
[Figure: the parallelogram P, with vertices (0, 0), (1, 1), (1, 3), and (2, 4), bounded by the lines y = x, y = 3x, y = x + 2, and y = 3x − 2.]
We use the linear map T(u, v) = (u + v, 3u + v), whose matrix is \(\begin{bmatrix} 1 & 1 \\ 3 & 1 \end{bmatrix}\). Since the determinant of this matrix is −2, which is nonzero, T is one-to-one (as you learned in linear algebra). Thus we can use Theorem 4.1.1. We chose T so that it maps the unit square P* = [0, 1] × [0, 1] to P:
[Figure: T maps the unit square P* in the (u, v)-plane to the parallelogram P in the (x, y)-plane.]
Furthermore,
\[
\left|\frac{\partial(x, y)}{\partial(u, v)}\right| = \left|\det \begin{bmatrix} 1 & 1 \\ 3 & 1 \end{bmatrix}\right| = |-2| = 2.
\]
Thus, we have
\[
dA = dx\, dy = 2\, du\, dv.
\]
Therefore, by the change of variables formula (Theorem 4.1.1), we have
\[
\begin{aligned}
\iint_P xy\, dx\, dy &= \iint_{P^*} (u + v)(3u + v) \cdot 2\, du\, dv = 2\int_0^1 \int_0^1 (3u^2 + 4uv + v^2)\, du\, dv \\
&= 2\int_0^1 \big[u^3 + 2u^2 v + uv^2\big]_{u=0}^{1}\, dv = 2\int_0^1 (1 + 2v + v^2)\, dv = 2\left[v + v^2 + \frac{v^3}{3}\right]_{v=0}^{1} = \frac{14}{3}. \quad \triangle
\end{aligned}
\]
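Both sides of the change of variables formula can be approximated numerically: the pulled-back integral over the unit square, and the direct integral over P, cut out of a bounding box by its four bounding lines. This is only a sketch; the resolutions and tolerances are arbitrary.

```python
def pullback_side(n=400):
    # 2 ∫_0^1 ∫_0^1 (u + v)(3u + v) du dv; the factor 2 is |∂(x,y)/∂(u,v)|
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += 2.0 * (u + v) * (3.0 * u + v) * h * h
    return total

def direct_side(n=1000):
    # ∫∫_P xy dA, with P cut out of the box [0,2] x [0,4] by its bounding lines
    hx, hy = 2.0 / n, 4.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            if x <= y <= x + 2.0 and 3.0 * x - 2.0 <= y <= 3.0 * x:
                total += x * y * hx * hy
    return total
```

The direct side converges more slowly (the indicator of P is discontinuous across the boundary), which is why its tolerance is looser.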
Exercises.
4.1.1. The one-variable analogue of Theorem 4.1.1 recovers (4.1) in the following sense.
Suppose a < b, c < d and let
I = [a, b], J = [c, d]
Furthermore, suppose that φ : [c, d] → [a, b] is a C 1 bijective function. Show that the formula
\[
\int_a^b f(x)\, dx = \int_c^d f(\varphi(u))\,|\varphi'(u)|\, du
\]
recovers the single-variable change of variable formula (4.1). The subtle point here is the
presence of the absolute value above. Hint: The assumptions on φ imply that it is monotonic.
4.1.2 ([MT11, §6.2, Exercises 1, 2]). For each of the following integrals, suggest a change of
variables that will simplify the integrand. Then find the corresponding Jacobian determinant.
(a) \(\iint_R (3x + 2y)\sin(x - y)\, dA\)

(c) \(\iint_R (5x + y)^3 (x + 9y)^4\, dA\)

(d) \(\iint_R \big(x\sin(6x + 7y) - 3y\sin(6x + 7y)\big)\, dA\)
4.1.3 ([MT11, §6.2, Exercise 4]). Let D be the region 0 ≤ y ≤ x and 0 ≤ x ≤ 1. Evaluate
\[
\iint_D (x + y)\, dx\, dy
\]
4.1.4 ([MT11, §6.2, Exercise 8]). Define T (u, v) = (u2 − v 2 , 2uv). Let
D∗ = {(u, v) : u2 + v 2 ≤ 1, u ≥ 0, v ≥ 0}.
4.1.6 ([MT11, §6.2, Exercise 14]). (a) Express \(\int_0^1 \int_0^{x^2} xy\, dy\, dx\) as an integral over the triangle
\[
D^* = \{(u, v) : 0 \le u \le 1,\ 0 \le v \le u\}.
\]
Hint: Find a one-to-one mapping T of D∗ onto the given region of integration.
(b) Evaluate this integral directly and as an integral over D∗ .
[Figure: polar coordinates (r, θ) of a point in the plane.]
So we have
\[
x = r\cos\theta, \qquad y = r\sin\theta,
\]
\[
r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\frac{y}{x}.
\]
We can apply the general theory of Section 4.1. Taking T(r, θ) = (r cos θ, r sin θ), we have
\[
\frac{\partial(x, y)}{\partial(r, \theta)} = \det DT = \det \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} = r(\cos^2\theta + \sin^2\theta) = r.
\]
Note that T is not one-to-one, since it sends all (0, θ) to (0, 0). However, the change of
variables theorem is still valid. This is because the set of points where T is not one-to-one
lies on the graph of a smooth curve (in particular, it is a one-dimensional region with zero
area) and hence can be neglected for the purposes of integration.
Theorem 4.2.2 (Double integral in polar coordinates). Suppose that D is an elementary
region and that the map T of (4.3) gives a bijection from D∗ to D, except possibly for points
on the boundary of D∗ . Then
\[
\iint_D f(x, y)\, dx\, dy = \iint_{D^*} f(r\cos\theta, r\sin\theta)\, r\, dr\, d\theta.
\]
where D is the region in the first quadrant lying between the arcs of the circles x2 + y 2 = a2
and x2 + y 2 = b2 , where 0 < a < b.
[Figure: T maps the rectangle D* in the (r, θ)-plane to the quarter-annulus D in the (x, y)-plane.]

Here D* is the rectangle
\[
D^* = \{(r, \theta) : a \le r \le b,\ 0 \le \theta \le \tfrac{\pi}{2}\} = [a, b] \times \big[0, \tfrac{\pi}{2}\big].
\]
Example 4.2.4. Let’s compute the area of a sector of a circle of radius R and angle α.
[Figure: the sector D of radius R and angle α.]

In polar coordinates, D is described by
\[
0 \le r \le R, \qquad 0 \le \theta \le \alpha.
\]
So we have
\[
\operatorname{Area}(D) = \iint_D dA = \int_0^R \int_0^\alpha r\, d\theta\, dr = \alpha \int_0^R r\, dr = \alpha \left[\frac{r^2}{2}\right]_{r=0}^{R} = \frac{1}{2}\alpha R^2. \quad \triangle
\]
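The sector-area formula αR²/2 is easy to confirm numerically, both directly in polar form and by counting grid midpoints inside the sector in Cartesian coordinates. This is a sketch; the values of R and α and the grid sizes are arbitrary choices.

```python
import math

def sector_area_polar(R, alpha, n=2000):
    # α ∫_0^R r dr, computed with a midpoint sum in r
    hr = R / n
    return alpha * sum((i + 0.5) * hr for i in range(n)) * hr

def sector_area_cartesian(R, alpha, n=800):
    # Count grid midpoints of [-R, R]^2 lying in {0 <= θ <= α, r <= R}
    h = 2.0 * R / n
    area = 0.0
    for i in range(n):
        x = -R + (i + 0.5) * h
        for j in range(n):
            y = -R + (j + 0.5) * h
            if math.hypot(x, y) <= R and 0.0 <= math.atan2(y, x) <= alpha:
                area += h * h
    return area
```

The polar version is exact up to floating-point error (the midpoint rule integrates the linear function r exactly); the Cartesian count only approximates the boundary.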
Exercises.
Exercises from [FRYb, §3.2] : Q1–Q24.
[Figure: a map T sends a region in (u, v, w)-space to a region in (x, y, z)-space.]

\[
(x, y, z) = T(u, v, w)
\]
As in Section 4.1, we use the fact that the best linear approximation to T is given by the
Jacobian matrix DT . Therefore, the factor we need to add to our integral when changing
coordinates is the absolute value of the Jacobian determinant
\[
\frac{\partial(x, y, z)}{\partial(u, v, w)} := \det(DT) = \det \begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[1ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[1ex] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{bmatrix}.
\]
Theorem 4.3.1 (Change of variables for triple integrals). Suppose that W and W ∗ are
elementary regions in R3 , and that
\[
T : W^* \to W, \qquad (u, v, w) \mapsto \big(x(u, v, w),\ y(u, v, w),\ z(u, v, w)\big),
\]
is of class C 1 and is bijective, except possibly on a set that is a finite union of graphs of
functions of two variables. Then, for any integrable function f : W → R, we have
\[
\iiint_W f(x, y, z)\, dx\, dy\, dz = \iiint_{W^*} f\big(x(u, v, w), y(u, v, w), z(u, v, w)\big) \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| du\, dv\, dw.
\]
Exercises.
4.3.1 ([MT11, §6.2, Exercise 20]). Let T : R3 → R3 be defined by
T (u, v, w) = (u cos v cos w, u sin v cos w, u sin w).
(a) Show that T is onto the unit sphere; that is, every (x, y, z) with x2 + y 2 + z 2 = 1 can
be written as (x, y, z) = T (u, v, w) for some (u, v, w).
(b) Show that T is not one-to-one.
[Figure: cylindrical coordinates (r, θ, z) of a point in space.]
Cylindrical coordinates, denoted (r, θ, z), of a point (x, y, z) ∈ R3 , are defined by:
• r is the distance from (x, y, 0) to (0, 0, 0), which is the same as the distance from
(x, y, z) to the z-axis;
• θ is the angle between the positive x-axis and the line joining (x, y, 0) to (0, 0, 0);
• z is the signed distance from (x, y, z) to the xy-plane (that is, it is the same z as in
Cartesian coordinates).
In other words, (r, θ) are polar coordinates in the xy-plane, and z is the usual z.
Cartesian and cylindrical coordinates are related by:
\[
x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z,
\]
\[
r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\frac{y}{x}, \qquad z = z.
\]
The Jacobian determinant of this change of variables is
\[
\frac{\partial(x, y, z)}{\partial(r, \theta, z)} = r(\cos^2\theta + \sin^2\theta) = r.
\]
Applying Theorem 4.3.1 to this change of variables, we obtain the following result.
Theorem 4.4.1 (Triple integral in cylindrical coordinates). Suppose that W is an elementary
region in R3 and that the map T of (4.4) gives a bijection from W ∗ to W , except possibly
for points on the boundary of W ∗ . Then
\[
\iiint_W f(x, y, z)\, dx\, dy\, dz = \iiint_{W^*} f(r\cos\theta, r\sin\theta, z)\, r\, dr\, d\theta\, dz.
\]
\[
W = \{(x, y, z) : z \ge 0,\ x^2 + y^2 \le z^2 \le 1\}.
\]

[Figure: the cone W.]
Suppose the density of the cone at point (x, y, z) is equal to z. What is the mass of the cone?
W ∗ = {(r, θ, z) : 0 ≤ z ≤ 1, 0 ≤ θ ≤ 2π, 0 ≤ r ≤ z}
Thus, we have
\[
\begin{aligned}
\text{Mass} &= \iiint_W z\, dV = \iiint_{W^*} rz\, dr\, d\theta\, dz = \int_0^1 \int_0^z \int_0^{2\pi} rz\, d\theta\, dr\, dz \\
&= 2\pi \int_0^1 \int_0^z rz\, dr\, dz = 2\pi \int_0^1 \left[\frac{r^2 z}{2}\right]_{r=0}^{z} dz = \pi \int_0^1 z^3\, dz = \pi \left[\frac{z^4}{4}\right]_{z=0}^{1} = \frac{\pi}{4}. \quad \triangle
\end{aligned}
\]
Exercises.
Exercises from [FRYb, §3.6] : Q1–Q17.
[Figure: spherical coordinates (ρ, θ, φ) of a point in space.]
Spherical coordinates, denoted (ρ, θ, φ), of a point (x, y, z) ∈ R3 are defined by:
The spherical coordinate θ is the same as the cylindrical coordinate θ, while the spherical
coordinate φ is new. We generally restrict these coordinates to the ranges
ρ ≥ 0, 0 ≤ θ ≤ 2π, 0 ≤ φ ≤ π.
Example 4.5.2 (Volume of a ball). Consider the ball of radius R, centred at the origin:
B = {(x, y, z) : x2 + y 2 + z 2 ≤ R2 }.
In spherical coordinates the ball is given by
\[
B^* = \{(\rho, \theta, \varphi) : 0 \le \rho \le R,\ 0 \le \theta \le 2\pi,\ 0 \le \varphi \le \pi\}.
\]
Therefore,
\[
\begin{aligned}
\operatorname{Vol} B &= \iiint_B dx\, dy\, dz = \iiint_{B^*} \rho^2 \sin\varphi\, d\rho\, d\theta\, d\varphi = \int_0^\pi \int_0^{2\pi} \int_0^R \rho^2 \sin\varphi\, d\rho\, d\theta\, d\varphi \\
&= \int_0^\pi \int_0^{2\pi} \left[\frac{\rho^3}{3}\right]_{\rho=0}^{R} \sin\varphi\, d\theta\, d\varphi = \frac{R^3}{3} \int_0^\pi \int_0^{2\pi} \sin\varphi\, d\theta\, d\varphi = \frac{2\pi R^3}{3} \int_0^\pi \sin\varphi\, d\varphi \\
&= \frac{2\pi R^3}{3} \big[-\cos\varphi\big]_{\varphi=0}^{\pi} = \frac{4\pi R^3}{3}. \quad \triangle
\end{aligned}
\]
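The ρ and θ integrals above are elementary, so a one-dimensional midpoint sum in φ already confirms the volume formula (a sketch; the radius and grid size are arbitrary):

```python
import math

def ball_volume(R=1.0, n=1000):
    # Vol = 2π (R³/3) ∫_0^π sin φ dφ, with the φ-integral by a midpoint sum
    h = math.pi / n
    s = sum(math.sin((k + 0.5) * h) for k in range(n)) * h
    return 2.0 * math.pi * (R ** 3 / 3.0) * s
```

For R = 1 this should be close to 4π/3, and it scales like R³ as expected.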
Exercises.
Exercises from [FRYb, §3.7] : Q1–Q31.
Chapter 5
Vector fields
Many of the remaining topics in the course will involve vector fields. In this chapter we
define vector fields and discuss some important related concepts. A good reference for the
material in this chapter is [FRYe, §2.1–§2.3].
v(x, y) = 2i + j = (2, 1)
Example 5.1.6 (Gravity). Suppose an object of mass M is placed at the origin (0, 0, 0). Newton's law of gravitation states that the magnitude of the gravitational force experienced by a mass m at point \(\mathbf{r} = (x, y, z)\) is \(\frac{GMm}{r^2}\), where G is the gravitational constant and \(r = \sqrt{x^2 + y^2 + z^2}\) is the distance to the origin (in other words, r is the distance between the two objects). This force points towards the origin. Thus, since \(-\frac{\mathbf{r}}{r}\) is a unit vector pointing towards the origin, the gravitational force field for the object of mass m is given by
\[
\mathbf{F}(\mathbf{r}) = -\frac{GMm}{r^3}\,\mathbf{r}. \quad \triangle
\]
Exercises.
Exercises from [FRYd, §2.1] : Q1–Q15.
If m = 2, then we have
\[
\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j},
\]
and, if m = 3, we have
\[
\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}.
\]
We sometimes remember this by writing
\[
\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}. \qquad (5.1)
\]
Definition 5.2.1 (Conservative vector field, potential, equipotential curve and surface).
(a) A vector field F is said to be conservative if there exists a function φ such that
F = ∇φ.
In this case, φ is called a potential for F. Note that if φ is a potential for F and C ∈ R,
then φ + C is also a potential for F.
(b) If F = ∇φ is a conservative field in three dimensions with potential φ and C ∈ R, then
the set
{(x, y, z) : φ(x, y, z) = C}
is called an equipotential surface. Similarly, in two dimensions,
{(x, y) : φ(x, y) = C}
is called an equipotential curve.
Note that equipotential curves and surfaces are just level sets (see (1.15)). It follows
from Proposition 1.8.6, that if F = ∇φ is a conservative vector field, then the vector field is
orthogonal to the equipotential surface or curve.
Warning 5.2.2. Physicists use a different convention for potentials. In physics, φ is called a
potential for F when F = −∇φ. Note the minus sign.
Example 5.2.3. Recall the gravitational force field of Example 5.1.6. This force is conservative, with potential
\[
\varphi(\mathbf{r}) = \frac{GMm}{r}.
\]
Indeed, we have
\[
\begin{aligned}
\frac{\partial}{\partial x}\varphi(\mathbf{r}) &= \frac{\partial}{\partial x}\frac{GMm}{\sqrt{x^2 + y^2 + z^2}} = -\frac{1}{2}\,\frac{GMm \cdot 2x}{(x^2 + y^2 + z^2)^{3/2}} = -\frac{GMm}{r^3}\,x, \\
\frac{\partial}{\partial y}\varphi(\mathbf{r}) &= \frac{\partial}{\partial y}\frac{GMm}{\sqrt{x^2 + y^2 + z^2}} = -\frac{1}{2}\,\frac{GMm \cdot 2y}{(x^2 + y^2 + z^2)^{3/2}} = -\frac{GMm}{r^3}\,y, \\
\frac{\partial}{\partial z}\varphi(\mathbf{r}) &= \frac{\partial}{\partial z}\frac{GMm}{\sqrt{x^2 + y^2 + z^2}} = -\frac{1}{2}\,\frac{GMm \cdot 2z}{(x^2 + y^2 + z^2)^{3/2}} = -\frac{GMm}{r^3}\,z.
\end{aligned}
\]
Thus,
\[
\nabla\varphi(\mathbf{r}) = -\frac{GMm}{r^3}(x, y, z) = -\frac{GMm}{r^3}\,\mathbf{r} = \mathbf{F}(\mathbf{r}). \quad \triangle
\]
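One can also verify ∇φ = F numerically with central finite differences (a sketch: the constant GMm is set to 1, and the sample point and step size are arbitrary choices):

```python
import math

GMm = 1.0  # the constant GMm, set to 1 for this check

def phi(x, y, z):
    # Potential φ(r) = GMm / r
    return GMm / math.sqrt(x * x + y * y + z * z)

def F(x, y, z):
    # Gravitational field F(r) = -(GMm / r³) r
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-GMm * x / r3, -GMm * y / r3, -GMm * z / r3)

def grad_phi(x, y, z, h=1e-6):
    # Central-difference approximation of the gradient of φ
    return ((phi(x + h, y, z) - phi(x - h, y, z)) / (2 * h),
            (phi(x, y + h, z) - phi(x, y - h, z)) / (2 * h),
            (phi(x, y, z + h) - phi(x, y, z - h)) / (2 * h))
```

At (1, 2, 2), for example, both ∇φ and F should equal (−1/27, −2/27, −2/27) up to the finite-difference error.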
F(x, y) = xi − yj.
Let’s try to show this field is conservative and find a potential. A potential φ would have to
satisfy
∂φ ∂φ
(x, y) = x and (x, y) = −y. (5.2)
∂x ∂y
In order to satisfy the first equation in (5.2), we would need to have
x2
φ(x, y) = + ψ(y),
2
where ψ(y) is a function of y only. For this to also satisfy the second equation in (5.2), we
need
∂ x2
∂φ
−y = (x, y) = + ψ(u) = ψ ′ (y).
∂y ∂y 2
This holds if and only if
y2
ψ(y) = − + C, C ∈ R.
2
Therefore,
x2 y 2
φ(x, y) = − +C
2 2
is a potential for any C ∈ R. △
Let’s prove by contradiction that v is not conservative. Thus, we begin by assuming that v
is conservative (and then try to arrive at a contradiction). So v = ∇φ for some differentiable
function φ : R² → R. Thus we have
\[
\frac{\partial\varphi}{\partial x}(x, y) = -y \qquad \text{and} \qquad \frac{\partial\varphi}{\partial y}(x, y) = x.
\]
Proceeding as in Example 5.2.4, we have
Exercises.
Exercises are deferred until Section 5.3.
5.3 Curl
We now introduce the important notion of the curl of a vector field. In addition to being
important in later chapters, we will see that curl gives an important test for conservative
vector fields.
Definition 5.3.1 (Curl). The curl of a vector field F(x, y, z) is defined by
\[
\operatorname{curl} \mathbf{F} = \nabla \times \mathbf{F} = \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\mathbf{i} + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)\mathbf{j} + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\mathbf{k}
= \det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{bmatrix}.
\]
Note that curl only applies to vector fields in three dimensions. However, any vector field v(x, y) = (v₁(x, y), v₂(x, y)) in two dimensions can be extended to a vector field in three dimensions by setting
\[
\mathbf{v}(x, y, z) = \big(v_1(x, y),\ v_2(x, y),\ 0\big). \qquad (5.3)
\]
\[
\mathbf{F}(\mathbf{r}) = -\frac{GMm}{r^3}\,\mathbf{r} = -\frac{GMm}{(x^2 + y^2 + z^2)^{3/2}}\,(x, y, z).
\]
We have
\[
\frac{\partial F_1}{\partial y} = \frac{3GMm\,xy}{(x^2 + y^2 + z^2)^{5/2}}, \qquad \frac{\partial F_1}{\partial z} = \frac{3GMm\,xz}{(x^2 + y^2 + z^2)^{5/2}},
\]
\[
\frac{\partial v_1}{\partial y} = -1 \qquad \text{and} \qquad \frac{\partial v_2}{\partial x} = 1.
\]
Therefore
\[
\nabla \times \mathbf{v} = 2\mathbf{k}. \quad \triangle
\]
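The curl computation can be confirmed numerically with central differences (a sketch; the sample point and step size are arbitrary choices):

```python
def partial(F, i, j, p, h=1e-6):
    # ∂F_i/∂x_j at the point p, by central differences
    pp, pm = list(p), list(p)
    pp[j] += h
    pm[j] -= h
    return (F(*pp)[i] - F(*pm)[i]) / (2.0 * h)

def curl(F, p):
    # (∂F3/∂y - ∂F2/∂z, ∂F1/∂z - ∂F3/∂x, ∂F2/∂x - ∂F1/∂y)
    return (partial(F, 2, 1, p) - partial(F, 1, 2, p),
            partial(F, 0, 2, p) - partial(F, 2, 0, p),
            partial(F, 1, 0, p) - partial(F, 0, 1, p))

# The field v(x, y, z) = -y i + x j, extended to three dimensions
v = lambda x, y, z: (-y, x, 0.0)
```

At any point, the numeric curl of v should come out as (0, 0, 2), matching ∇ × v = 2k.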
Theorem 5.3.3 (Test for conservative vector fields). (a) Suppose that F(x, y) is a C² vector field in the plane. If F is conservative, then
\[
\frac{\partial F_1}{\partial y} = \frac{\partial F_2}{\partial x}.
\]
(b) Suppose that F(x, y, z) is a C² vector field in space. If F is conservative, then
\[
\nabla \times \mathbf{F} = \mathbf{0}.
\]
Note that, using (5.3), both cases of Theorem 5.3.3 involve the condition ∇ × F = 0.
Proof. We give the proof of (b), since the proof of (a) is similar and easier (and can be
found in [FRYe, Th. 2.3.9].) Suppose F is conservative. Then there exists a potential φ with
F = ∇φ. In other words, we have
\[
\frac{\partial\varphi}{\partial x} = F_1, \qquad \frac{\partial\varphi}{\partial y} = F_2, \qquad \frac{\partial\varphi}{\partial z} = F_3.
\]
Then, since mixed partial derivatives commute,
\[
\begin{aligned}
\frac{\partial F_3}{\partial y} &= \frac{\partial^2\varphi}{\partial y\,\partial z} = \frac{\partial^2\varphi}{\partial z\,\partial y} = \frac{\partial F_2}{\partial z}, \\
\frac{\partial F_1}{\partial z} &= \frac{\partial^2\varphi}{\partial z\,\partial x} = \frac{\partial^2\varphi}{\partial x\,\partial z} = \frac{\partial F_3}{\partial x}, \\
\frac{\partial F_2}{\partial x} &= \frac{\partial^2\varphi}{\partial x\,\partial y} = \frac{\partial^2\varphi}{\partial y\,\partial x} = \frac{\partial F_1}{\partial y}.
\end{aligned}
\]
Hence ∇ × F = 0.
Example 5.3.4. In Example 5.3.2(b), we saw that the vector field v(x, y, z) = −y i + x j has
\[
\frac{\partial v_1}{\partial y} = -1 \ne 1 = \frac{\partial v_2}{\partial x}.
\]
Thus, by Theorem 5.3.3(a), we can conclude that v is not conservative. We also proved this
directly in Example 5.2.5. △
Warning 5.3.5. The converse of Theorem 5.3.3 is false! If ∇ × F ̸= 0, then you can use
Theorem 5.3.3 to conclude that F is not conservative. However, if ∇ × F = 0, you cannot
conclude that F is conservative without doing more work.
Then
\[
\frac{\partial v_1}{\partial y} = \frac{y^2 - x^2}{(x^2 + y^2)^2} = \frac{\partial v_2}{\partial x}.
\]
However, v is not conservative. We will prove this in Example 6.4.4. △
We’d like to determine whether or not this vector field is conservative and, if it is, find a
potential.
We first apply the test for conservative vector fields (Theorem 5.3.3). We compute
\[
\nabla \times \mathbf{F} = \det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ y^2 + 2xz^2 - 1 & (2x + 1)y & 2x^2 z + z^3 \end{bmatrix}
= 0\,\mathbf{i} - (4xz - 4xz)\,\mathbf{j} + (2y - 2y)\,\mathbf{k} = \mathbf{0}.
\]
So F passes the test. However, we must remember (Warning 5.3.5) that this, by itself, does
not imply that F is conservative.
Let’s try to find a potential φ. We need
∂φ
(x, y, z) = y 2 + 2xz 2 − 1, (5.4)
∂x
∂φ
(x, y, z) = (2x + 1)y, (5.5)
∂y
∂φ
(x, y, z) = 2x2 z + z 3 . (5.6)
∂z
From (5.4), we have
φ(x, y, z) = xy 2 + x2 z 2 − x + ψ(y, z).
This satisfies (5.5) if and only if
\[
\begin{aligned}
&\frac{\partial}{\partial y}\big(xy^2 + x^2 z^2 - x + \psi(y, z)\big) = (2x + 1)y \\
\iff\ & 2xy + \frac{\partial\psi}{\partial y}(y, z) = (2x + 1)y \\
\iff\ & \frac{\partial\psi}{\partial y}(y, z) = y \\
\iff\ & \psi(y, z) = \frac{y^2}{2} + \zeta(z).
\end{aligned}
\]
Finally, this satisfies (5.6) if and only if
\[
\begin{aligned}
&\frac{\partial}{\partial z}\left(xy^2 + x^2 z^2 - x + \frac{y^2}{2} + \zeta(z)\right) = 2x^2 z + z^3 \\
\iff\ & 2x^2 z + \zeta'(z) = 2x^2 z + z^3 \\
\iff\ & \zeta(z) = \frac{z^4}{4} + C,
\end{aligned}
\]
for some constant C ∈ ℝ. Thus, one possible potential (the one with C = 0) is
\[
\varphi(x, y, z) = xy^2 + x^2 z^2 - x + \frac{y^2}{2} + \frac{z^4}{4}.
\]
We can double-check our computations by computing the gradient:
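A numerical double-check is also easy: compare central-difference partials of φ against the components of F at a few sample points (a sketch; the points, step size, and tolerance are arbitrary choices).

```python
def phi(x, y, z):
    # The candidate potential found above
    return x * y ** 2 + x ** 2 * z ** 2 - x + y ** 2 / 2.0 + z ** 4 / 4.0

def F(x, y, z):
    # The given vector field
    return (y ** 2 + 2 * x * z ** 2 - 1.0, (2 * x + 1.0) * y, 2 * x ** 2 * z + z ** 3)

def grad(f, x, y, z, h=1e-5):
    # Central-difference gradient of a scalar function
    return ((f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h))
```

The numeric gradient of φ should match F componentwise at every test point.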
Exercises.
Exercises from [FRYd, §2.3] : Q2–Q11.
Chapter 6

Path and line integrals
In this chapter we will see how to integrate along curves. There are two types of such
integrals. We can integrate a scalar field or a vector field. The first gives what is called a
path integral, while the second is called a line integral. A good reference for the material in
this chapter is [FRYe, Ch. 1 and §2.4].
r : [a, b] → Rn ,
where a, b ∈ R, a ≤ b. (In Section 1.7, we used the notation c(t). We will now start to
use boldface notation, such as r(t), since we want to emphasize the vector nature of these
paths.) Throughout this chapter, we will assume that paths are continuous. A curve is the
image of a path:
C = {r(t) : a ≤ t ≤ b}.
If r(t) is injective (one-to-one), we say that r(t) is a parameterization of the curve C. We
also allow the possibility that r(a) = r(b), in which case we say that C is a closed curve.
is piecewise C¹.

[Figure: the piecewise-smooth curve of the example.]

In fact, it is piecewise C^k for all k (hence piecewise C^∞). △
Example 6.1.3. A curve can have many different parameterizations. For instance,
r : [0, 1] → R3 , r(t) = (t2 , 1 + t, 1 − t)
and
\[
\mathbf{s} : \big[0, \tfrac{1}{2}\big] \to \mathbb{R}^3, \qquad \mathbf{s}(t) = \big((1 - 2t)^2,\ 2 - 2t,\ 2t\big)
\]
are two different parameterizations of the same curve. △
Suppose that we have a curve C that is parameterized as r(t), a ≤ t ≤ b. Suppose
furthermore that this curve is actually a wire whose density at the point r is ρ(r). Let’s try
to compute the mass of this wire.
We divide the interval [a, b] into n equal subintervals, each of length
\[
\Delta t = \frac{b - a}{n}.
\]
Let \(t_i = a + i\,\Delta t\)
denote the endpoint of the i-th interval. We approximate the length of the part of the curve
between r(ti−1 ) and r(ti ) by ∥r(ti ) − r(ti−1 )∥ and the mass of this part of the curve by
ρ(r(ti ))∥r(ti ) − r(ti−1 )∥.
[Figure: the curve approximated by the polygonal path through the points r(tᵢ).]
We then take the limit as n → ∞. If r(t) is piecewise C¹ and ρ(r) is continuous, we get
\[
\text{Mass of } C = \int_a^b \rho(\mathbf{r}(t)) \left\|\frac{d\mathbf{r}}{dt}(t)\right\| dt.
\]
Definition 6.1.4 (Path integral, integral along a curve). Suppose C is a curve parameterized by
\[
\mathbf{r} : [a, b] \to \mathbb{R}^n, \qquad a \le t \le b,
\]
and f : A → R is a function, with C ⊆ A ⊆ Rⁿ. Then we define
\[
\int_C f\, ds := \int_a^b f(\mathbf{r}(t))\, \|\mathbf{r}'(t)\|\, dt. \qquad (6.1)
\]
In this notation, C specifies the curve, and ds denotes arc length. The integral (6.1) is called a path integral.
If r is piecewise C¹ and f is piecewise continuous, then the integral (6.1) always exists. Note that the notation \(\int_C f\, ds\) only involves the curve C, and not the parameterizing path r. We will justify this in Section 6.3.
Definition 6.1.5 (Length of a curve). Suppose C is a curve parameterized by a piecewise C¹ path r : [a, b] → Rⁿ. Then the length of C is
\[
\operatorname{Length}(C) := \int_C ds = \int_a^b \|\mathbf{r}'(t)\|\, dt.
\]
[Figure: the helical wire of the example.]
Furthermore, suppose the density at the point (x, y, z) of the wire is equal to z. Let’s compute
the length and the mass of the wire.
We have
\[
\mathbf{r}'(t) = (-a\sin t,\ a\cos t,\ b),
\]
and so
\[
\|\mathbf{r}'(t)\| = \sqrt{a^2\sin^2 t + a^2\cos^2 t + b^2} = \sqrt{a^2 + b^2}.
\]
Thus, the length of the wire is
\[
\int_C ds = \int_a^b \|\mathbf{r}'(t)\|\, dt = \int_a^b \sqrt{a^2 + b^2}\, dt = (b - a)\sqrt{a^2 + b^2}.
\]
Definition 6.1.7 (Average value along a curve). The average value of a function f on a curve C is
\[
\frac{\int_C f\, ds}{\operatorname{Length}(C)}.
\]
Example 6.1.8. Consider
\[
\mathbf{r}(t) = (\sqrt{3}\,t^2,\ t^3 - t), \qquad t \in [-1, 1].
\]
Let's find the average x-value on the corresponding curve C. We have
\[
\begin{aligned}
\mathbf{r}'(t) &= (2\sqrt{3}\,t,\ 3t^2 - 1), \\
\|\mathbf{r}'(t)\|^2 &= (2\sqrt{3}\,t)^2 + (3t^2 - 1)^2 = 12t^2 + 9t^4 - 6t^2 + 1 = 9t^4 + 6t^2 + 1 = (3t^2 + 1)^2, \\
\|\mathbf{r}'(t)\| &= 3t^2 + 1.
\end{aligned}
\]
Therefore,
\[
\operatorname{Length}(C) = \int_C ds = \int_{-1}^{1} (3t^2 + 1)\, dt = \big[t^3 + t\big]_{t=-1}^{1} = 4
\]
and
\[
\int_C x\, ds = \int_{-1}^{1} \sqrt{3}\,t^2(3t^2 + 1)\, dt = \sqrt{3}\int_{-1}^{1} (3t^4 + t^2)\, dt = \sqrt{3}\left[\frac{3t^5}{5} + \frac{t^3}{3}\right]_{t=-1}^{1} = \sqrt{3}\left(\frac{6}{5} + \frac{2}{3}\right) = \frac{28\sqrt{3}}{15}.
\]
So the average x-value is \(\frac{28\sqrt{3}/15}{4} = \frac{7\sqrt{3}}{15}\). △
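The length and the average x-value can be confirmed by discretizing the defining integral \(\int_C f\, ds = \int_a^b f(\mathbf{r}(t))\,\|\mathbf{r}'(t)\|\, dt\) (a sketch; the number of sample points is an arbitrary choice):

```python
import math

def path_integral(f, r, rp, a, b, n=20000):
    # Midpoint approximation of ∫_a^b f(r(t)) ||r'(t)|| dt
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x, y = r(t)
        dx, dy = rp(t)
        total += f(x, y) * math.hypot(dx, dy) * h
    return total

# The path of Example 6.1.8 and its derivative
r = lambda t: (math.sqrt(3.0) * t * t, t ** 3 - t)
rp = lambda t: (2.0 * math.sqrt(3.0) * t, 3.0 * t * t - 1.0)

length = path_integral(lambda x, y: 1.0, r, rp, -1.0, 1.0)
avg_x = path_integral(lambda x, y: x, r, rp, -1.0, 1.0) / length
```

The two values should agree with 4 and 7√3/15 ≈ 0.808 respectively.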
We conclude this section with another intuitive interpretation of path integrals. Suppose r : [a, b] → R² is a path with image C and f : C → R is a function taking nonnegative values, so f(r(t)) ≥ 0 for all t ∈ [a, b]. Then the path integral \(\int_C f\, ds\) is the surface area of a fence along the curve C whose height at the point r(t) is given by f(r(t)).

[Figure: a fence of varying height built over the curve C.]
Exercises.
Exercises from [FRYd, §1.6] : Q1–Q11.
If r(a) = r(b), then the notation \(\oint_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s}\) is also used. If n = 3 and F = (F₁, F₂, F₃), we also write
\[
\int_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s} = \int_{\mathbf{r}} (F_1\, dx + F_2\, dy + F_3\, dz).
\]
If n = 2 and F = (F₁, F₂), we sometimes write
\[
\int_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s} = \int_{\mathbf{r}} (F_1\, dx + F_2\, dy).
\]
The notation F · ds in Definition 6.2.1 is meant to remind you of the dot product. If
\[
d\mathbf{s} = (dx, dy, dz)
\]
is an infinitesimal displacement, then F · ds = F₁ dx + F₂ dy + F₃ dz.
Definition 6.2.3 (Work). If F(t) denotes a force moving a particle whose position is given by r(t), with t indicating time, then \(\int_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s}\) is the work done by the force.
Example 6.2.4. Recall from Example 5.1.6 that the gravitational force field for an object of mass m in the presence of a mass M at the origin is given by
\[
\mathbf{F}(\mathbf{r}) = -\frac{GMm}{r^3}\,\mathbf{r},
\]
where r = (x, y, z) is the position of the object of mass m. Suppose this object moves along a path
\[
\mathbf{r} : [a, b] \to \mathbb{R}^3.
\]
What work is done by gravity? By Definition 6.2.3,
\[
\text{Work} = \int_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt = -\int_a^b \frac{GMm}{\|\mathbf{r}(t)\|^3}\, \mathbf{r}(t) \cdot \mathbf{r}'(t)\, dt.
\]
Let’s make the substitution
so that
du = 2x(t)x′ (t) dt +2y(t)y ′ (t) dt +2z(t)z ′ (t) dt = 2r(t) · r′ (t) dt .
Thus
Z ∥r(b)∥2 ∥r(b)∥2
GM m GM m GM m GM m GM m GM m
Work = − 3/2
du = √ = − = − ,
∥r(a)∥2 2u u u=∥r(a)∥2 ∥r(b)∥ ∥r(a)∥ rend rstart
where rstart is the distance from the origin at the start of the path and rend is the distance
from the origin at the end of the path. Note that the total work done only depends on the
endpoints of the path, and not on the route taken in between or the time taken to travel
between the points. We will explain this phenomenon in Example 6.4.2. △
Exercises.
Exercises are deferred until Section 6.5.
6.3 Reparameterization
Our definitions of path integrals (Definition 6.1.4) and line integrals (Definition 6.2.1) both
involve a parameterization r(t) of a curve C. It is natural to ask how these integrals depend
on the parameterization. We now examine this question.
Suppose
r : [a, b] → Rn and r̃ : [c, d] → Rn
are two parameterizations of the same curve. In other words, they have the same image C.
are two parameterizations of the same curve. In other words, they have the same image C. Since parameterizations are invertible, we have the composite function
\[
g := \mathbf{r}^{-1} \circ \tilde{\mathbf{r}} : [c, d] \to [a, b]. \qquad (6.2)
\]
Thus
\[
\tilde{\mathbf{r}}(u) = \mathbf{r}(g(u)), \qquad u \in [c, d].
\]
We call r̃ a reparameterization of r.
Since we assume that r and r̃ are continuous, the map g is a continuous bijection. There-
fore it is either increasing or decreasing. If g is increasing, we say that r and r̃ have the same
orientation, and that the reparameterization is orientation-preserving. If g is decreasing, we
say that r and r̃ have opposite orientation, and that the reparameterization is orientation-
reversing.
Theorem 6.3.1 (Reparameterization for path integrals). Under the above hypotheses,
\[
\int_c^d f(\tilde{\mathbf{r}}(u))\, \|\tilde{\mathbf{r}}'(u)\|\, du = \int_a^b f(\mathbf{r}(t))\, \|\mathbf{r}'(t)\|\, dt.
\]
Proof. This follows from the change of variables formula. Setting t = g(u), we have
\[
dt = g'(u)\, du, \qquad \tilde{\mathbf{r}}'(u) = \frac{d}{du}\,\mathbf{r}(g(u)) = \mathbf{r}'(g(u))\, g'(u).
\]
If g is increasing, we have g'(u) ≥ 0 for u ∈ [c, d], g(c) = a, and g(d) = b. Thus,
\[
\|\tilde{\mathbf{r}}'(u)\| = \|\mathbf{r}'(g(u))\|\, g'(u).
\]
Therefore,
\[
\int_a^b f(\mathbf{r}(t))\, \|\mathbf{r}'(t)\|\, dt = \int_c^d f(\mathbf{r}(g(u)))\, \|\mathbf{r}'(g(u))\|\, g'(u)\, du = \int_c^d f(\tilde{\mathbf{r}}(u))\, \|\tilde{\mathbf{r}}'(u)\|\, du.
\]
Theorem 6.3.2 (Reparameterization for line integrals). Suppose F is a vector field on the
piecewise C 1 path r : [a, b] → Rn , and let r̃ : [c, d] → Rn be a reparameterization of r. If r̃ is
orientation-preserving, then
\[
\int_{\tilde{\mathbf{r}}} \mathbf{F} \cdot d\mathbf{s} = \int_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s},
\]
and if r̃ is orientation-reversing, then
\[
\int_{\tilde{\mathbf{r}}} \mathbf{F} \cdot d\mathbf{s} = -\int_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s}.
\]
Proof. As in the proof of Theorem 6.3.1, we define g as in (6.2) and set t = g(u). Then
\[
dt = g'(u)\, du, \qquad \tilde{\mathbf{r}}'(u) = \frac{d}{du}\,\mathbf{r}(g(u)) = \mathbf{r}'(g(u))\, g'(u).
\]
Then
\[
\begin{aligned}
\int_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s}
&= \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt \\
&= \int_{g^{-1}(a)}^{g^{-1}(b)} \mathbf{F}(\mathbf{r}(g(u))) \cdot \mathbf{r}'(g(u))\, g'(u)\, du \\
&= \int_{g^{-1}(a)}^{g^{-1}(b)} \mathbf{F}(\tilde{\mathbf{r}}(u)) \cdot \tilde{\mathbf{r}}'(u)\, du \\
&= \begin{cases} \int_c^d \mathbf{F}(\tilde{\mathbf{r}}(u)) \cdot \tilde{\mathbf{r}}'(u)\, du & \text{if } \tilde{\mathbf{r}} \text{ is orientation-preserving}, \\ \int_d^c \mathbf{F}(\tilde{\mathbf{r}}(u)) \cdot \tilde{\mathbf{r}}'(u)\, du & \text{if } \tilde{\mathbf{r}} \text{ is orientation-reversing} \end{cases} \\
&= \begin{cases} \int_{\tilde{\mathbf{r}}} \mathbf{F} \cdot d\mathbf{s} & \text{if } \tilde{\mathbf{r}} \text{ is orientation-preserving}, \\ -\int_{\tilde{\mathbf{r}}} \mathbf{F} \cdot d\mathbf{s} & \text{if } \tilde{\mathbf{r}} \text{ is orientation-reversing}. \end{cases}
\end{aligned}
\]
Theorem 6.3.2 states that line integrals are independent of the parameterization of a curve, except that reversing orientation introduces a negative sign. It thus makes sense to speak of an oriented curve, which is, by definition, a curve C together with a choice of start point P₀ and endpoint P₁. We indicate the orientation of a curve by drawing an arrow along the curve, in the direction from P₀ to P₁.

[Figure: an oriented curve from P₀ to P₁.]
Exercises.
6.3.1. Complete the proof of Theorem 6.3.1 by treating the case where g is decreasing.
Proof. We give the proof for the three-dimensional setting; the proof in two dimensions is almost identical. Let
\[
\mathbf{r}(t) = (x(t), y(t), z(t)), \qquad a \le t \le b,
\]
be a parameterization of C, with r(a) = P₀ and r(b) = P₁. Then,
\[
\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{s}
&= \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt \\
&= \int_a^b \nabla\varphi(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt \\
&= \int_a^b \left(\frac{\partial\varphi}{\partial x}(x(t), y(t), z(t))\,\frac{dx}{dt}(t) + \frac{\partial\varphi}{\partial y}(x(t), y(t), z(t))\,\frac{dy}{dt}(t) + \frac{\partial\varphi}{\partial z}(x(t), y(t), z(t))\,\frac{dz}{dt}(t)\right) dt \\
&= \int_a^b \frac{d}{dt}\big(\varphi(x(t), y(t), z(t))\big)\, dt && \text{(by the chain rule)} \\
&= \varphi(\mathbf{r}(b)) - \varphi(\mathbf{r}(a)) && \text{(by the fundamental theorem of calculus)} \\
&= \varphi(P_1) - \varphi(P_0).
\end{aligned}
\]
Provided we are integrating a conservative vector field (whose potential we can find),
Theorem 6.4.1 gives us an extremely easy way to compute line integrals. In one dimension,
Theorem 6.4.1 reduces to the usual fundamental theorem of calculus.
Example 6.4.2. Recall from Example 5.2.3 that the gravitational force field F is conservative, with potential
\[
\varphi(\mathbf{r}) = \frac{GMm}{r}, \qquad \text{where } r = \|\mathbf{r}\|.
\]
Thus, the work done by gravity on a particle moving along a curve C, starting at the point (x₀, y₀, z₀) and ending at the point (x₁, y₁, z₁), is
\[
\int_C \mathbf{F} \cdot d\mathbf{s} = \frac{GMm}{\sqrt{x_1^2 + y_1^2 + z_1^2}} - \frac{GMm}{\sqrt{x_0^2 + y_0^2 + z_0^2}}.
\]
The work does not depend on the curve, only its starting point and endpoint. Furthermore,
it does not depend on how long it takes the particle to travel between these two points. This
agrees with our computation in Example 6.2.4. △
Consider two curves, both starting at P₀ = (0, 0) and ending at P₁ = (1, 1).

[Figure: the two curves C₁ and C₂ from P₀ to P₁.]
(a) We parameterize C₁ by r(t) = (t, t), t ∈ [0, 1]. Then
\[
\mathbf{F}(\mathbf{r}(t)) = t^2\,\mathbf{i} + (t^2 + 1)\,\mathbf{j} \qquad \text{and} \qquad \mathbf{r}'(t) = \mathbf{i} + \mathbf{j},
\]
and so
\[
\int_{C_1} \mathbf{F} \cdot d\mathbf{s} = \int_0^1 \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt = \int_0^1 \big(t^2\,\mathbf{i} + (t^2 + 1)\,\mathbf{j}\big) \cdot (\mathbf{i} + \mathbf{j})\, dt = \int_0^1 (2t^2 + 1)\, dt = \left[\frac{2t^3}{3} + t\right]_{t=0}^{1} = \frac{5}{3}.
\]
(b) We parameterize C2 with a piecewise C 1 path. The curve C2 is the union of the
curves C3 and C4 , where C3 is the straight line from (0, 0) to (0, 1) and C4 is the straight
line from (0, 1) to (1, 1). We parameterize C₃ by
\[
\mathbf{r}(t) = t\,\mathbf{j}, \qquad t \in [0, 1],
\]
and C₄ by
\[
\mathbf{p}(t) = t\,\mathbf{i} + \mathbf{j}, \qquad t \in [0, 1].
\]
Then
\[
\begin{aligned}
\int_{C_2} \mathbf{F} \cdot d\mathbf{s}
&= \int_{C_3} \mathbf{F} \cdot d\mathbf{s} + \int_{C_4} \mathbf{F} \cdot d\mathbf{s} \\
&= \int_0^1 \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt + \int_0^1 \mathbf{F}(\mathbf{p}(t)) \cdot \mathbf{p}'(t)\, dt \\
&= \int_0^1 (t^2 + 1)\,\mathbf{j} \cdot \mathbf{j}\, dt + \int_0^1 (t\,\mathbf{i} + 2\,\mathbf{j}) \cdot \mathbf{i}\, dt \\
&= \int_0^1 (t^2 + 1)\, dt + \int_0^1 t\, dt \\
&= \left[\frac{t^3}{3} + t\right]_{t=0}^{1} + \left[\frac{t^2}{2}\right]_{t=0}^{1} = \frac{11}{6}.
\end{aligned}
\]
We see that the integrals along C1 and C2 are different, even though they have the same
starting point and endpoint. (See [FRYe, Example 2.4.3] for integrals along some additional
curves.) Using Theorem 6.4.1, this implies that F is not a conservative vector field. △
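The two values 5/3 and 11/6 can be reproduced numerically. The computations above are consistent with the field F(x, y) = xy i + (y² + 1) j, whose definition appears earlier in the notes; that explicit formula is an assumption here.

```python
def F(x, y):
    # Assumed form of the field, consistent with the computations above
    return (x * y, y * y + 1.0)

def line_integral(r, rp, a, b, n=20000):
    # Midpoint approximation of ∫ F(r(t)) · r'(t) dt
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        fx, fy = F(*r(t))
        dx, dy = rp(t)
        total += (fx * dx + fy * dy) * h
    return total

# C1: the straight line from (0, 0) to (1, 1)
I1 = line_integral(lambda t: (t, t), lambda t: (1.0, 1.0), 0.0, 1.0)
# C2: up from (0, 0) to (0, 1), then right to (1, 1)
I2 = (line_integral(lambda t: (0.0, t), lambda t: (0.0, 1.0), 0.0, 1.0)
      + line_integral(lambda t: (t, 1.0), lambda t: (1.0, 0.0), 0.0, 1.0))
```

The two integrals differ even though the curves share endpoints, which is exactly the obstruction to F being conservative.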
We claimed in Example 5.3.6 that this vector field is not conservative, even though it is curl free (that is, ∇ × F = 0). One way to prove it is not conservative is to show that the integrals along two curves with the same starting point and endpoint are different.

Let
\[
P_0 = (-1, 0), \qquad P_1 = (1, 0),
\]
\[
\mathbf{r}(t) = (-\cos t, \sin t), \qquad t \in [0, \pi],
\]
\[
\mathbf{p}(t) = (-\cos t, -\sin t), \qquad t \in [0, \pi].
\]
[Figure: the upper semicircle r and the lower semicircle p, both running from P₀ to P₁.]
Then we have
\[
\int_{\mathbf{r}} \mathbf{F} \cdot d\mathbf{s} = \int_0^\pi \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt = \int_0^\pi (-\sin t, -\cos t) \cdot (\sin t, \cos t)\, dt = \int_0^\pi (-\sin^2 t - \cos^2 t)\, dt = \int_0^\pi (-1)\, dt = -\pi
\]
and
\[
\int_{\mathbf{p}} \mathbf{F} \cdot d\mathbf{s} = \int_0^\pi \mathbf{F}(\mathbf{p}(t)) \cdot \mathbf{p}'(t)\, dt = \int_0^\pi (\sin t, -\cos t) \cdot (\sin t, -\cos t)\, dt = \int_0^\pi (\sin^2 t + \cos^2 t)\, dt = \int_0^\pi dt = \pi.
\]
Since the two integrals are different, the vector field F cannot be conservative. △
Remark 6.4.5. Note that the vector field F of Example 6.4.4 is defined on R2 \ {(0, 0)}, and
that the paths C1 and C2 go on opposite sides of the origin. In a certain sense F is the only
vector field defined on R2 \ {(0, 0)} that has zero curl but is not conservative. More precisely,
any other such vector field is a scalar multiple of F plus a conservative vector field. This is
related to an advanced mathematical theory known as de Rham cohomology.
Exercises.
Exercises are deferred until Section 6.5.
Theorem 6.5.1. Suppose that F is a vector field that is defined and continuous on all of R2
or R3 . Then the following statements are equivalent:
(a) F is conservative.
(b) For any closed curve C, we have \(\oint_C \mathbf{F} \cdot d\mathbf{s} = 0\).

(c) Line integrals of F are path independent: for any two curves C₁ and C₂ with the same starting point and the same endpoint, \(\int_{C_1} \mathbf{F} \cdot d\mathbf{s} = \int_{C_2} \mathbf{F} \cdot d\mathbf{s}\).
Proof. (a) ⟹ (b): Suppose (a) is true. Thus, there exists a function φ such that F = ∇φ. Then, if C is a closed curve that starts and ends at P₀, we have, by Theorem 6.4.1,
\[
\oint_C \mathbf{F} \cdot d\mathbf{s} = \varphi(P_0) - \varphi(P_0) = 0.
\]
(b) ⟹ (c): Suppose (b) is true. Let C₁, C₂ be paths that start at P₀ and end at P₁. Define C̃₂ to be the opposite of the path C₂ (that is, we follow C₂ in the opposite direction). Then let C be the path C₁ followed by the path C̃₂. Thus C is a closed path starting and ending at P₀. Then, using (b) and Theorem 6.3.2, we have
\[
0 = \oint_C \mathbf{F} \cdot d\mathbf{s} = \int_{C_1} \mathbf{F} \cdot d\mathbf{s} + \int_{\tilde{C}_2} \mathbf{F} \cdot d\mathbf{s} = \int_{C_1} \mathbf{F} \cdot d\mathbf{s} - \int_{C_2} \mathbf{F} \cdot d\mathbf{s}.
\]
Hence \(\int_{C_1} \mathbf{F} \cdot d\mathbf{s} = \int_{C_2} \mathbf{F} \cdot d\mathbf{s}\).
(c) =⇒ (a): Suppose (c) is true. We want to find a function φ such that F = ∇φ. By
Theorem 6.4.1, such a function would have to satisfy
Z
F · ds = φ(x) − φ(0)
C
for any path C from 0 to x. Since adding a constant to a potential yields another potential,
we can assume that φ(0) = 0. So let’s define φ by
Z
φ(x) = F · ds,
C
where C is any path from 0 to x. Since we’re assuming (c) is true, this definition of φ(x)
does not depend on the path C we choose.
Now we want to show that F = ∇φ. Fix a point x and any curve Cx that starts at 0
and ends at x. Then, for any vector u, let Du be the curve with parameterization
ru (t) = x + tu, 0 ≤ t ≤ 1.
For s ∈ R, let Cx + Dsu be the curve that first follows Cx from 0 to x and then follows Dsu
from x to x + su. Then
φ(x + su) = ∫_{Cx + Dsu} F · ds
          = ∫_{Cx} F · ds + ∫_{Dsu} F · ds
          = ∫_{Cx} F · ds + ∫_0^1 F(x + tsu) · (su) dt,

since r′_{su} = su. In the second integral, make the change of variables v = ts, dv = s dt. Then

φ(x + su) = ∫_{Cx} F · ds + ∫_0^s F(x + vu) · u dv.
The first integral is independent of s. Using the fundamental theorem of calculus for the second integral, we have

d/ds φ(x + su)|_{s=0} = F(x + su) · u|_{s=0} = F(x) · u.

Taking u to be each of the standard basis vectors shows that ∇φ(x) = F(x), as desired.
Theorem 6.5.2. Let F be a vector field that is defined and C 1 on all of R2 or R3 . Then F
is conservative if and only if ∇ × F = 0 (i.e. F is curl free).
Proof. We’ll give the proof for the R3 case. The proof for the R2 case is very similar and can
be found in [FRYe, Th. 2.4.8]. We’ve already seen in Theorem 5.3.3 that every conservative
vector field is curl free. It remains to show the reverse implication.
Suppose F = (F1 , F2 , F3 ) is a vector field that is defined and C 1 on all of R3 . Furthermore,
suppose that ∇ × F = 0. We want to find a function φ such that F = ∇φ, that is,

∂φ/∂x (x, y, z) = F1(x, y, z),   ∂φ/∂y (x, y, z) = F2(x, y, z),   ∂φ/∂z (x, y, z) = F3(x, y, z).   (6.5)
The function φ obeys the first equation in (6.5) if and only if there is a function ψ(y, z) such that

φ(x, y, z) = ∫_0^x F1(X, y, z) dX + ψ(y, z).

Differentiating with respect to y gives

∂φ/∂y (x, y, z) = ∫_0^x ∂F1/∂y (X, y, z) dX + ∂ψ/∂y (y, z).
So φ obeys the second equation in (6.5) if and only if

∂ψ/∂y (y, z) = F2(x, y, z) − ∫_0^x ∂F1/∂y (X, y, z) dX
            = F2(x, y, z) − ∫_0^x ∂F2/∂x (X, y, z) dX   (by (6.4))
            = F2(x, y, z) − [F2(X, y, z)]_{X=0}^x   (by the fundamental theorem of calculus)
            = F2(0, y, z).
Thus, we can choose

ψ(y, z) = ∫_0^y F2(0, Y, z) dY + χ(z)

for some function χ depending only on z.
So now we have

φ(x, y, z) = ∫_0^x F1(X, y, z) dX + ∫_0^y F2(0, Y, z) dY + χ(z).

Repeating the same argument with the third equation in (6.5) shows that we can choose χ(z) = ∫_0^z F3(0, 0, Z) dZ. Hence

φ(x, y, z) = ∫_0^x F1(X, y, z) dX + ∫_0^y F2(0, Y, z) dY + ∫_0^z F3(0, 0, Z) dZ

is a potential for F.
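The potential constructed in this proof is concrete enough to check numerically. Below is a minimal sketch (not from the text) that applies the three-integral formula to the assumed curl-free field F = (2xy, x² + z², 2yz), whose potential should be x²y + yz², and checks F = ∇φ by central differences.

```python
import math

# Hypothetical curl-free field F = (2xy, x² + z², 2yz); its potential is x²y + yz².
def F(x, y, z):
    return (2 * x * y, x * x + z * z, 2 * y * z)

def quad(f, a, b, n=2000):
    # Midpoint rule on [a, b]; also works when b < a.
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def phi(x, y, z):
    # φ(x,y,z) = ∫₀ˣ F₁(X,y,z) dX + ∫₀ʸ F₂(0,Y,z) dY + ∫₀ᶻ F₃(0,0,Z) dZ,
    # exactly as in the proof above.
    return (quad(lambda X: F(X, y, z)[0], 0.0, x)
            + quad(lambda Y: F(0.0, Y, z)[1], 0.0, y)
            + quad(lambda Z: F(0.0, 0.0, Z)[2], 0.0, z))

def grad(f, x, y, z, h=1e-5):
    # Central-difference gradient, to compare against F.
    return ((f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h))

p = (0.7, -0.4, 1.3)  # an arbitrary sample point
numeric_grad = grad(phi, *p)
```

Here the field and the sample point are illustrative choices; any C¹ curl-free field defined on all of R³ should pass the same check.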
Later, when we learn Stokes’ Theorem, we’ll be able to give an alternative proof of
Theorem 6.5.2. See Remark 8.4.9.
Remark 6.5.4. In fact, we can generalize Theorem 6.5.1 somewhat. We can replace R2 or
R3 by any region that is simply connected. Intuitively, a region is simply connected if any
closed curve can be contracted to a point. The region R2 \ {(0, 0)} is not simply connected
since loops around the origin cannot be contracted to a point.
Exercises.
Exercises from [FRYd, §2.4] : Q1–Q38.
Chapter 7
Surface integrals
In this chapter we discuss integration of real-valued functions and vector fields over surfaces.
A good reference for the material in this section is [FRYe, Ch. 3].
Example 7.1.3 (Planes). Let P be a plane that is parallel to two vectors a and b and passes
through the point p. Then
Φ(u, v) = p + ua + vb
is a parameterization of P .
Example 7.1.4 (Sphere). Using spherical coordinates (see Section 4.5), we can parameterize
the sphere of radius R centred at the origin by
Φ(u, v) = (R sin v cos u, R sin v sin u, R cos v), (u, v) ∈ [0, 2π] × [0, π]. △
Example 7.1.5 (Torus). The torus obtained by revolving a circle of radius r in the yz-plane,
centred at (0, R, 0), about the z-axis, can be parameterized by
Φ(θ, ψ) = ((R + r cos θ) cos ψ, (R + r cos θ) sin ψ, r sin θ), 0 ≤ θ, ψ ≤ 2π. △
Exercises.
Exercises from [FRYd, §3.1] : Q1–Q7.
Consider a surface parameterized by

Φ = (ϕ1, ϕ2, ϕ3) : D → R³.

Define

Φu := (∂ϕ1/∂u, ∂ϕ2/∂u, ∂ϕ3/∂u),   Φv := (∂ϕ1/∂v, ∂ϕ2/∂v, ∂ϕ3/∂v).
Then Φu(u0, v0) and Φv(u0, v0) are tangent to the surface at Φ(u0, v0). Thus, if they are linearly independent, they generate the tangent plane. In other words, the tangent plane to the surface at the point Φ(u0, v0) is the plane through Φ(u0, v0) spanned by Φu(u0, v0) and Φv(u0, v0), that is, the set of points

Φ(u0, v0) + sΦu(u0, v0) + tΦv(u0, v0),   s, t ∈ R.
Recall that the cross product a × b is a vector that is perpendicular to both a and b (with direction given by the right-hand rule) and with magnitude

∥a × b∥ = ∥a∥ ∥b∥ sin θ,

where θ is the angle between a and b.
It follows that
Φu (u0 , v0 ) × Φv (u0 , v0 )
is a normal vector to the surface parameterized by Φ at the point Φ(u0 , v0 ), that is, it
is orthogonal to the tangent plane at that point. This vector is nonzero precisely when
Φu (u0 , v0 ) and Φv (u0 , v0 ) are linearly independent.
Definition 7.2.1 (Regular point). We say that a surface S parameterized by Φ is regular
at Φ(u0 , v0 ) ∈ S when Φu (u0 , v0 ) × Φv (u0 , v0 ) ̸= 0.
Example 7.2.2. Let's find the tangent plane to the helicoid parameterized by

Φ(r, θ) = (r cos θ, r sin θ, θ), 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π.

We have

Φr(r, θ) = (cos θ, sin θ, 0),   Φθ(r, θ) = (−r sin θ, r cos θ, 1),

and so

Φr × Φθ = (sin θ, −cos θ, r).

This vector is never equal to 0 (since sin θ and cos θ are never zero for the same value of θ), and so the helicoid is regular at all points. △
Example 7.2.3 (Cone). Consider the cone parameterized by

Φ(r, θ) = (r cos θ, r sin θ, r), 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π.

We have

Φr = (cos θ, sin θ, 1),   Φθ = (−r sin θ, r cos θ, 0).

Thus

Φr × Φθ = (−r cos θ, −r sin θ, r cos² θ + r sin² θ) = (−r cos θ, −r sin θ, r).

Therefore the cone is regular everywhere except at the origin (when r = 0). △
Exercises.
Exercises from [FRYd, §3.2] : Q1–Q17.
Assumption 7.3.1. We assume that all surfaces are unions of images of parameterized surfaces
Φi : Di → R3 for which:
(a) Di is an elementary region in the plane,
(b) Φi is of class C 1 and one-to-one, except possibly on the boundary of Di , and
(c) the image of Φi is regular, except possibly at a finite number of points.
In what follows, we will use the fact that ∥a × b∥ is the area of the parallelogram determined by the vectors a and b:

Area = ∥a∥ ∥b∥ sin θ = ∥a × b∥. (7.2)
Before defining general integrals of scalar functions over surfaces, we start with the defi-
nition of surface area. For simplicity, we assume that D is a rectangle. As in Section 3.3, we
subdivide the rectangle D by subdividing each of its sides into n segments of equal length.
The sub-rectangle with bottom-right corner at (ui , vj ) is sent to a piece of the surface in-
cluding the point Φ(ui , vj ). We approximate this piece of the surface by the parallelogram
determined by the vectors ∆uΦu (ui , vj ) and ∆vΦv (ui , vj ). The area of this parallelogram is
∥∆uΦu (ui , vj ) × ∆vΦv (ui , vj )∥ = ∥Φu (ui , vj ) × Φv (ui , vj )∥∆u∆v. (7.3)
Adding up all the areas of these parallelograms, our approximation for the area of the surface
is

Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} ∥Φu(ui, vj) × Φv(ui, vj)∥ ∆u ∆v. (7.4)
As n → ∞, these sums should converge to the area of the surface. This leads us to the
following definition.
Definition 7.3.2 (Area of a parameterized surface). The surface area of a surface S param-
eterized by Φ : D → R³ is

Area(S) := ∬_D ∥Φu × Φv∥ dA. (7.5)
If S is a union of surfaces Si , then its area is the sum of the areas of the Si .
Note that our definition of surface area involves the parameterization. We will see in
Theorem 7.7.2 that, in fact, it is independent of the parameterization. We saw such a
phenomenon already for path integrals in Theorem 6.3.1.
Example 7.3.3 (Area of a cone). Let's compute the area of the cone parameterized by

Φ(r, θ) = (r cos θ, r sin θ, r), 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π.

We have Φr × Φθ = (−r cos θ, −r sin θ, r), and (0, 0, 0) is the only point where the surface is not regular. Furthermore, Φ is one-to-one on the interior of its domain D = [0, 1] × [0, 2π]. Since D is an elementary region, Assumption 7.3.1 is satisfied. We have

∥Φr × Φθ∥ = √(r² cos² θ + r² sin² θ + r²) = √2 r.

Therefore,

Area = ∬_D ∥Φr × Φθ∥ dA = ∫_0^{2π} ∫_0^1 √2 r dr dθ = √2 π. △
Example 7.3.4. Let’s compute the surface area of the helicoid (see Example 7.2.2) parame-
terized by
Φ(r, θ) = (r cos θ, r sin θ, θ), 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π.
We have
Φr (r, θ) = (cos θ, sin θ, 0), Φθ (r, θ) = (−r sin θ, r cos θ, 1),
and so
Φr × Φθ = (sin θ, −cos θ, r cos² θ + r sin² θ) = (sin θ, −cos θ, r).
Since Φr × Φθ is never 0 (since sin θ and cos θ are never zero for the same value of θ), the
surface is regular at all points. The domain [0, 1] × [0, 2π] is an elementary region, and Φ is
one-to-one. Thus, Assumption 7.3.1 is satisfied. We have
∥Φr × Φθ∥ = √(sin² θ + cos² θ + r²) = √(r² + 1).
Thus, with the use of a table of integrals, we can compute that the surface area is
∬_D ∥Φr × Φθ∥ dr dθ = ∫_0^{2π} ∫_0^1 √(r² + 1) dr dθ = π(√2 + ln(1 + √2)). △
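As a sanity check, the surface-area recipe can be run numerically straight from the parameterization. The following sketch (an illustration, not part of the text) approximates Φr × Φθ for the helicoid by finite differences, sums its norm over a grid, and compares to the closed form π(√2 + ln(1 + √2)) obtained above.

```python
import math

def Phi(r, t):
    # Parameterization of the helicoid.
    return (r * math.cos(t), r * math.sin(t), t)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normal_norm(r, t, h=1e-6):
    # ∥Φr × Φθ∥ via central differences; should equal sqrt(r² + 1).
    Pr = tuple((p - m) / (2 * h) for p, m in zip(Phi(r + h, t), Phi(r - h, t)))
    Pt = tuple((p - m) / (2 * h) for p, m in zip(Phi(r, t + h), Phi(r, t - h)))
    return math.sqrt(sum(c * c for c in cross(Pr, Pt)))

# Midpoint-rule sum over D = [0, 1] × [0, 2π].
n = 200
dr, dt = 1.0 / n, 2 * math.pi / n
area = sum(normal_norm((i + 0.5) * dr, (j + 0.5) * dt)
           for i in range(n) for j in range(n)) * dr * dt
exact = math.pi * (math.sqrt(2) + math.log(1 + math.sqrt(2)))
```

The grid size n = 200 is an arbitrary choice; the sum converges to the exact area as n grows.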
Example 7.3.5 (Graphs). Recall, from Example 7.1.2, that if g : D → R is a function, with
D ⊆ R2 , then
Φ(x, y) = (x, y, g(x, y)), (x, y) ∈ D,
is a parameterization of the graph of g. Then we have

Φx(x, y) = (1, 0, ∂g/∂x (x, y))   and   Φy(x, y) = (0, 1, ∂g/∂y (x, y)),

and so

Φx × Φy = (−∂g/∂x, −∂g/∂y, 1),

which is never zero. Thus, provided D is an elementary region and g is C¹, Assumption 7.3.1
is satisfied. Therefore, the surface area of the graph is
Area(graph of g) = ∬_D √((∂g/∂x)² + (∂g/∂y)² + 1) dx dy. (7.6)
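Formula (7.6) is easy to test on a case with a known answer. For the plane g(x, y) = x + y over the unit square (a hypothetical choice for illustration), the integrand is constantly √3, so the area should be √3. A minimal numeric sketch:

```python
import math

def g(x, y):
    # Illustrative graph: a plane, whose surface area over [0,1]² is sqrt(3).
    return x + y

def graph_area(g, a, b, c, d, n=200, h=1e-6):
    # Midpoint-rule evaluation of (7.6), with the partial derivatives of g
    # approximated by central differences.
    dx, dy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            gx = (g(x + h, y) - g(x - h, y)) / (2 * h)  # ∂g/∂x
            gy = (g(x, y + h) - g(x, y - h)) / (2 * h)  # ∂g/∂y
            total += math.sqrt(gx * gx + gy * gy + 1.0)
    return total * dx * dy

area = graph_area(g, 0.0, 1.0, 0.0, 1.0)
```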
Exercises.
Exercises from [FRYd, §3.3] : Q1–Q3, Q5–Q10, Q14
Compare to (7.4). As n → ∞, these sums should converge to the mass of the surface. This
leads us to the following definition, which applies more generally (i.e. not only when the
function f (x, y, z) represents a density).
Note that when f (x, y, z) = 1, (7.7) reduces to the formula (7.5) for surface area, as
expected.
Example 7.4.2. Recall the helicoid of Example 7.2.2, parameterized by

Φ(r, θ) = (r cos θ, r sin θ, θ), 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π.

Let's compute

∬_Φ f dA,   where f(x, y, z) = √(x² + y²).
Therefore,

∬_Φ √(x² + y²) dA = ∬_D r √(1 + r²) dA
                  = ∫_0^1 ∫_0^{2π} r √(1 + r²) dθ dr
                  = ∫_0^1 2πr √(1 + r²) dr.

Make the change of variables u = 1 + r², du = 2r dr. Then

∬_Φ √(x² + y²) dA = π ∫_1^2 √u du = [(2π/3) u^{3/2}]_{u=1}^2 = (2π/3)(2√2 − 1). △
Example 7.4.3 (Graphs). In Example 7.3.5, we developed a formula for the surface area of
a graph (which is a special type of surface). Repeating the argument there, but for the
integral of a function f (x, y, z) over the graph, we can give a formula for the integral of a
scalar function over a graph. Suppose

g : D → R,   D ⊆ R².

Then the integral of f over the graph of g is

∬_Φ f dA = ∬_D f(x, y, g(x, y)) √((∂g/∂x)² + (∂g/∂y)² + 1) dx dy.

For example, consider the surface

z = x + y³,   0 ≤ x ≤ 3, 0 ≤ y ≤ 1.

Note that this surface is the graph of the function

g(x, y) = x + y³.
Exercises.
Exercises from [FRYd, §3.3] : Q11, Q15, Q20, Q25, Q26, Q32, Q34
Definition 7.5.1 (Centre of mass). Suppose an object occupies a volume V ⊆ R3 , and its
density at point (x, y, z) is given by ρ(x, y, z). The centre of mass of the object is (x0 , y0 , z0 ),
where

x0 = (1/M) ∭_V x ρ(x, y, z) dV,
y0 = (1/M) ∭_V y ρ(x, y, z) dV,
z0 = (1/M) ∭_V z ρ(x, y, z) dV,

and

M = ∭_V ρ(x, y, z) dV

is the mass of the object.
Consider the solid cone V = {(x, y, z) : 0 ≤ z ≤ 1, x² + y² ≤ z²} and the empty cone S given by its lateral surface x² + y² = z², 0 ≤ z ≤ 1 (the cone of Example 7.3.3). If both have constant density 1, which is more stable? More precisely, which has a lower centre of mass?
Let’s first consider the solid cone. Using cylindrical coordinates, we compute that its
mass is
MV = ∭_V dV = ∫_0^1 ∫_0^z ∫_0^{2π} r dθ dr dz = 2π ∫_0^1 [r²/2]_{r=0}^z dz = 2π ∫_0^1 z²/2 dz = 2π [z³/6]_{z=0}^1 = π/3.
Next, we compute

∭_V z dV = ∫_0^1 ∫_0^z ∫_0^{2π} zr dθ dr dz = 2π ∫_0^1 [zr²/2]_{r=0}^z dz = 2π ∫_0^1 z³/2 dz = 2π [z⁴/8]_{z=0}^1 = π/4.
Thus

z0 = (π/4)/(π/3) = 3/4.

By symmetry, we have x0 = y0 = 0. Thus, the centre of mass of the solid cone is (0, 0, 3/4).
Now let’s consider the empty cone. Our computation of the surface area of a cone in
Example 7.3.3 gives

MS = √2 π.
As in Example 7.3.3, we parameterize the cone by

Φ(r, θ) = (r cos θ, r sin θ, r), 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π,

and we have

∥Φr × Φθ∥ = √2 r.
Thus,

∬_S z dA = ∫_0^1 ∫_0^{2π} r · √2 r dθ dr = 2π ∫_0^1 √2 r² dr = [2√2 π r³/3]_{r=0}^1 = 2√2 π/3.
Therefore,

z0 = (2√2 π/3)/(√2 π) = 2/3.

Again, by symmetry, we have x0 = y0 = 0. Thus, the centre of mass is (0, 0, 2/3). So the empty cone has a lower centre of mass than the solid cone. △
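A quick numerical cross-check of the two centres of mass (a sketch, not from the text), mirroring the reduction of the θ- and r-integrals done above so that only one-dimensional sums remain:

```python
import math

n = 10000
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]  # midpoint-rule sample points

# Solid cone: the cross-section at height z is a disk of radius z, area πz².
M_solid = sum(math.pi * z ** 2 for z in pts) * h       # total mass
Mz_solid = sum(z * math.pi * z ** 2 for z in pts) * h  # ∭_V z dV
z0_solid = Mz_solid / M_solid                          # expect 3/4

# Empty cone: on the surface z = r and ∥Φr × Φθ∥ = √2 r, so the mass in a
# thin ring at radius r is 2π·√2·r dr.
M_surf = sum(2 * math.pi * math.sqrt(2) * r for r in pts) * h
Mz_surf = sum(r * 2 * math.pi * math.sqrt(2) * r for r in pts) * h  # z = r
z0_surf = Mz_surf / M_surf                                          # expect 2/3
```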
For the next definition, we consider rotating some object about an axis. The axis is a
line in R3 . The moment of inertia about this axis measures the torque needed per unit of
angular acceleration. It is the rotational analogue of usual inertia.
Suppose the object occupies a volume V ⊆ R³ with density ρ(x, y, z), and let d(x, y, z) denote the distance from the point (x, y, z) to the axis. Then the moment of inertia of the object about the axis is

∭_V ρ(x, y, z) d(x, y, z)² dV.
Consider the half-sphere

S = {(x, y, z) : x² + y² + z² = 1, z ≥ 0}

with density

ρ(x, y, z) = z.
Let’s find its moment of inertia about the z-axis. As in Example 7.7.1, we parameterize
this surface by
Φ(θ, φ) = (cos θ sin φ, sin θ sin φ, cos φ), θ ∈ [0, 2π], φ ∈ [0, π/2],

and we have

∥Φθ × Φφ∥ = √(cos² θ sin⁴ φ + sin² θ sin⁴ φ + sin² φ cos² φ) = √(sin⁴ φ + sin² φ cos² φ) = √(sin² φ) = sin φ,
since sin φ ≥ 0 for φ ∈ [0, π/2]. Then the moment of inertia about the z-axis is

∬_S z(x² + y²) dA = ∫_0^{2π} ∫_0^{π/2} cos φ (cos² θ sin² φ + sin² θ sin² φ) sin φ dφ dθ
                 = ∫_0^{2π} ∫_0^{π/2} cos φ sin³ φ dφ dθ
                 = ∫_0^{2π} [(1/4) sin⁴ φ]_{φ=0}^{π/2} dθ
                 = ∫_0^{2π} (1/4) dθ = π/2. △
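The inner θ-integral above contributes a factor of 2π, leaving a one-dimensional integral that is easy to check numerically (an illustrative sketch, not part of the text):

```python
import math

# Check: ∬_S z(x² + y²) dA = 2π ∫₀^{π/2} cos φ sin³ φ dφ should equal π/2.
n = 20000
h = (math.pi / 2) / n
integral = sum(math.cos(p) * math.sin(p) ** 3
               for p in ((i + 0.5) * h for i in range(n))) * h  # midpoint rule
moment = 2 * math.pi * integral
```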
Exercises.
Exercises from [FRYd, §3.3] : Q4, Q13, Q21, Q22, Q27,
Consider a fluid with density ρ(x, y, z) moving with velocity v(x, y, z), and let n(x, y, z) be a unit normal to the surface. In a small time interval ∆t, the fluid that crosses a small piece ∆S of the surface fills a slanted box over ∆S. If we denote by θ the angle between n and v, this box has base ∆S and height ∥v∆t∥ cos θ. Thus, its volume is

∥v(x, y, z)∆t∥ cos θ ∆S = v(x, y, z) · n(x, y, z) ∆t ∆S,

because n(x, y, z) has length one. The mass of the fluid that crosses ∆S during the time interval ∆t is then

ρ(x, y, z) v(x, y, z) · n(x, y, z) ∆t ∆S,

and the rate at which fluid is crossing through ∆S is

ρ(x, y, z) v(x, y, z) · n(x, y, z) ∆S.
Adding up these amounts over all pieces ∆S of the surface and then taking the limit as the
size of the pieces ∆S and the time interval ∆t goes to zero, we find that the rate at which
fluid is crossing through the surface S is the flux integral
∬_S ρ(x, y, z) v(x, y, z) · n(x, y, z) dS.
Now, comparing to (7.7), we see that to compute this integral, we should parameterize
the surface by a function Φ : D → R3 . Then dS is replaced by ∥Φu ×Φv ∥ dA, and we integrate
over D. Recall that the point Φ(u0 , v0 ) ∈ S is regular if Φu (u0 , v0 ) × Φv (u0 , v0 ) ̸= 0. Since
Φu and Φv are tangent to the surface, their cross product Φu × Φv is normal to the surface.
Therefore, informally speaking, we have
n dS = Φu × Φv .
Note that Definition 7.6.1 depends on a choice of parameterization of the surface. Later,
in Section 7.7, we will see that, up to a sign, the integral does not depend on the parame-
terization.
Example 7.6.2. Let's integrate the vector field F(x, y, z) = (x, 0, z) over the cylinder parameterized by

Φ(u, v) = (cos u, sin u, v), (u, v) ∈ [0, 2π] × [0, 1].
We compute
F(Φ(u, v)) · (Φu × Φv) = (cos u, 0, v) · (cos u, sin u, 0) = cos² u = (1/2)(1 + cos(2u)).
Thus,

∬_Φ F · dS = ∫_0^1 ∫_0^{2π} (1/2)(1 + cos(2u)) du dv = (1/2) ∫_0^{2π} (1 + cos(2u)) du = (1/2) [u + (1/2) sin(2u)]_{u=0}^{2π} = π. △
Example 7.6.3. Recall the helicoid S of Examples 7.2.2 and 7.4.2, with parameterization
Φ(r, θ) = (r cos θ, r sin θ, θ), 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π.
Let’s integrate the following vector fields over Φ:
F(x, y, z) = (x, 0, 0), G(x, y, z) = (xz, 0, 0).
First we compute

F(Φ(r, θ)) · (Φr × Φθ) = (r cos θ, 0, 0) · (sin θ, −cos θ, r) = r sin θ cos θ = (1/2) r sin(2θ),

and so

∬_Φ F · dS = ∫_0^{2π} ∫_0^1 (1/2) r sin(2θ) dr dθ = (1/4) ∫_0^{2π} sin(2θ) dθ = 0.

Similarly,

G(Φ(r, θ)) · (Φr × Φθ) = (rθ cos θ, 0, 0) · (sin θ, −cos θ, r) = (1/2) rθ sin(2θ).

Therefore

∬_Φ G · dS = ∫_0^{2π} ∫_0^1 (1/2) rθ sin(2θ) dr dθ = (1/4) ∫_0^{2π} θ sin(2θ) dθ = (1/4) [−(1/2) θ cos(2θ) + (1/4) sin(2θ)]_{θ=0}^{2π} = (1/4)((−π + 0) − (0 + 0)) = −π/4. △
Example 7.6.4 (Graphs). Recall, from Example 7.3.5, that if g : D → R is a function, with
D ⊆ R2 , then
Φ(x, y) = (x, y, g(x, y)), (x, y) ∈ D,
is a parameterization of the graph of g and we have
Φx × Φy = (−∂g/∂x, −∂g/∂y, 1).

Therefore, the integral of a vector field F = (F1, F2, F3) over the graph is given by

∬_Φ F · dS = ∬_D (−F1 ∂g/∂x − F2 ∂g/∂y + F3) dx dy. △
Exercises.
Exercises from [FRYd, §3.3] : Q16, Q23, Q24, Q28–Q31, Q33, Q35–Q37
Suppose that Φ : D → R³ and Ψ : E → R³ are two parameterizations of the same surface S. For a point (x, y, z) ∈ S, choose

(u, v) ∈ D such that Φ(u, v) = (x, y, z) and (s, t) ∈ E such that Ψ(s, t) = (x, y, z).

We say that Φ and Ψ have the same orientation at (x, y, z) if Φu × Φv and Ψs × Ψt are positive scalar multiples of each other. We say they have opposite orientation at (x, y, z) if they are negative scalar multiples of each other. (Since Φu × Φv and Ψs × Ψt are
both normal to S, these are the only possibilities, provided both vectors are nonzero.) We
say that Φ and Ψ have the same (respectively, opposite) orientation if they have the same
(respectively, opposite) orientation at all points of S.
Example 7.7.1. Consider the half-sphere

{(x, y, z) : x² + y² + z² = 1, z ≥ 0}

with parameterizations

Φ(θ, φ) = (cos θ sin φ, sin θ sin φ, cos φ), θ ∈ [0, 2π], φ ∈ [0, π/2],
Ψ(u, v) = (u, v, √(1 − u² − v²)), (u, v) ∈ {(x, y) : x² + y² ≤ 1}.
(Note that Φ is not actually one-to-one as given. To fix this, we could modify the domain to be ([0, 2π] × (0, π/2]) ∪ {(0, 0)}.) We compute

Φθ × Φφ = (−cos θ sin² φ, −sin θ sin² φ, −sin φ cos φ)

and

Ψu × Ψv = (u/√(1 − u² − v²), v/√(1 − u² − v²), 1).
(Note that Ψu × Ψv is not defined on the boundary u² + v² = 1 of its domain.) For φ ∈ (0, π/2], we have −sin φ cos φ < 0. So the z-coordinate of Φθ × Φφ is negative, while the z-coordinate of Ψu × Ψv is positive. Thus, Φ and Ψ have opposite orientations.
To further illustrate this, let's consider a specific point, say

Ψ(1/√2, 0) = (1/√2, 0, 1/√2) = Φ(0, π/4).

At this point, the normal vectors coming from the two parameterizations are

(Φθ × Φφ)(0, π/4) = (−1/2, 0, −1/2)   and   (Ψu × Ψv)(1/√2, 0) = (1, 0, 1).
Theorem 7.7.2. Suppose that Φ and Ψ are two parameterizations of the same surface S, that F is a continuous vector field on S, and that f is a continuous real-valued function on S.

(a) If Φ and Ψ have the same orientation, then ∬_Φ F · dS = ∬_Ψ F · dS.

(b) If Φ and Ψ have opposite orientations, then ∬_Φ F · dS = −∬_Ψ F · dS.

(c) We have ∬_Φ f dS = ∬_Ψ f dS.
Proof. The proof of this theorem is similar to that of Theorem 6.3.1. It involves using a
change of variables. We omit the details.
Theorem 7.7.2 tells us that integrals of scalar functions over surfaces are independent of
parameterization, while integrals of vector fields over surfaces depend only on the orientation
of the parameterization. This is analogous to Theorems 6.3.1 and 6.3.2, which state that
integrals of scalar functions along paths are independent of parameterization, while integrals
of vector fields along paths depend only on the orientation of the parameterization.
In light of Theorem 7.7.2, if f is a real-valued function defined on a surface S, we can define

∬_S f dS := ∬_Φ f dS,

where Φ is any parameterization of S.
Example 7.7.3. Suppose we choose to orient the half-sphere S of Example 7.7.1 upward. That is, we want our normal vectors to have positive z-component. Consider the vector field F(x, y, z) = (0, 0, 1). As we noted in Example 7.7.1, the parameterization Ψ gives normal vectors with positive z-components. So Ψ is a parameterization of the oriented surface S. Therefore,

∬_S F · dS = ∬_Ψ F · dS = ∬_D (0, 0, 1) · (Ψu × Ψv) du dv = ∬_D 1 du dv = Area(D) = π,

where D = {(u, v) : u² + v² ≤ 1}. On the other hand,
∬_Φ F · dS = ∬_Φ (0, 0, 1) · (Φθ × Φφ) dS = ∫_0^{π/2} ∫_0^{2π} (−sin φ cos φ) dθ dφ = −2π ∫_0^{π/2} sin φ cos φ dφ = −2π [(1/2) sin² φ]_{φ=0}^{π/2} = −π,

consistent with the fact that Φ gives the opposite (downward) orientation. △
Integrating a vector field over the Möbius strip is problematic! For further discussion on
orientation see [FRYe, §3.5].
Chapter 8
Integral theorems
In this chapter we discuss three theorems: the divergence theorem, Green’s theorem, and
Stokes’ theorem. They are all generalization of the fundamental theorem of calculus. Recall
that the fundamental theorem of calculus states that
∫_a^b df/dt (t) dt = f(b) − f(a).
Thus, it relates the integral of the derivative of a function f over an interval [a, b] to the
values of that function on the boundary {a, b} of that interval. The divergence theorem,
Green’s theorem, and Stokes’ theorem will also have this form. However, we will now work
in higher dimensions. For the divergence theorem, the integral on the left-hand side is over
a three-dimensional volume, and the right-hand side is an integral over the boundary of the
volume, which is a surface. For Green’s theorem and Stokes’ theorem, the integral on the
left-hand side is over a surface, and the right-hand side is an integral over the boundary of
the surface, which is a curve. A good reference for the material in this section is [FRYe,
Ch. 4].
∆f = ∇²f = ∇ · ∇f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z².
(a) ∇²(f + g) = ∇²f + ∇²g.
(b) ∇²(cf) = c∇²f for all c ∈ R.
(c) ∇²(fg) = f∇²g + 2(∇f) · (∇g) + g∇²f.
Proof. We have

∇²(fg) = ∇ · ∇(fg)
       = ∇ · ((∇f)g + f(∇g))   (by Proposition 8.1.2(c))
       = ∇ · ((∇f)g) + ∇ · (f(∇g))   (by Proposition 8.1.3(a))
       = (∇²f)g + (∇f) · (∇g) + (∇f) · (∇g) + f∇²g,   (by Proposition 8.1.3(c))

as desired.
Proposition 8.1.6. For C² functions f, g, h and C² vector fields F, we have:

(a) ∇ · (∇ × F) = 0.
(b) ∇ × (∇f) = 0.
(c) ∇ · (f(∇g × ∇h)) = ∇f · (∇g × ∇h).
Example 8.1.7. Let r(x, y, z) = x²i + yj + z³k, and let ψ(x, y, z) be any real-valued function. Let's prove that

∇ · (r × ∇ψ) = 0.

Using Proposition 8.1.3(d), we have

∇ · (r × ∇ψ) = (∇ψ) · (∇ × r) − r · (∇ × ∇ψ) = (∇ψ) · 0 − r · 0 = 0,

since ∇ × r = 0 and, by Proposition 8.1.6(b), ∇ × ∇ψ = 0. Thus, ∇ · (r × ∇ψ) = 0, as desired. In fact, this is true for any curl free r. △
Next we discuss divergence. Fix a point (x0 , y0 , z0 ). Then, for ε > 0, let
Sε = {(x, y, z) : (x − x0)² + (y − y0)² + (z − z0)² = ε²}
be the sphere of radius ε centred at (x0 , y0 , z0 ). Then one can show that, for a C 1 vector
field F : R3 → R3 , we have
∇ · F(x0, y0, z0) = lim_{ε→0} (1/Vol(Sε)) ∬_{Sε} F · dS,

where Vol(Sε) = (4/3)πε³ is the volume of the ball bounded by Sε. (For a proof of this fact, see [FRYe, Lem. 4.1.20].)
If we interpret F as the velocity of some fluid, then the flux integral ∬_{Sε} F · dS represents the
total flow across the surface. Thus
∇ · F(x)
is the rate at which the fluid is exiting an infinitesimal sphere, centered at x, per unit time,
per unit volume. So it is a measure of the strength of the “source” at x.
Example 8.1.8. Consider the vector field

F(x, y, z) = xi + yj + zk.
If this vector represents the flow of fluid, then fluid is being created and pushed out through
any sphere centred at the origin. The divergence is
∇ · F = 3 > 0,

consistent with our interpretation. △

Example 8.1.9. Consider the vector field F(x, y, z) = −yi + xj. In this case, the fluid is going in circles. No fluid actually exits the sphere. The divergence is
is
∇ · F = 0,
consistent with our interpretation. △
Example 8.1.10. Now consider a constant vector field. In this case, the fluid is moving uniformly upwards. Fluid enters the sphere at the bottom
and leaves through the top at exactly the same rate. So the net rate at which fluid crosses
the sphere is zero. The divergence is
∇ · F = 0,
consistent with our interpretation. △
Finally, we turn our attention to curl. Fix ε > 0 and a, n ∈ R3 . Let Cε be the circle
centred at a, with radius ε, and in the plane through a orthogonal to n. We orient Cε using
the right-hand rule. If the thumb of your right hand points in the direction of n and you
curl your fingers, they point in the orientation of Cε .
Then one can show that

(∇ × F(a)) · n = lim_{ε→0} (1/(πε²)) ∮_{Cε} F · ds,

where πε² is the area of the disk with boundary Cε. (For a proof of this fact, see [FRYe, Lem. 4.1.25].) Now suppose that F represents the velocity of a fluid and you place an infinitesimal paddlewheel at a with its axle in direction n.
at a rate of Ω radians per minute, then the speed of the paddles is Ωε. (Remember that
radians are defined as arc length divided by radius). This must be equal to the average value
of speed of the paddles in the direction of rotation. Since the circumference of Cε is 2πε, we
have

Ωε = (1/(2πε)) ∮_{Cε} F · ds   =⇒   Ω = (1/(2πε²)) ∮_{Cε} F · ds = (1/2) (∇ × F(a)) · n.
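This relation is easy to sanity-check numerically. For the rotating field F(x, y, z) = (−y, x, 0) (an illustrative choice), ∇ × F = (0, 0, 2), so a paddlewheel at the origin with axle n = k should turn at Ω = (1/2) · 2 = 1 regardless of the radius ε. A minimal sketch:

```python
import math

# Ω = (1/(2πε²)) ∮_{Cε} F · ds for F = (−y, x, 0); expect Ω = 1 for every ε.
def omega(eps, n=20000):
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = eps * math.cos(t), eps * math.sin(t)      # point on Cε
        dx, dy = -eps * math.sin(t), eps * math.cos(t)   # r′(t)
        total += (-y * dx + x * dy) * h                  # F · r′(t) dt
    return total / (2 * math.pi * eps ** 2)
```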
Thus, our infinitesimal paddlewheel rotates at (1/2) (∇ × F(a)) · n radians per unit time. In
particular, to maximize the rotation, we should orient the paddlewheel so that n is parallel
to ∇ × F(a).
Let’s return to the vector fields of Examples 8.1.8, 8.1.9, and 8.1.10 and consider our
interpretation of their curls.
F(x, y, z) = xi + yj + zk.
If this vector represents the flow of fluid, then fluid is moving parallel to the paddles, so the
paddlewheel should not rotate at all. Indeed, we have
∇ × F(0) = det [ i, j, k ; ∂/∂x, ∂/∂y, ∂/∂z ; x, y, z ] = 0   =⇒   ∇ × F(0) · k = 0,
as expected. △
In this case, the fluid is rotating counterclockwise. So the paddlewheel should also rotate
counterclockwise. Hence it should have positive angular velocity. Indeed, we have
∇ × F(0) = det [ i, j, k ; ∂/∂x, ∂/∂y, ∂/∂z ; −y, x, 0 ] = 2k   =⇒   ∇ × F(0) · k = 2k · k = 2 > 0,
as expected. △
In this case, the fluid is pushing the right paddle counterclockwise and the left paddle
clockwise, with the same force. So the paddle should not rotate. Indeed, we have
∇ × F(0) = det [ i, j, k ; ∂/∂x, ∂/∂y, ∂/∂z ; 1, 0, 0 ] = 0   =⇒   ∇ × F(0) · k = 0,
as expected. △
Exercises.
Exercises from [FRYd, §4.1] : Q1–Q13.
For example, an ice cream cone (a cone capped with a half-sphere of ice cream) is piecewise smooth, but not smooth, since it has a "sharp curve" where the half-sphere (the ice cream) meets the cone. △
Recall from Section 1.1 that ∂A denotes the boundary of a set A. Thus, if V is a bounded
solid, then ∂V denotes the surface of V .
Theorem 8.2.3 (Divergence theorem). Suppose that V is a bounded solid with piecewise
smooth boundary ∂V , and let F be a C 1 vector field on V . We give ∂V the outward orientation
(i.e. we choose normal vectors pointing away from V ). Then
∬_{∂V} F · dS = ∭_V ∇ · F dV.
Proof. For a proof of this theorem, see [FRYe, Th. 4.2.2]. Given the physical interpretation
of divergence we discussed in Section 8.1.2, the divergence theorem should not be surprising.
The right-hand integral

∭_V ∇ · F dV
adds up the divergence over all points of the solid V . For any points in the interior of V ,
flow out of a small sphere is cancelled by the flow into neighbouring spheres. Thus, what we
are left with is the net flow out of the boundary, which is the left-hand integral
∬_{∂V} F · dS.
The divergence theorem is sometimes called Gauss’ theorem or Gauss’ divergence theorem.
Like the fundamental theorem of calculus, the divergence theorem states that the integral of
a derivative of a function (here the “derivative” is the divergence) over some region is equal
to the integral of the function over the boundary of that region. When using the divergence
theorem, we will always choose the outward orientation for the surface of a solid region.
Thus

∬_{∂V} F · dS = −∬_Φ F · dS   (because of the reversed orientation)
             = −∫_0^π ∫_0^{2π} (cos² φ − 1) sin φ dθ dφ
             = −2π ∫_0^π (cos² φ − 1) sin φ dφ
             = 2π ∫_1^{−1} (u² − 1) du   (u = cos φ, du = −sin φ dφ)
             = 2π [u³/3 − u]_{u=1}^{−1}
             = 2π (−2/3 + 2) = 8π/3.
Example. Let's verify the divergence theorem for the vector field F(x, y, z) = (xy, yz, xz) and the solid tetrahedron

V = {(x, y, z) : x, y, z ≥ 0, x + y + z ≤ 1}.
Thus
F(Φ1 (u, v)) · Φ1u × Φ1v = (uv, 0, 0) · (0, 0, 1) = 0,
F(Φ2 (u, v)) · Φ2u × Φ2v = (0, 0, uv) · (0, −1, 0) = 0,
F(Φ3 (u, v)) · Φ3u × Φ3v = (0, uv, 0) · (1, 0, 0) = 0,
F(Φ4 (u, v)) · Φ4u × Φ4v = (uv, v(1 − u − v), u(1 − u − v)) · (1, 1, 1) = u + v − uv − u2 − v 2 .
and so

∬_{∂V} F · dS = −∬_{Φ1} F · dS + ∬_{Φ2} F · dS − ∬_{Φ3} F · dS + ∬_{Φ4} F · dS.
Since the first three integrands vanish, this equals

∬_{Φ4} F · dS = ∫_0^1 ∫_0^{1−u} (u + v − uv − u² − v²) dv du
             = ∫_0^1 ( u(1 − u) + (1 − u)²/2 − u(1 − u)²/2 − u²(1 − u) − (1 − u)³/3 ) du
             = (1/6) ∫_0^1 (5u³ − 9u² + 3u + 1) du
             = (1/6)(5/4 − 3 + 3/2 + 1) = (1/6) · (3/4) = 1/8.
On the other hand,

∇ · F = ∂/∂x (xy) + ∂/∂y (yz) + ∂/∂z (xz) = x + y + z,

and so

∭_V ∇ · F dx dy dz = ∫_0^1 ∫_0^{1−x} ∫_0^{1−x−y} (x + y + z) dz dy dx
                  = ∫_0^1 ∫_0^{1−x} ( (x + y)(1 − x − y) + (1 − x − y)²/2 ) dy dx
                  = (1/2) ∫_0^1 ∫_0^{1−x} (1 − x² − 2xy − y²) dy dx
                  = (1/2) ∫_0^1 ( (1 − x²)(1 − x) − x(1 − x)² − (1 − x)³/3 ) dx
                  = (1/6) ∫_0^1 (x³ − 3x + 2) dx
                  = (1/6)(1/4 − 3/2 + 2) = (1/6) · (3/4) = 1/8. △
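Both sides of this verification can also be approximated numerically. The sketch below (an illustration, assuming F = (xy, yz, xz) as above) reuses the integrands already derived in the text: the flux side reduces to a double integral of u + v − uv − u² − v² over the triangle, and the divergence side reduces, after the z-integral, to a double integral of (x + y)(1 − x − y) + (1 − x − y)²/2. Both sums should approach 1/8.

```python
# Midpoint-rule sums over the triangle {s, t >= 0, s + t <= 1}.
n = 800
h = 1.0 / n

# Flux side: only the slanted face contributes.
flux = 0.0
for i in range(n):
    u = (i + 0.5) * h
    for j in range(n):
        v = (j + 0.5) * h
        if u + v <= 1.0:
            flux += (u + v - u * v - u * u - v * v) * h * h

# Divergence side: ∭_V (x + y + z) dV after integrating out z.
div = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if x + y <= 1.0:
            w = 1.0 - x - y
            div += ((x + y) * w + w * w / 2.0) * h * h
```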
The divergence theorem can also be used to simplify some surface integrals, as the fol-
lowing example illustrates.
Example. Let's compute ∬_C F · dS, where F(x, y, z) = (x², −xy, 1 − xz) and C is the cone

C = {(x, y, z) : x² + y² = (1 − z)², 0 ≤ z ≤ 1}.
Note that

∇ · F = ∂/∂x (x²) + ∂/∂y (−xy) + ∂/∂z (1 − xz) = 2x − x − x = 0.
Therefore, by the divergence theorem,

∬_S F · dS = 0
for any piecewise smooth closed surface S (i.e. any surface S that is the boundary of a bounded solid). Let's close off the cone by adding the bottom

B = {(x, y, 0) : x² + y² ≤ 1}.

Giving C and B the outward orientation (which on B is the downward normal), we get ∬_C F · dS + ∬_B F · dS = 0. So we have turned our integral over C into an integral over B, which is easier to compute! We parameterize B by
Φ(u, v) = (u, v, 0), (u, v) ∈ D = {(u, v) : u² + v² ≤ 1}.

We have

Φu × Φv = (0, 0, 1),

which is opposite to the orientation of B. Since

F(Φ(u, v)) · (Φu × Φv) = (u², −uv, 1) · (0, 0, 1) = 1,

we have

∬_C F · dS = −∬_B F · dS = ∬_Φ F · dS = ∬_D 1 du dv = Area(D) = π. △
Warning 8.2.7. In the divergence theorem, it is crucial that F is C 1 at every point of V . For
example, the divergence theorem fails for
F = r/∥r∥³,   V = {(x, y, z) : x² + y² + z² ≤ 1},
where r(x, y, z) = (x, y, z). Note that F is not defined at the origin, which is contained in
V . See [FRYe, Example 4.2.7] for details.
Exercises.
Exercises from [FRYd, §4.2] : Q1–Q34.
Theorem 8.3.3 (Green’s theorem). Suppose D is a bounded region in the xy-plane and that
the boundary, C, of D consists of a finite number of piecewise smooth, simple closed curves,
each oriented so that if you walk along in the direction of the orientation, then D is on your
left. Furthermore, suppose that F = (F1, F2) is a C¹ vector field on D. Then

∮_C F · ds = ∬_D (∂F2/∂x − ∂F1/∂y) dx dy.
Remarks 8.3.4. (a) Remember that ∮_C is just an alternate notation for ∫_C that is sometimes used when C is a closed curve.
Proof of Theorem 8.3.3. We will use the divergence theorem (Theorem 8.2.3) to prove Green's theorem. Define the solid

V = {(x, y, z) : (x, y) ∈ D, 0 ≤ z ≤ 1}

and the vector field

G(x, y, z) = (F2(x, y), −F1(x, y), 0).
To compute the integral over the side, we parameterize it. Suppose that

r(t) = (x(t), y(t)), a ≤ t ≤ b,

is a parameterization of C. Then

Φ(t, z) = (x(t), y(t), z), a ≤ t ≤ b, 0 ≤ z ≤ 1,

parameterizes the side, and

Φt × Φz = (x′(t), y′(t), 0) × (0, 0, 1) = (y′(t), −x′(t), 0).

Using the right-hand rule, we see that this gives the outward normal.
Therefore, since the top and bottom faces contribute zero (G has no k-component),

∬_{∂V} G · dS = ∬_{side} G · dS
             = ∫_a^b ∫_0^1 G(Φ(t, z)) · (Φt × Φz) dz dt
             = ∫_a^b ∫_0^1 (F2(x, y), −F1(x, y), 0) · (y′(t), −x′(t), 0) dz dt
             = ∫_a^b ∫_0^1 (F2(x, y)y′(t) + F1(x, y)x′(t)) dz dt
             = ∫_a^b (F2(x, y)y′(t) + F1(x, y)x′(t)) dt
             = ∫_a^b F · r′(t) dt
             = ∮_C F · ds.

Combining this with the divergence theorem gives

∮_C F · ds = ∬_{∂V} G · dS = ∭_V ∇ · G dV = ∬_D (∂F2/∂x − ∂F1/∂y) dx dy,

since ∇ · G = ∂F2/∂x − ∂F1/∂y and V has height 1. This proves Green's theorem.
Example 8.3.5. Let's verify Green's theorem for

F(x, y) = (−y, x)

and

D = {(x, y) : x² + y² ≤ 1},

whose boundary C we parameterize by r(t) = (cos t, sin t), 0 ≤ t ≤ 2π. Note that this has the correct (that is, counterclockwise) orientation. We have

∮_C F · ds = ∫_0^{2π} (−sin t, cos t) · (−sin t, cos t) dt = ∫_0^{2π} dt = 2π,

while

∬_D (∂F2/∂x − ∂F1/∂y) dx dy = ∬_D 2 dx dy = 2 Area(D) = 2π,

so the two sides of Green's theorem agree. △
Example 8.3.6. Let D be the triangle with vertices at (0, 0), (0, 1), and (1, 0). Let’s verify
Green’s theorem on D for
F(x, y) = (x, y).
We parameterize the boundary C of D by

r1(t) = (t, 0),   r2(t) = (0, t),   r3(t) = (t, 1 − t),   0 ≤ t ≤ 1.

Note that the pieces r2(t) and r3(t) are oriented in the wrong direction, so we will need to
add a negative sign in front of the corresponding terms. So the left-hand side of Green’s
theorem is
∮_C F · ds = ∫_{r1} F · ds − ∫_{r2} F · ds − ∫_{r3} F · ds
           = ∫_0^1 (t, 0) · (1, 0) dt − ∫_0^1 (0, t) · (0, 1) dt − ∫_0^1 (t, 1 − t) · (1, −1) dt
           = ∫_0^1 t dt − ∫_0^1 t dt − ∫_0^1 (t − 1 + t) dt
           = −[t² − t]_{t=0}^1 = 0.

On the other hand, ∂F2/∂x − ∂F1/∂y = 0 for F = (x, y), so the right-hand side of Green's theorem is also 0, as expected. △
Green’s theorem can be used to compute the areas of regions bounded by curves, as the
next example illustrates.
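The idea is that for F = (−y/2, x/2) we have ∂F2/∂x − ∂F1/∂y = 1, so Green's theorem gives Area(D) = (1/2) ∮_{∂D} (x dy − y dx). As an illustrative sketch (the ellipse and the values of a and b are our choices, not from the text), this recovers the area πab of an ellipse:

```python
import math

# Area via Green's theorem: (1/2) ∮ (x dy − y dx) around x = a cos t, y = b sin t.
a, b = 3.0, 2.0   # illustrative semi-axes
n = 10000
h = 2 * math.pi / n
area = 0.0
for i in range(n):
    t = (i + 0.5) * h
    x, y = a * math.cos(t), b * math.sin(t)
    dx, dy = -a * math.sin(t), b * math.cos(t)   # x′(t), y′(t)
    area += 0.5 * (x * dy - y * dx) * h
```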
Exercises.
Exercises from [FRYd, §4.3] : Q1–Q25.
we have

∂S = {(x, y, 1) : x² + y² = 1}. △
we have

∂S = {(x, y, z) : x² + y² = 1, z ∈ {0, 1}}. △
For the sphere

S = {(x, y, z) : x² + y² + z² = 1},

we have ∂S = ∅. △
We will be most interested in surfaces whose boundaries are finite unions of piecewise
smooth, simple, closed curves. We want a convention for orienting these boundaries. For
Green’s theorem (Theorem 8.3.3), we oriented the boundary such that, when you walk along
the curve following the orientation, the surface is on your left. We essentially want the same
convention for surfaces in R3 , except that now we need to specify where your feet and head
are as you walk!
Convention 8.4.4. We orient ∂S so that, if you walk along ∂S in the direction of its orientation, with the vector from your feet to your head having the same direction as the chosen normal n for S (giving its orientation), then the surface S is on your left.
Theorem 8.4.5 (Stokes’ theorem). Suppose that S is a bounded piecewise smooth oriented
surface whose boundary ∂S is a finite union of piecewise smooth, simple, closed curves, with
orientation chosen according to Convention 8.4.4. Furthermore suppose that F is a C 1 vector
field on S. Then

∮_{∂S} F · ds = ∬_S (∇ × F) · dS.
Proof. For a proof of Stokes’ theorem, see [FRYe, Th. 4.4.1]. The proof involves reducing to
Green’s theorem; see Remark 8.4.6.
Remark 8.4.6. Suppose that F has zero z-component, so that F = (F1 , F2 , 0), and that the
surface S lies in the xy-plane, so that we have
S = {(x, y, 0) : (x, y) ∈ D}
for some D ⊆ R2 . Then, using (5.3), we have
∬_S (∇ × F) · dS = ∬_S (∇ × F) · k dS = ∬_D (∂F2/∂x − ∂F1/∂y) dx dy.
Example. Let F(x, y, z) = (xy, xy, 0), and let S be the unit sphere. Since ∂S = ∅, the left-hand side of Stokes' theorem is zero. Let's verify that the right-hand side is also zero. We parameterize S, which is the unit sphere, using spherical coordinates:

Φ(θ, φ) = (sin φ cos θ, sin φ sin θ, cos φ), θ ∈ [0, 2π], φ ∈ [0, π].
As we computed in Example 7.7.1, we have
Φθ × Φφ = (−cos θ sin² φ, −sin θ sin² φ, −sin φ cos φ).
We have

curl F = (∂/∂y (0) − ∂/∂z (xy), ∂/∂z (xy) − ∂/∂x (0), ∂/∂x (xy) − ∂/∂y (xy)) = (0, 0, y − x),
and so

curl F(Φ(θ, φ)) = (0, 0, sin θ sin φ − cos θ sin φ).

Therefore

∬_S curl F · dS = ∫_0^{2π} ∫_0^π curl F(Φ(θ, φ)) · (Φθ × Φφ) dφ dθ
               = ∫_0^{2π} ∫_0^π sin² φ cos φ (cos θ − sin θ) dφ dθ
               = (∫_0^{2π} (cos θ − sin θ) dθ)(∫_0^π sin² φ cos φ dφ) = 0,

as expected. △
To compute the left-hand side, we parameterize the two pieces of the boundary of the cylin-
der:
as expected. △
We conclude with some connections between Stokes’ theorem and other concepts we have
seen in this course. Suppose S1 and S2 are two oriented surfaces having the same boundary
curve C with orientation chosen according to Convention 8.4.4. (Note that we are assuming
that Convention 8.4.4 applied to both S1 and S2 yields the same orientation on C.) Then,
by Stokes’ theorem
∬_{S1} (∇ × F) · dS = ∮_C F · ds = ∬_{S2} (∇ × F) · dS.
C = {(x, y, 0) : x² + y² = 1},
S1 = {(x, y, z) : x² + y² ≤ 1, z = 0} and
S2 = {(x, y, z) : x² + y² + z² = 1, z ≥ 0},
V = {(x, y, z) : x² + y² + z² ≤ 1, z ≥ 0},   ∂V = S1 ∪ S2,
where, in the last equality, we used the vector identity ∇ · (∇ × F) = 0 from Proposi-
tion 8.1.6(a).
Stokes’ theorem implies that C F · ds = 0 for all closed curves C. Thus, by Theorem 6.5.1,
F is conservative. This gives an alternative proof of Theorem 6.5.2.
Exercises.
Exercises from [FRYd, §4.4] : Q1–Q28.
Bibliography
[FRYd] J. Feldman, A. Rechnitzer, and E. Yeager. CLP-4 Vector Calculus problem book.
URL: https://secure.math.ubc.ca/~CLP/CLP4/.
[FRYe] J. Feldman, A. Rechnitzer, and E. Yeager. CLP-4 Vector Calculus textbook. URL:
https://secure.math.ubc.ca/~CLP/CLP4/.
[MT11] J. E. Marsden and A. Tromba. Vector Calculus. WH Freeman, 6th edition, 2011.