
February 18, 2019

LECTURE 7: DIRECTIONAL DERIVATIVES.

110.211 HONORS MULTIVARIABLE CALCULUS


PROFESSOR RICHARD BROWN

Synopsis. Today, we move into directional derivatives, a generalization of a partial derivative where we look for how a function is changing at a point in any single direction in the domain. This gives a powerful tool, both conceptually and technically, to discuss the role the derivative of a function plays in exposing the properties both of functions on, and of sets within, Euclidean space. We define the gradient of a real-valued function (finally), discuss its interpretations and usefulness, and move toward one of the most powerful theorems of multivariable calculus, the Implicit Function Theorem.

The Directional Derivative.

7.0.1. Vector form of a partial derivative. Recall the definition of a partial derivative evaluated at a point: Let f : X ⊂ R² → R, X open, and (a, b) ∈ X. Then the partial derivative of f with respect to the first coordinate x, evaluated at (a, b), is

∂f/∂x (a, b) = lim_{h→0} [f(a + h, b) − f(a, b)] / h.

Here, we vary only the first coordinate, leaving the y-coordinate value b fixed, and write (a + h, b) = (a, b) + (h, 0). In vector notation, this is like taking the vector a = (a, b) and adding to it a small amount h, but only in the x-direction. Indeed, this means adding to a the vector h(1, 0) = hi, where here we use the standard convention for unit vectors in R² and R³, namely i = e_1, j = e_2, etc. We get

a + hi = (a, b) + h(1, 0) = (a, b) + (h, 0) = (a + h, b).

Then the definition of a partial derivative becomes

∂f/∂x (a) = lim_{h→0} [f(a + hi) − f(a)] / h.

Figure 7.1. A directional derivative in the x-direction is the partial.
However, one can take a derivative of f at a point (a, b), or the point a = (a, b), in any direction in the domain: Let v ∈ R². Then

lim_{h→0} [f(a + hv) − f(a)] / h

is perfectly well defined as long as the quantity a + hv remains in X, which, since X is open, will be the case for small enough h. This is the derivative of f at (a, b) in the direction of v, also known as the directional derivative of f at (a, b) with respect to v:

D_v f(a) = lim_{h→0} [f(a + hv) − f(a)] / h.
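As a quick numerical sanity check, here is a minimal Python sketch of this limit, using the illustrative choices f(x, y) = x² + 3xy, a = (1, 2), and v = i (so that D_v f(a) is just the partial f_x(a)):

    import numpy as np

    # Illustrative choices only: f(x, y) = x^2 + 3xy, a = (1, 2), v = i.
    def f(x, y):
        return x**2 + 3*x*y

    a = np.array([1.0, 2.0])        # the base point a = (a, b)
    v = np.array([1.0, 0.0])        # v = i, so D_v f(a) is the partial df/dx at a

    # Difference quotients (f(a + h v) - f(a)) / h for shrinking h.
    for h in [1e-1, 1e-3, 1e-5]:
        print(h, (f(*(a + h*v)) - f(*a)) / h)   # tends to f_x(1, 2) = 2(1) + 3(2) = 8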
How does this work? For f differentiable at a, compose f with the affine function g : R → R², where

g(t) = a + tv = (a, b) + t(v_1, v_2).

Here, g parameterizes a line in R² where at t = 0, g(0) = a, and at t = 1, g(1) = a + v. g is also C¹, and g′(t) = v. In particular, g′(0) = v.

Figure 7.2. A directional derivative in the direction of v.

Now let F(t) = f(g(t)) = (f ∘ g)(t) = f(a + tv), as in our definition of the directional derivative. Here F, as the composition of two differentiable functions, will also be differentiable, and

F′(0) = lim_{t→0} [F(t) − F(0)] / (t − 0) = lim_{t→0} [f(a + tv) − f(a)] / t.

But, using the Chain Rule, we can write

F′(0) = D_v f(a) = d/dt [f(a + tv)]|_{t=0} = Df(g(0)) g′(0) = Df(a) v.
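To see the Chain Rule computation in coordinates, here is a minimal Python sketch with the same kind of illustrative choices (f(x, y) = x² + 3xy, a = (1, 2), and an arbitrary direction v); a difference quotient for F′(0) matches the matrix product Df(a)v:

    import numpy as np

    # Illustrative f : R^2 -> R with derivative matrix Df(x, y) = [f_x, f_y].
    def f(x, y):
        return x**2 + 3*x*y

    def Df(x, y):
        return np.array([2*x + 3*y, 3*x])   # the 1x2 derivative matrix, stored as a row

    a = np.array([1.0, 2.0])
    v = np.array([0.6, 0.8])

    # F(t) = (f o g)(t), where g(t) = a + t v is the line through a in the direction v.
    def F(t):
        return f(*(a + t*v))

    t = 1e-6
    print((F(t) - F(0)) / t)   # difference quotient for F'(0)
    print(Df(*a) @ v)          # Chain Rule: Df(g(0)) g'(0) = Df(a) v = 7.2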

Definition 7.1. Let f : X ⊂ Rⁿ → R be C¹. Then the gradient function of f is the function

∇f : X ⊂ Rⁿ → Rⁿ, ∇f(x) = (f_{x_1}(x), f_{x_2}(x), …, f_{x_n}(x))ᵀ,

written as a column vector. The gradient vector of f at a ∈ X is a vector in Rⁿ based at a:

∇f(a) = (f_{x_1}(a), f_{x_2}(a), …, f_{x_n}(a))ᵀ.

Notes:
• The gradient function carries the same information as the derivative matrix of f, but is a vector of functions, so that Df(x) = (∇f(x))ᵀ, where ᵀ denotes the transpose.
• The gradient is only defined for scalar-valued functions.

Using this gradient function, we can write

D_v f(a) = Df(a) v = ∇f(a) · v,

where the middle expression is matrix multiplication and the last is the dot product.
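A minimal sketch of this identity, again with an illustrative f(x, y) = x² + 3xy and a unit vector v: the 1 × 2 derivative matrix acting on v and the gradient dotted with v produce the same number.

    import numpy as np

    # Illustrative f(x, y) = x^2 + 3xy, with gradient (f_x, f_y) = (2x + 3y, 3x).
    def grad_f(x, y):
        return np.array([2*x + 3*y, 3*x])

    a = np.array([1.0, 2.0])
    v = np.array([0.6, 0.8])              # a unit vector: ||v|| = 1

    Df_a = grad_f(*a).reshape(1, 2)       # derivative matrix Df(a) = (grad f(a))^T, a 1x2 row
    print(Df_a @ v)                       # matrix multiplication: [7.2]
    print(grad_f(*a) @ v)                 # dot product grad f(a) . v: 7.2, the same number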

Warning! The choice of v is really a choice of direction only! Thus, it is vitally important
that ||v|| = 1 for this choice.
Exercise 1. Show that for k ∈ R and w = kv, we have D_w f(a) = k D_v f(a).

The directional derivative specifies how f is changing in the direction of v. But what does this mean? Imagine standing in X ⊂ Rⁿ at a point a where a real-valued f is defined and differentiable. How is f changing in the particular direction that you are facing at the moment? For any v ∈ Rⁿ, where ||v|| = 1, D_v f(a) = ∇f(a) · v. So recall that

x · y = ||x|| ||y|| cos θ,

where θ is the angle between x and y. Remember that, for any n > 1, any two non-collinear (what does this mean?) vectors in Rⁿ span a plane. Within that plane, there is a well-defined angle between them. So

D_v f(a) = ∇f(a) · v = ||∇f(a)|| cos θ,

since ||v|| = 1. But notice then that

−||∇f(a)|| ≤ D_v f(a) ≤ ||∇f(a)||.

Thus the directional derivative of f at a will achieve its maximum when θ = 0, and its minimum when θ = π. And, of course, the directional derivative will be 0 precisely when θ = ±π/2. All of this comes from the dot product of the gradient vector and the chosen unit-length direction vector v.
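A brief numerical illustration, with the same illustrative f(x, y) = x² + 3xy and a = (1, 2): scanning over all unit directions v(θ) = (cos θ, sin θ), the largest value of D_v f(a) is ||∇f(a)||, attained when v points along ∇f(a).

    import numpy as np

    # Illustrative gradient: for f(x, y) = x^2 + 3xy, grad f(x, y) = (2x + 3y, 3x).
    def grad_f(x, y):
        return np.array([2*x + 3*y, 3*x])

    a = np.array([1.0, 2.0])
    g = grad_f(*a)                                     # grad f(a) = (8, 3)

    thetas = np.linspace(0.0, 2*np.pi, 100001)
    vals = g[0]*np.cos(thetas) + g[1]*np.sin(thetas)   # D_v f(a) = grad f(a) . v(theta)

    print(vals.max(), np.linalg.norm(g))               # the maximum is (approximately) ||grad f(a)||
    best = thetas[vals.argmax()]
    print(np.cos(best), np.sin(best), g / np.linalg.norm(g))   # the maximizer points along grad f(a)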
Geometrically, what does this mean? Here is a beautiful and important interpretation:
Theorem 7.2. Let X ⊂ Rⁿ be open and f : X → R a C¹-function. For x_0 ∈ X, let

S_{x_0} = { x ∈ X | f(x) = f(x_0) = c }.

Then ∇f(x_0) ⊥ S_{x_0}.

Another way to say this is that any vector v tangent to S_{x_0} will be perpendicular to ∇f(x_0) (see Figure 7.3). The proof of this is constructive and very informative.

Figure 7.3. Geometrically, the gradient vector is always perpendicular to the level sets of a function.

Proof. Let I = (a, b), an open interval in R, and c : I → Rⁿ be a C¹-parameterized curve, with c(t) = (x_1(t), …, x_n(t)), such that

(1) c(t_0) = x_0, for some t_0 ∈ I, and
(2) c(I) ⊂ S_{x_0}.
Then the composition (f ∘ c) : I → R is C¹ on I since both f and the curve are, and

f(c(t)) = f(x_1(t), …, x_n(t)) = c.

Differentiating this last equation implicitly with respect to t, we get

d/dt [f(x_1(t), …, x_n(t))] = d/dt [c] = 0, so that Df(c(t)) c′(t) = 0.

Now, at t = t_0, c(t_0) = x_0, and Df(x_0) v = 0, where v = c′(t_0), a vector tangent to the curve and hence tangent to S_{x_0}. Hence the result follows. □
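The computation in the proof can be checked directly on an illustrative example: take f(x, y) = x² + y², whose level set through x_0 = (3, 4) is the circle of radius 5, parameterize that circle, and verify that ∇f(x_0) · c′(t_0) = 0.

    import numpy as np

    # Illustrative level set: f(x, y) = x^2 + y^2, with the circle of radius 5 as S_{x_0}.
    def grad_f(x, y):
        return np.array([2*x, 2*y])                       # grad f(x, y)

    def c(t):
        return 5.0*np.array([np.cos(t), np.sin(t)])       # C^1 curve lying in the level set

    def c_prime(t):
        return 5.0*np.array([-np.sin(t), np.cos(t)])      # its tangent vector c'(t)

    t0 = np.arctan2(4.0, 3.0)                             # chosen so that c(t0) = (3, 4) = x_0
    x0 = c(t0)
    print(grad_f(*x0) @ c_prime(t0))                      # 0 (up to rounding): grad f(x_0) is perpendicular to the level set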
Definition 7.3. For any (n − 1)-dimensional hypersurface in Rⁿ defined as the c-level set of a C¹ function f : X ⊂ Rⁿ → R,

S = { x ∈ X | f(x) = c },

the tangent space to S at a ∈ S is the space of all vectors perpendicular to ∇f(a); it is defined by h(x) = ∇f(a) · (x − a) = 0, or

h(x) = Σ_{i=1}^{n} ∂f/∂x_i (a) (x_i − a_i) = 0.

Note: Compare this to the tangent space of graph(f) ⊂ R³, where f : R² → R and graph(f) is defined by the equation in R³ given by z = f(x, y).
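For instance, with the illustrative choice f(x, y, z) = x² + y² + z² and the point a = (1, 2, 3) on the level set f = 14, we get ∇f(a) = (2, 4, 6), so the tangent space at a is h(x) = 2(x − 1) + 4(y − 2) + 6(z − 3) = 0. A short sketch confirming this:

    import numpy as np

    # Illustrative hypersurface: the 14-level set of f(x) = x . x in R^3 (a sphere).
    def f(x):
        return x @ x

    a = np.array([1.0, 2.0, 3.0])          # a point on S, since f(a) = 14
    grad_a = 2*a                            # grad f(a) = (2, 4, 6)

    def h(x):
        return grad_a @ (x - a)             # h(x) = grad f(a) . (x - a)

    # Two independent vectors perpendicular to grad f(a) span the tangent plane at a.
    u = np.array([2.0, -1.0, 0.0])          # grad_a . u = 0
    w = np.array([3.0, 0.0, -1.0])          # grad_a . w = 0
    print(h(a + 0.5*u + 0.25*w))            # 0: this point lies in the tangent space at a
    print(f(a + 0.01*u) - 14.0)             # ~0: the sphere agrees with its tangent plane to first order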
