An Introduction To The Mathematical Foundations of The Finite Element Method

An Introduction to the Mathematical
Foundations of the Finite Element Method
Vitoriano Ruas
October 23, 2002

ii
Aos meus bons amigos da

Universidade Federal Fluminense
Contents
Preamble 1
1 Motivation 3
2 Some Concepts of Functional Analysis 11

2.1 Normed Functions Spaces . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Closed Sets in Normed Spaces . . . . . . . . . . . . . . . . . . . . 18
2.4 Complete Normed Spaces . . . . . . . . . . . . . . . . . . . . . . 23
2.5 Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Bounded Linear Functionals . . . . . . . . . . . . . . . . . . . . . 31
3 The Finite Element Method in One Dimension Space 37

3.1 An Abstract Linear Variational Problem . . . . . . . . . . . . . . 37
3.2 The Ritz Approximation Method . . . . . . . . . . . . . . . . . . 42
3.3 Piecewise Linear Finite Element Approximations . . . . . . . . . . 45
3.4 More General Cases and Higher Order Methods . . . . . . . . . . 52
4 Extension to Multi-dimensional Problems 59

m
4.1 Sobolev Spaces H (Ω) . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2 Green’s Formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3 Model Variational Problems in N-dimensional Spaces . . . . . . . 64
4.4 Lagrange Finite Element Solution Methods . . . . . . . . . . . . . 67
References 73
iv Contents
Preamble
The finite element method is nowadays a widespread tool to solve Engineering

problems governed by differential equations of various types. Since it was intro-
duced its mathematical foundations have been generally acknowledged to deserve
special attention for the best use of this class of numerical methods.
In this course the essential aspects of the mathematical theory of finite element
methods will be addressed, more particularly in the framework of boundary value
problems for ordinary differential equations. For this propose some useful topics
of Functional Analysis will be recalled, and a number of classical function spaces
will be handled.
A first goal of the course is motivating the setting of boundary value problems
for differential equations in variational form, followed by a finite element dis-
cretization. The sequel is aimed at supplying the basic convergence results for
finite element methods in one space dimension. A final objective is understan-
ding in main lines, how the material previously presented in the course may be
extended to problems posed in two or three space dimensions.
2 Contents
1
Motivation
As it is well-known, the finite element method is employed in most cases, to

solve problems of Physics and of Continuum Mechanics governed by boundary
value differential equations. In order to illustrate how important it is to know the
foundations of this method from the mathematical point of view, we shall begin
this course with the study of a problem of the above mentioned type. Although this
is rather simple, it will allow us to introduce some concepts of much wider scope.
For instance, by just considering such model problem, we will attempt to show the
relationship between different possible formulations under which one may write
the differential equation, namely, this form itself, the variational form and the
underlying minimization problem in some cases. Moreover we will endeavour to
give insight on the criteria that lead to the best choice of the type of finite element
method, in order to face a given situation encountered in practice.
Let us then consider the elongational deformations of a string Ω of length L,
having one of its ends fixed, under the action of a distribution of forces acting
along its length, represented by a function f (x), 0 ≤ x ≤ L. We denote by u(x)
the resulting length variation of the string at a point x, or equivalently, the dis-
placement undergone in the longitudinal direction of Ω, at this point.
4 1. Motivation
The physical problem just described can be set in the form of a boundary
value ordinary differential equation for the unknown function u(x), known as the
Sturm-Liouville problem 1 , namely,

 0 0
 −(pu ) + qu = f in (0, L)

u(0) = 0 (fixed left end) (P1 )


 u0 (L) = 0 (free right end)
du
In (P1 ) u0 denotes and p and q are functions that represent physical charac-
dx
teristics of the material which the string is made of. We shall assume that p and
q satisfy respectively for certain constants α, β, A and B :
A ≥ p(x) ≥ α > 0 and B ≥ q(x) ≥ β ≥ 0, ∀x ∈ [0, L]
Moreover temporarily we shall further assume that p is continuous in (0, L)

and also differentiable there, except eventually, at a certain number of points,
and that p0 is also bounded, that is, there exists C > 0 such that |p0 (x)| ≤ C at
every point x ∈ [0, L] where p0 (x) is well defined.
Now without caring about regularity assumptions, neither on the data p, q and
f, nor on the solution u of (P1 ), we shall rewrite this problem in variational form.
We only assume that all the operations performed for this purpose are feasible
and well defined, leaving for later the discussion about the conditions under which
it is really so.
Let us then multiply both sides of the differential equation with a ”test func-
tion” v, and then integrate over the interval (0, L). Using integration by parts we
obtain: Z L Z L Z L
0 L 0 0
−pu v|0 + pu v dx + quvdx = f vdx
0 0 0
1 Classically the Sturm-Liouville problem stems from the partial differential equation that describes the free
or the forced vibrations of a string. If u(x, t) represents the transversal displacement of a particle of the string Ω
located at a point x at time t, this equation (with suitable initial a boundary conditions) writes ρutt −(rux )x = F
in Ω×]0, ∞), F representing given forces and ρ and r being physical characteristics of the string depending only on
x. For instance, if F is periodic in time, say, F (x, t) = eiwt f (x), then one may legitimately set u(x, t) = eiwt u(x),
thereby obtaining the differential equation (P1 ) for u(x) with p = r and q = −w2 ρ. However in this case q is
negative, which makes this problem for u(x) fundamentally different from the one to be considered here.
1. Motivation 5
for every (test) function v. The expression ”test function” means that v is sup-
posed to behave somehow like the solution u, in the sense that u itself could be
one of such functions. We shall simply say that v sweeps a function space V , some
properties of which being specified below in order to simplify things.
First of all we note that since u0 (L) = 0, if we require that v vanish at the
origin like u itself, then our variational problem writes
Z L Z L Z L
0 0
pu v dx + quvdx = f vdx, ∀v ∈ V (P2 )
0 0 0
where u belongs also to V . Since all the integrals above must carry a meaning, as
we will see later on, it is necessary to require that f together with every function
in V and its first order derivative are square integrable in the sense of Lebesgue
in the interval (0, L). This implies that we are actually defining V to be:
±
V = {v v, v 0 ∈ L2 (0, L), v(0) = 0 }
where L2 (0, L) = {f /f 2 has a finite (Lebesgue) integral in (0, L) }
Remark 1.1 As proven later, every v such that v 0 ∈ L2 (0, L) is continuous.

Therefore there is no contradiction in requiring that v(0) = 0.
Now what have we gained by writing the Sturm-Liouville problem (P1 ) in the
variational form (P2 )?
First of all we note that the highest derivative order that appears in the latter is
one, while in the former it is two. Thus for instance, if we choose to approximate
u by a function belonging to the class of piecewise polynomials like the finite
element method does, it is readily seen that continuous functions may be used for
(P2 ) whereas continuously differentiable ones would be required for (P1 ).
Another advantage of formulation (P2 ) over (P1 ) is the fact that the boundary
condition u0 (L) = 0 may be disregarded in the former, for it is implicitly satisfied.
Indeed if we still assume that all the operations performed below are legitimate,
we have from (P2 ).
Z L Z L Z L
0 L 0 0
pu v|0 − (pu ) vdx + quvdx = f vdx (1.1)
0 0 0
6 1. Motivation
for every v ∈ V . Let us choose v ∈ V such that v(L) = 0, but otherwise arbitrary.
In so doing we get
Z L
[−(pu0 )0 + qu − f ] vdx, ∀v ∈ V such that v(L) = 0
0
Here the reader should admit that the class os such functions v is wide enough
for the above relation to imply that
−(pu0 )0 + qu = f almost everywhere in (0, L)
From this result (1.1) simply becomes
p(L)u0 (L)v(L) − p(0)u0 (0)v(0) = 0, ∀v ∈ V
Since v(0) = 0, choosing now v ∈ V such that v(L) = 1 we immediately infer that
u0 (L) = 0 as required, as p is strictly positive by assumption.
Another clear advantage of (P2 ) over (P1 ) is related to the regularity of the
datum p. Even assuming that the material of the string is homogeneous, the
function p varies with its cross section. Eventually the latter could change of
shape abruptly in such a way that p is a discontinuous function. While on the one
hand this does not bring about any difficulty as far as problem (P2 ) is concerned,
much care is needed in handling the term (pu0 )0 of equation (P1 ) in this case.
Finally a more tricky but fundamental point: formulation (P2 ) is more general
than equation (P1 ) from the physical point of view. In order to clarify this as-
sertion let us consider a distribution of forces fε having a resultant P , uniformly
acting on a very small length ε around the abscissa x = L2 . In current applications
such system of forces is represented by P δ L , namely, a single force of modulus
2
equal to P applied at the point given by x = L2 .2 Of course if we stick to the first

definition of fε , both (P1 ) and (P2 ) carry a meaning. Actually denoting by uε the
2 In
old days many scientists used to call this distribution of forces the “Dirac function” δ L multiplied by P ,
2
Z L
L L
where δ L would be such that δ L (x) = 0, ∀x ∈ [0, L], x 6= 2 , δ L ( 2 ) = +∞ and “ δ L dx” = 1.
2 2 2 0 2
1. Motivation 7
corresponding solution, the latter writes:

Z L Z L Z L
+ 2ε
2 P
pu0ε v 0 dx + quε vdx = vdx, ∀v ∈ V
0 0 L
2
− 2ε ε
Now since v is a continuous function, the Mean Value Theorem for integrals
implies that the right hand side of the above equation tends to P v( L2 ) as ε goes
to zero. Otherwise stated, the variational problem (P2 ) is perfectly well defined
in the case of the single force applied to a particle located at point x = L2 , as
Z L Z L
0 0 L
pu v dx + quvdx = P v( ), ∀v ∈ V, u itself in V
0 0 2
On the other hand the equation (P1 ) in this case could only be based on an
equality of the type −(pu0 )0 + qu = P δ L , whose exact meaning lies beyond the
2
scope of ordinary functions. In other words this form is not suitable for being
exploited numerically.
Since forces applied pointwise may act everywhere and may also be combined,
we conclude that there are at least infinitely many system of forces that are
admissible for the model (P2 ) of the elongational deformation of a string, although
they do not correspond to a differential equation of the form (P1 ), at least as far
as functions defined in accordance with the classical concept are concerned.
Summarizing we have exhibited at least four advantages of formulation (P2 )
over (P1 ), in spite of the fact that a rather simple linear ordinary differential
equation was used as a model for this comparative analysis.
Let us conclude this chapter with a few considerations about the relationship
between smoothness of the data and the choice of the space V in which the
solution is to be searched for.
To begin with we assume again that p is continuous, and differentiable except
eventually at a certain number of points of the interval (0, L) and that |p0 | is
bounded by a constant C wherever it is defined.
Our goal is to define the largest possible space where the solution u of (P1 )
belongs to, under the above assumptions on p and q, by assuming that f belongs
8 1. Motivation
to L2 (0, L). Of course u should be at least a continuous function in [0, L], otherwise
u0 would carry no meaning in the term −(pu0 )0 of the differential equation that
holds in (0, L), and both u(0) and u0 (L) might not be defined.
This implies that u ∈ L2 (0, L) and moreover since
Z L Z L
2 2
(qu) dx ≤ B u2 dx
0 0
we have qu ∈ L2 (0, L) as well. This implies in turn that −(pu0 )0 ∈ L2 (0, L), as
the difference between f and qu.
Actually (pu0 )0 is also (Lebesgue) integrable in (0, L) due to the fact that
whenever f and g are functions in L2 (0, L) then f g is integrable in (0, L). In-
Z L Z L
2 2
deed, since f ± g ∈ L (0, L) and (f ± g) dx ≥ 0, we have ± f gdx ≤
·Z L Z L ¸ 0 0
1 2 2
f dx + g dx 3 .
2 0 0
The primitive of the integrable function (pu0 )0 must be continuous, and hence,
since p is continuous in [0, L] so must be u0 . This implies that u0 ∈ L2 (0, L) too,
as does u00 according to the following argument:
(pu0 )0 − p0 u0
Whenever u00 is defined it must be equal to . Since the following
p
bound holds:
Z L · ¸2 ½Z L Z L ¾
(pu0 )0 − p0 u0 2 0 0 2 2 0 2
dx ≤ 2 [(pu ) ] dx + B |u | dx ,
0 p α 0 0
the function u00 must be defined as an element of L2 (0, L).

As a conclusion, whenever f is square integrable in (0, L) the solution u together
with its two first derivatives u0 and u00 are also functions in L2 (0, L), provided p, q
and p0 satisfy our given properties. This leads to the assertion that the right space
in which the solution of (P1 ) should be searched for is in this case
© ± ª
S = v v, v 0 , v 00 ∈ L2 (0, L), v(0) = 0 and v 0 (L) = 0
3 Now take f = (pu0 )0 and g ≡ 1.

1. Motivation 9
As a matter of fact S is a subspace of the classical Sobolev space H2 (0, L)

defined as follows:
© ± ª
H2 (0, L) = v v, v 0 , v 00 ∈ L2 (0, L)
The additional conditions on v(0) and v 0 (L) in the definition of S are applicable
to functions in H2 (0, L), since those are necessarily continuously differentiable in
[0, L] as are can easily check.
Now the above argument leading to such space S can be extended to the case
where p0 is not defined in [0, L]. The conclusion is still that (pu0 )0 ∈ L2 (0, L), which
means in particular that pu0 is a continuous function. However in this case u0 will
not be continuous in general, though still a function in L2 (0, L). This assertion
can be easily verified by the reader as an exercise. Then the natural solution space
for problem (P1 ) becomes
© ± ª
S = v v, v 0 , (pv 0 )0 ∈ L2 (0, L), v(0) = (pv 0 )(L) = 0
Notice that (pv 0 )0 cannot be viewed as the sum of pv 00 and p0 v 0 for every v ∈ S,
since neither v 00 nor p0 are defined in [0, L] in general.
Actually when the string cross section undergoes an abrupt change at some
point, the displacement’s derivative must also have an abrupt change at this
point, in such a way that the product pu0 remains continuous there.
As a final comparison between problems (P1 ) and (P2 ), we recall that in the
case of the latter the space in which we must search for the solution is simply
the subspace V of the Sobolev space H1 (0, L) consisting of those functions v for
which v(0) = 0, where H1 (0, L) = {v /v, v 0 ∈ L2 (0, L) }.
Of course whenever the right hand side of the variational equality of (P2 ) is
Z L
expressed in the form f vdx where f ∈ L2 (0, L), its solution u lies indeed in
0
S. However we gain nothing by imposing that u belong to S beforehand, simply
because as seen before, some conditions in the definition of this space are just
too stringent for practical purposes. Moreover if we do so we rule out solutions
to (P2 ) that do not really belong to S. This is the case whenever the right side
10 1. Motivation
corresponds to forces that are not functions in L2 (0, L) or not even functions at
all. In order to illustrate this assertion, assume that q = 0, p = 1 and the right
hand side in the variational equality is given by P v( L2 ). By a direct calculation
we can check that the function u given by
 · ¸

 L
 P x, for x ∈ 0,
u(x) = µ 2 ¶

 L L
 P for x ∈ ,L
2 2
satisfies ∀v ∈ V :
Z L Z L Z L
2
0 0
pu v dx + quvdx = P v 0 dx
0 0 0
= P v( L2 ) − P v(0)
= P v( L2 ), ∀v ∈ V
It is hence a solution of (P2 ) but belongs by no means to S since u00 is not defined.
In the sequel we will give additional arguments and results in favor of the choice
of V as an optimal solution space relying upon strictly rigorous mathematical
analyses.
2
Some Concepts of Functional Analysis
2.1 Normed Functions Spaces
Before going into a rigorous mathematical analysis of problem (P2 ) and its
relation with equation (P1 ), it is useful to recall some basic aspects of Functional
Analysis as applied to normed vector spaces.
First of all let V be a real vector space with null vector ~0V . V is said to be a
normed space if it is equipped with a mapping k·k : V → R having the following
properties:
(i). kvk ≥ 0, ∀v ∈ V and kvk = 0 if and only if v = ~0V ;
(ii). kαvk = |α| kvk , ∀α ∈ R,∀v ∈ V ;
(iii) kv1 + v2 k ≤ kv1 k + kv2 k , ∀v1 , v2 ∈ V ;
We will mostly be dealing with function vector space. Let us then define as
examples, the following norms for the space C 0 [0, L] of continuous functions in
12 2. Some Concepts of Functional Analysis
the closed interval [0, L]:

def
kvk0,∞ = max |v(x)| (Maximum norm or L∞ -norm)
x∈[0,L]
Z L
def
kvk0,1 = |v(x)| dx (L1 -norm)
0
·Z L ¸ 12
def
kvk0,2 = |v(x)|2 dx (L2 -norm)
0
It is well-known that all the above three expressions fulfill the conditions (i) −
(ii) − (iii). Actually only property (i) for the norms k·k0,1 and k·k0,2 cannot
be established without fully exploiting the continuity of functions in C 0 [0, L].
This is because such expressions do not really define norms for spaces containing
discontinuous functions such as L2 (0, L). Indeed, if f ∈ L2 (0, L) and kf k0,2 = 0,
then f is a function that vanishes almost everywhere in (0, L), i.e., at every point
in this interval but eventually at points that form a set whose measure is equal
to zero - in short, a nullset (for instance, a countable set of points). But since the
norm k·k0,2 is quite natural and handy for square integrable functions, in order
to overcome this difficulty, instead of L2 (0, L) itself one usually works with the
quotient space
def
L2 (0, L) = L2 (0, L)/N (0, L),
where N (0, L) is the space of those functions that vanish almost everywhere in
(0, L). This is because L2 (0, L) is indeed a normed space when equipped with the
norm k·k0,2 (applied to any function representing a given class belonging to this
quotient space), for
kf k0,2 = 0 ⇒ f ∈ N (0, L), or yet [f ] = [O]
where O denotes the null function and [f ] the class of f in the quotient space.
Nevertheless, as most authors do, we shall identify L2 (0, L) and L2 (0, L) in
this text, by abusively regarding the latter as a space of functions not to be
distinguished from eachother, as long as they are identical almost everywhere in
(0, L).
2.2 Inner Product Spaces 13
Finally we recall that an application |·| : V → R that fulfills (ii) and (iii) and
such that |v| ≥ 0, ∀v ∈ V is called a semi-norm of V .
2.2 Inner Product Spaces
We shall be particularly concerned about a class of normed spaces whose norm

is defined through a given inner product. We recall that a real vector space V is
said to be equipped with an inner product (·|·), if this is a mapping from V × V
onto R that satisfies the following properties:
(I). (v|v) ≥ 0, ∀v ∈ V and (v|v) = 0 if and only if v = ~0V ;
(II). (v|u) = (u|v), ∀u, v ∈ V ;
(III). (α1 u1 + α2 u2 |v) = α1 (u1 |v) + α2 (u2 |v), ∀α1 , α2 ∈ R and ∀u1 , u2 , v ∈ V .
If can be easily checked that to a given inner product (·|·) corresponds a norm
defined by:
def 1
kvk = (v|v) 2 , ∀v ∈ V
For example the norm k·k0,2 of C 0 [0, L] is associated with the following inner
product: Z L
def
(u|v)0 = u(x)v(x)dx
0
The fact that the above expression defines indeed an inner product can be easily
checked and is left as an exercise.
Notice that the expression (·|·)0 is not an inner product for L2 (0, L) since pro-
perty (i) does not hold in this case. However like in the case of the norm k·k0,1
and k·k0,2 , we may replace L2 (0, L) with L2 (0, L) over which (·|·)0 becomes indeed
an inner product (provided we regard this space as a function space in the sense
already stated).
In this respect we recall that whenever two functions f and g belong to L2 (0, L)
then their product belongs to the space L1 (0, L) of (Lebesgue) integrable functions
in (0, L). This result could actually have been established by virtue of the Cauchy-
Schwartz inequality applying to any inner product (·|·) defined over a real vector
space V , namely
def 1
|(u|v)| ≤ kuk kvk , ∀u, v ∈ V, where kvk = (v|v) 2
Indeed, owing to properties (i),(ii) and (iii) we have ∀u, v ∈ V and ∀α ∈ R:
(u|u) + 2α(u|v) + α2 (v|v) ≥ 0
Since the quadratic function of α on the left hand side of the above inequality
can only be non-negative if
|2(u|v)|2 − 4(u|u)(v|v) ≤ 0,
the result follows.

Now if suffices to take V = L2 (0, L) and inner product (·|·)0 to derive
Z L ·Z L ¸ 12 ·Z L ¸ 12
± f (x)g(x)dx ≤ f 2 (x)dx g 2 (x)dx , ∀f, g ∈ L2 (0, L).
0 0 0
Notice that not every norm is associated with a given inner product. In fact
this is true if and only if the norm of V , say k·k, obeys the Parallelogram Law,
namely
¡ ¢
ku + vk2 + ku − vk2 = 2 kuk2 + kvk2 , ∀u, v ∈ V.
In this case the associated inner product is given by
1¡ ¢
(u|v) = ku + vk2 − ku − vk2
4
As the reader may check, the Parallelogram’s Law for a given inner product can
be established in a straightforward manner. On the other hand, proving that the
above expression defines an inner product whenever the given norm obeys this
law is otherwise complex.
Let us just stress here that to some very useful norms correspond no inner
product, such as the maximum norm. Indeed let us take
x (L − x)
u(x) = and v(x) = ,
L L
both belonging to C 0 [0, L].

We have kuk0,∞ = kvk0,∞ = 1. Moreover ku + vk0,∞ = 1 and ku − vk0,∞ = 1.
Therefore the Parallelogram Law is violated.
As we should also point out, some norms or inner product are not well suited
to certain spaces, as the corresponding mapping may be unbounded or undefined.
For instance the expression (f |g)0 is not defined for all the functions f and g in
1
L1 (0, L) (take e.g. f (x) = g(x) = x− 2 ) and the maximum norm cannot be a norm
1
for L2 (0, L) (take for instance f (x) = x− 4 )
Let us now give an application related to the Sturm-Liouville problem:
We saw that a convenient test function space for problem (P2 ) is the space
© ± ª
V = v v ∈ H1 (0, L), v(0) = 0 ,
where H1 (0, L) is the Sobolev space of those functions in L2 (0, L) whose first
order derivative also belongs to L2 (0, L).
A natural inner product for H1 (0, L) is the following one:
Z L
def
(u|v)1 = [u(x)v(x) + u0 (x)v 0 (x)] dx
0
as one can easily verify, with corresponding norm:
1
kvk1,2 = (v|v)12
Notice that the shorter expression well defined for every u, v ∈ H1 (0, L), namely
Z L
((u|v))1 = u0 (x)v 0 (x)dx
0
corresponds to no norm. Indeed, ((v|v))1 = 0 implies that v is an arbitrary con-

stant in [0, L], which means that actually the expression
·Z L ¸ 21
def
|v|1,2 = [v 0 (x)]2 dx
0
1
defines just a semi-norm over H (0, L). However over the space of test functions
V , the expression ((·|·))1 defines indeed an inner product and hence |·|1,2 is a norm
of V .
Actually we can even establish the following result:
Proposition 2.1 There exist two strictly positive constants C1 and C2 such that
C1 kvk1,2 ≤ |v|1,2 ≤ C2 kvk1,2 , ∀v ∈ V
The inequality on the right is trivially satisfied with C2 = 1, whereas the one
on the left is a consequence of the following lemma:
Lemma 2.2 (The Friedrichs-Poincaré inequality) There exists a constant

Cp such that
kvk0,2 ≤ Cp |v|1,2 , ∀v ∈ V
Proof. ∀x ∈ [0, L] we have:

Z x £ ¤0
v 2 (y) dy = v 2 (x),
0
which implies that

Z x
2
v (x) = 2 v(y)v 0 (y)dy, ∀x ∈ [0, L],
0
or yet Z x
2
v (x) ≤ 2 |v(y)| |v 0 (y)| dy, ∀x ∈ [0, L].
0
Thus ∀x ∈ [0, L]
Z L
2
v (x) ≤ 2 |v(y)| |v 0 (y)| dy ≤ 2 kvk0,2 |v|1,2 ,
0
from which we obtain

Z L Z L
kvk20,2 = 2
v (x)dx ≤ 2 kvk0,2 |v|1,2 dx.
0 0
This proves the lemma with Cp = 2L. q.e.d. ¥

The existence of the constant C1 in Proposition 2.1 directly follows from the
Friedrichs-Poincaré inequality with,
1
C1 = (1 + Cp2 )− 2 .
Remark 2.3 Two norms k·ka and k·kb are said to be equivalent if there exist two
strictly positive constants C1 and C2 such that for every v in the vector space V
we have
C1 kvka ≤ kvkb ≤ C2 kvka
Proposition 2.1 thus means that both norms k·k1,2 and |·|1,2 are equivalent over
the test function space V for problem (P2 ). ¥
An inner product allows for the extension of the usual geometric concept of
orthogonality to real vector spaces.
Definition 2.4 Let V be a real vector space equipped with inner product (·|·).
Two vectors u and v belonging to V are said to be orthogonal if (u|v) = 0. In this
case we write u⊥v.
Definition 2.5 Let S be a subset of a real vector space V equipped with an inner
product (·|·). The orthogonal of S denoted by S ⊥ is the set of elements in V which
are orthogonal to every element of S.
Note that the only element in V which is orthogonal to every element of this
space is ~0V . Indeed if this is the case of u ∈ V we have
(u|v) = 0, ∀v ∈ V.
Thus taking v = u we readily conclude that u = ~0V . Therefore ~0V ∈ S ⊥ whatever

the set S. Actually S ⊥ is always a subspace of V .
Lemma 2.6 (Pythagorean Theorem) If u⊥v then
ku + vk2 = kuk2 + kvk2
Proof. Straightforward using the properties of the norm k·k = (˙ · |·) 2 .

1
¥
⊥
When the subset S consists of just an element v0 (S = {v0 }) we denote S by
v0⊥ (rather than {v0 }⊥ ). Notice that ~0⊥
V is V itself and that V
⊥
is {~0V }.
Proposition 2.7 Let v0 be a given element of V , v0 6= ~0V . Then ∀v ∈ V there

exists a unique splitting of the form v = αv0 + u where u ∈ v0⊥ , α ∈ R.
(v|v0 )
Proof. Take α = and u = v − αv0 . Uniqueness of this decomposition can
kv0 k2
be established as an exercise. ¥
2.3 Closed Sets in Normed Spaces
Let S : N → V be a given sequence (v1 , v2 , ..., vn , ...) of elements of V also

denoted by {vn }n or simply {vn }.
Definition 2.8 A sequence {vn } contained in a normed real space V is said to

converge to an element v ∈ V if ∀ε > 0 there exists an integer N (ε) ∈ N such
that kvn − vk < ε ∀n > N (ε). In this case we write vn → v in V .
Clearly the above concept implies that lim kvn − vk = 0 and due to (iii) we
n→∞
also have lim kvn k = kvk if vn → v, as one can easily check.
n→∞
Definition 2.9 A subset S of a normed real vector space V is said to be a closed

set if every sequence {vn } ⊂ S that converges to an element v ∈ V is such that
v ∈ S.
Clearly enough V itself is a closed set, as is the empty set ∅ by definition. A

subspace S of V that is a closed set is said to be a closed subspace of V .
Proposition 2.10 ∀S ⊂ V, S ⊥ is a closed subspace of V .
Proof. Let {vn } ⊂ S ⊥ be a sequence converging to v0 ∈ V . We have (vn |v) = 0,

∀v ∈ S and hence (v0 − vn |v) = (v0 |v), ∀v ∈ S. Owing to the Cauchy-Schwartz
inequality we then have:
|(v0 |v)| ≤ kvn − vk kvk , ∀v ∈ S, and ∀n ∈ N
Since by assumption kvn − v0 k → 0 we must necessarily have (v0 |v) = 0, ∀v ∈ V .

q.e.d. ¥
Definition 2.11 Let S be a subset of a normed real vector space V . The closure
of S denoted by S̄ is the set of all the elements in V that are limits of a sequence
of elements contained in S.
2.3 Closed Sets in Normed Spaces 19
Clearly S ⊂ S̄ and if S is a closed set S̄ = S. However in general S̄ 6= S and

the question is how to characterize S̄ in practical cases. Let us give an example
linked to the Sturm-Liouville problem. We consider the space V of test functions
for problem (P2 ) and the solution space S for (P1 ), where
© ± ª
S = v v ∈ V, v 00 ∈ L2 (0, L), v 0 (L) = 0
We equip V with the inner product (·|·)1 and we would like to assert the S̄ is V
itself. First of all we note that S̄ ⊂ V, since any converging sequence of S in the
norm of V must converge to an element of V . Next let v be an arbitrary element
of V . We will construct a sequence {vn } ⊂ S that converges to v, in the following
way.
First for a given m ∈ N we define a function by scaling of v as follows; set
L
ε= 2m
and
 £ ¡ ¢¤

 v x + ε x − L2 for Lε/2 ≤ x ≤ L(1+ε/2)

 1+ε 1+ε



 h i
L(1+ε/2)
ṽm (x) = ṽm 1+ε
= v(L) for L(1+ε/2)
1+ε
<x≤L








0 for 0 ≤ x < Lε/2
1+ε
Now we extend ṽm to the whole R by zero for x < 0 and by ṽm (L) for x > L. We
still denote this extension by ṽm .
Notice that ṽm ∈ V ∀m, but not necessarily to ṽm ∈ S. Nevertheless we can
“modify” ṽm by using an auxiliary sequence of functions {ρl } ⊂ S having a small
support, say, of length O( 1l ), such as:


 l(1 − 2lx)(lx + 1)2 for − 1l ≤ x ≤ 0






ρl (x) = 0 |x| > 1l







 l(1 + 2lx)(lx − 1)2 for 0 ≤ x ≤ 1
l
1 Lε/2
thereby defining for a given m and ∀l such that l
< 1+ε
;
Z +∞
l
vm (x) = ṽm (y)ρl (y − x)dy,
−∞
i.e., by convolution.
Z +∞
Noticing that ρl (x)dx = 1, ∀l and that ṽm is continuous, it is not difficult
−∞
l
to establish that ∀x ∈ (0, L), vm (x) → ṽm (x) as l goes to infinity. For this it
suffices to figure out that, by the Mean Value Theorem for integrals we have
Z 1 Z 1
l
l θ l
vm (x) = ṽm (x + z)ρl (z)dz = ṽm (x + ) ρl (z)dz,
− 1l l − 1l
for a given θ, |θ| < 1. Then the continuity of ṽm ensures the rest.
2.3 Closed Sets in Normed Spaces 21
Moreover we have ∀x ∈ [0, L]:

¯Z 1 ¯2
¯ l ¯2 ¯ l ¯
¯vm (x)¯ = ¯¯ ¯
ṽm (x + z)ρl (z)dz ¯
¯ −1 ¯
l
Z 1 Z 1
l l
≤ ρ2l (z)dz 2
ṽm (x + z)dz
− 1l − 1l
Z 1
l
2
≤ 2l ṽm (x + z)dz
− 1l
Z x+ 1l
2
= 2l ṽm (y)dy
x− 1l
≤ 4 max |v(x)|2
x∈[0,L]
Hence by the Dominated Convergence Theorem (see e.g. [10],WEIR, Lebesgue

° l °
Integration and Measure, ’73) °vm − ṽm ° → 0 as l goes to infinity.
0,2
Notice that the above argument allows us to bound above the L2 -norm of vm
l
√
independently of l and m by 2 L kvk0,∞ .
°¡ ¢ °
° l 0 0 °
Similarly but surely less simply, one can establish that ° vm − ṽm ° con-
0,2
verges to zero as well, but here some more sophisticated tools of integration theory
must be employed. More precisely, from (see next Remark):
Z +∞
¡ l ¢0 0
−∞
l
we infer that vm ∈ L2 (0, L), ∀l, and if ṽm
0
is continuous in [0, L] the same argu-
ment above applies. It can be adapted with rather simple modifications to the
0 2 0
case where (vm ) is Riemann-integrable, but not at all if vm is just an arbitrary
function of L2 (0, L). In this case a further regularization is needed, based on the
approximation of v 0 by a Riemann-integrable function arbitrarily close to v 0 in
the L2 -norm.
Remarks:
1. The reader may check that

Z +∞
¡ l ¢0 0
−∞
using integration by parts.
2. The possibility to approximate a function in L2 (0, L) by a Riemann in-

tegrable function arbitrarily close to it in the k·k0,2 norm is guaranteed
by the fact that L2 (0, L) is the completion of the space of Riemann-
integrable functions in (0, L) with respect to this norm as seen in the
next paragraph. ¥
Now notice that

Z +∞ Z L+ 1l
¡ ¢
l 0 0 0
vm (L) = ṽm (y)ρl (y − L)dy = ṽm (y)ρl (y − L)dy = 0,
−∞ L− 1l
0 1 (1 + ε/2)
since ṽm (y) = 0, ∀y ≥ L − >L . Furthermore we have
l 1+ε
Z x+ 1l
¡ ¢
l 00
vm (x) = ṽm (y)ρ00l (y − x)dy,
x− 1l
and by a very similar argument to the one employed for hbounding above the
¡ l ¢2 ¡ l ¢00 i2
integral of vm in (0, L), we can established that the one of vm is bounded,
l
though not independently of l. Hence we do have vm ∈ S, ∀l.
Now the sequence{vn } ⊂ S converging to v ∈ V will be chosen as follows; first
we note that classical results or Sobolev spaces ensure that the scaled sequence
{ṽm } converges to v in the norm k·k1,2 (or yet of H1 (0, L)). Hence for a given n
one may choose m such that
1
kṽm − vk1,2 ≤ .
2n
2.4 Complete Normed Spaces 23
1 Lε/2 ° l ° 1
Then for such ṽm we choose l such that < and °vm − ṽm °1,2 ≤ . In
l 1+ε 2n
so doing, from (iii) we see that
° l ° 1
°v m − v ° ≤ .
1,2 n
l
Finally setting vn = vm for the pair of indices m and l chosen in this way, we
note that ∀ε0 > 0 its suffices to take N (ε0 ) as the smallest integer such that
1
N (ε0 ) > to have kvn − vk1,2 < ε0 , ∀n > N (ε0 ). This implies that V ⊂ S̄ and
ε0
hence V = S̄ as asserted.
In the particular example considered above the normed vector space V is the
closure of one of its subspaces S. This situation is often exploited in practical
applications such as convergence analysis of numerical methods, and gives rise to
Definition 2.12 A subset D of a normed vector space V is said to be dense in

another subset C of V , if D ⊂ C and ∀c ∈ C and ∀ε > 0 there exists an element
d ∈ D such that kc − dk < ε.
Notice that if C is a closed set then any set D ⊂ C that is dense in C satisfies
D̄ = C, and in any case D̄ = C̄. In the case of the above example we can see that
S is dense in V from the construction of the sequence {vn } ⊂ S.
2.4 Complete Normed Spaces
The concept of convergence in normed vector spaces implicitly involves the limit
of sequences. However a converging sequence also fits the following one:
Definition 2.13 A sequence {vn } of a normed space V is a Cauchy sequence

if ∀ε > 0, ∃N (ε) such that kvm − vn k < ε, ∀m, m > N (ε).
ε
Indeed if {vn } ⊂ V converges to v ∈ V then ∀ > 0 there exists N (ε) such that
2
ε ε
kvn − vk < and kvm − vk < , provided both m and n are greater than N (ε).
2 2
Therefore kvn − vm k ≤ kvn − vk + kvm − vk < ε, ∀m, n > N (ε), and {vn } is a
Cauchy sequence.
The question then is: would every Cauchy sequence in a normed vector space V
be a converging sequence in this space? The answer is yes only for certain classes
of spaces called complete spaces.
Let us first give two examples showing that the same vector space can be
complete or not depending on the norm it is equipped with.
Consider the space C 0 [0, L] of continuous functions in [0, L]. First we equip it
with the normk·k0,2 , and we consider the sequence {fn } ⊂ C 0 [0, L], where fn is
defined as follows: for n = 1, fn (x) = 1, ∀x ∈ [0, L] and for n > 1,

 for 0 ≤ x < L2 − 2n L
 0 £

¡ ¢¤
2n
fn (x) = x − L2 1 − n1 for L2 − 2nL
≤ x < L2


L
 1 for L2 ≤ x ≤ L
Let now ε > 0 be given. For any m, n ∈ N (with m > n for example) we have:
Z L− L µ ¶2 Z L µ ¶2
2
2 2m 2n 2
2 2x
kfn − fm k0,2 = x − n + 1 dx + (n − m) − 1 dx
L
2
− L
2n
L L
2
− L
2m
L
µ ¶³ µ ¶2
L L n ´2 L n−m
≤ − 1− +
2n 2m m 2m m
L ³ n ´2 L
= 1− <
2n m 2n
1
Hence ∀ε > 0 if N (ε) is the smallest integer satisfying N (ε) > 2ε2
we have
kfn − fm k0,2 < ε provided both m and n are greater than N (ε), and {fn } is a
Cauchy sequence in this normed space. However this sequence clearly does not
converge to any element f belonging to C 0 [0, L], for if such function exists it must
satisfy:
(Z L L Z L Z )
− 2 2n ¯ 2 ¯2 2
L
¯f (x)¯ dx + |fn (x) − f (x)|2 dx + [f (x) − 1]2 dx →0
L 1 L
0 2
− 2n 2
as n goes to ∞. Since f is necessarily bounded as is fn ∀n, it must satisfy:

Z L Z L
2 ¯ ¯
¯f 2 (x)¯2 dx + [1 − f (x)]2 dx = 0
L
0 2
L L
This means that f (x) = 0 for 0 ≤ x < 2
and f (x) = 1 for 2
< x ≤ 1! Therefore
0
the Cauchy sequence {fn } has no limit in C [0, L] equipped with the norm k·k0,2 .
As a second example we consider the same C 0 [0, L] equipped with the norm
k·k0,∞ . Now we claim that such normed space is complete. In order to prove this,
let {fn } be an arbitrary Cauchy sequence in this normed space. This means that
∀ε > 0, ∃N (ε) > 0 such that max |fn (x) − fn (x)| < ε, ∀m, n > N (ε), x ∈ [0, L].
x∈[0,L]
In particular we have ∀x ∈ [0, L], |fn (x) − fm (x)| < ε, ∀m, n < N (ε), which
means that {fn (x)} is a Cauchy sequence of R for every x ∈ [0, L]. We have then
found a limiting function f of {fn }, since the space of real numbers (equipped
with the absolute value as a norm) is a complete space from a well-known axiom.
Thus for every x ∈ [0, L] , {fn (x)} must have a limit and we call this limit f (x).
The question then is: Is f a continuous function in [0, L]?
By taking an arbitrary point x0 ∈ [0, L] we first note that keeping n = N (ε) + 1
and letting m go to infinity we obtain
|fN +1 (x) − f (x)| ≤ ε, ∀x ∈ [0, L].
Now since the function fN +1 is continuous at x0 there exists δ > 0 such that
|x − x0 | < δ implies that |fN +1 (x) − fN +1 (x0 )| < ε, x ∈ [0, L]. Hence we have
∀x ∈ [0, L] such that |x − x0 | < δ:
|f (x) − f (x0 )| ≤ |f (x) − fN +1 (x)|+|fN +1 (x) − fN +1 (x0 )|+|fN +1 (x0 ) − f (x0 )| < 3ε
Thus ∀ε̄ > 0 there exists δ > 0 such that |f (x) − f (x0 )| < ε̄, provided |x − x0 | < δ
and x ∈ [0, L] (by taking ε in the above equal to ε̄/3).
Finally we must establish that {fn } converges to f in the norm k·k0,∞ . Indeed
we know that
|f (x) − fn (x)| ≤ ε, ∀x ∈ [0, L] and ∀n > N (ε),
which implies that
max |f (x) − fn (x)| ≤ ε, ∀n > N (ε).

x∈[0,L]
Hence ∀ε0 > 0 we take N0 (ε0 ) = N (ε0 /2) to conclude that kfn − f k0,∞ < ε0 , ∀n >
N0 (ε0 ).
Remark 2.14 The reader at this point may check that the sequence {fn } of the
previous example is not a Cauchy sequence of C 0 [0, L] equipped with the maximum
norm. ¥
We just emphasized that the same space may be complete or not depending on
the norm being considered. However we recall that whenever two norms k·ka and
k·kb are equivalent for the same space, then if it is complete for norm k·ka it will
also be complete for norm k·kb
Definition 2.15 A complete normed space, whose norm is associated with an

inner product is called a Hilbert space.
Definition 2.16 A complete normed space whose norm is not associated with an
inner product is called a Banach space.
We just saw that C 0 [0, L] equipped with the maximum norm is a Banach space,
for we had established that the Parallelogram’s law is violated for this norm. We
also saw that C 0 [0, L] equipped with the inner product (· |·)0 is not a Hilbert
space. This is however, the case of L2 (0, L), as long as we consider it as a space
of classes of functions (a quotient space), as seen hereafter.
Definition 2.17 The completion of a normed vector space V is the space of all
the vectors that are limits of Cauchy sequences of V .
We may define L2 (0, L) as the completion of the space of equivalence classes of

functions which only differ from eachother over a null set (a set having a measure
equal to zero), whose squares are Riemann-integrable functions in (0, L). In so
doing L2 (0, L) equipped with the inner product (· |·)0 is a Hilbert space. Similarly
L1 (0, L) = L1 (0, L)/N (0, L) equipped with the k·k0,1 is a Banach space, as the
completion of the space of (equivalence classes of) functions which are Riemann-
integrable in (0, L).
Next we give a result connected to our model problems (P1 ) and (P2 ):
Theorem 2.18 The space V = {v /v ∈ H1 (0, L), v(0) = 0 } equipped with the in-
ner product (· |·)1 is a Hilbert space.
Proof. Let {vn } ⊂ V be a Cauchy sequence for the norm k·k1,2 . This means that
∀ε > 0, ∃N (ε) such that ∀m, n > N (ε)
0
kvn − vm k0,2 < ε and kvn0 − vm k0,2 < ε.
Therefore both {vn } and {[vn0 ]} are Cauchy sequences of L2 (0, L) (where [vn0 ]
denotes the equivalence class of vn0 ; notice that [vn ] contains only vn itself since
this function must be continuous).
Since L2 (0, L) is complete for norm k·k0,2 there exists two functions v and w,
where [v] ∈ L2 (0, L) and [w] ∈ L2 (0, L) such that:
kvn − vk0,2 → 0 and kvn0 − wk0,2 → 0 as n → ∞.
Now we define a function v̄ by

Z x
v̄(x) = w(y)dy.
0
v̄ does belong to V as one can easily check and moreover:

Z L ¯Z x ¯2
¯ ¯
2
kvn − v̄k0,2 = ¯ (vn − w) (x)dy ¯¯ dx
0
¯
0 0
Z L ·Z x ¸2
≤ |(vn0 − w) (x)| dy dx
0 0
Z L ·Z L ¸2
≤ |(vn0 − w) (x)| dy dx
0 0
Z L
2
≤ L kvn0 − wk0,2 dx
0
= L2 kvn0 − wk20,2
Hence vn → v̄ in L2 (0, L) which means that v̄ = v by the uniqueness of the limit.

Finally, since w = v̄ 0 = v 0 a. e. in [0, L], v ∈ V , and we do have kv − vn k1,2 → 0
as n → ∞ as required q.e.d. ¥
Remark 2.19 It can be established (see e. g. [11] ) that there exits a constant C
such that
kvk0,∞ ≤ C kvk1,2 , ∀v ∈ H1 (0, L) (2.1)
On the basis of inequality (2.1) one can demonstrate as an exercise too, that
H1 (0, L) equipped with the inner product (· |·)1 is a Hilbert space. ¥
We conclude this section by clarifying some links between closed spaces and
complete spaces.
As we know every normed vector space V is closed, since it is viewed as a
universe and by definition every converging sequence in V converges necessarily
to an element of V . However V may be complete or not for the given norm. In
this sense both concepts are essentially different. However, whenever a space V is
complete they are equivalent, as far as subspaces of V are concerned, according
to:
Proposition 2.20 If a normed vector space V is complete, then a subspace W

of V is complete if and only if it is closed.
Proof. Let W be complete. If {vn } ⊂ W converges, say, to v ∈ V, then it is also a

Cauchy sequence of W . Since W is complete the limit v must belong to W which
is thus closed.
Conversely let W be closed and {vn } ⊂ W be a Cauchy sequence. This is also a
Cauchy sequence of V and since this space is complete there exists v ∈ V such
that vn → v. However W is closed and hence necessarily v ∈ W which implies
that W is complete. q.e.d. ¥
As an additional exercise, the reader may establish that the subspace V of
H1 (0, L) equipped with the inner product (· |·)1 is a Hilbert space, as a closed
subspace of a Hilbert space. In order to do this inequality (2.1) should be taken
for granted.
2.5 Orthogonal Projections 29
A final result in this section rather easy to establish is the following:
Proposition 2.21 If a subspace W of a complete normed space V is dense in V ,

then the latter is the completion of W with respect to the given norm. ¥
Notice that, as a consequence of Proposition 2.21, the completion of S with

respect to the norm k·k1,2 is V .
2.5 Orthogonal Projections
The setting of problem (P2 ) like in many others cases of the kind, is strictly
related to the principle of finding a function u that minimizes a certain functional
E over a given function space V . However in practical situation determining u
itself may be out of reach. Then we are lead to an approximate problem in which
instead of u, an approximation of it, say uW is searched for in suitably chosen
subspace W of V . The usual requirement is that uW minimizes the distance from
u to an arbitrary element w ∈ W measured in a norm defined upon an inner
1
product, that is, for k·k = (·| ·) 2 :
ku − uW k ≤ ku − wk , ∀w ∈ W, where uW ∈ W . (2.2)
This inner product is actually a natural one for the problem being considered,
and we will see later on that in the case of (P2 ) this is nothing but:
Z L Z L
0 0
(u| v)a = p(x)u (x)v (x)dx + q(x)u(x)v(x)dx.
0 0
Posed in such terms we are actually dealing with the problem of finding the
so-called orthogonal projection uW of u ∈ V onto a subspace W of V , that is,
the vector uW such that u − uW ∈ W ⊥ , where the orthogonality is defined in
connection with an inner product (·| ·) of V . Indeed, assume that there exists
such uW that satisfies (2.2). Then
(u − uW | u − uW ) ≤ (u − w| u − w) , ∀w ∈ W
and in particular for vectors of the form w = uW + εv, v ∈ W and ε ∈ R. Thus
0 ≤ −2ε (u − uW | v) + ε2 (v| v) , ∀v ∈ W and ∀ε ∈ R
or yet
2ε (u − uW | v) ≤ ε2 (v| v) , ∀v ∈ W and ∀ε ∈ R.
Since this implies that |(u − uW | v)| ≤ |ε| (v| v) , ∀v ∈ W and ∀ε ∈ R we conclude
that (u − uW | v) = 0, ∀v ∈ W , i.e., u − uW ∈ W ⊥ .
Conversely if u − uW ∈ W ⊥ we have:
(u − uW | w − uW ) = 0, ∀w ∈ W
or yet
(u − uW | u − uW ) + (u − uW | w − u) = 0, ∀w ∈ W .
Hence
ku − uW k2 = (u − uW | u − w) ≤ ku − uW k ku − wk
and uW is seen to satisfy (2.2).

Actually if u − uW ∈ W ⊥ then uW is unique, for any other element of W , say
uW such that u − uW ∈ W ⊥ will satisfy
∼ ∼
(u − uW |uW − uW ) = (u − uW | uW − uW ) = 0,
∼ ∼ ∼
which trivially implies that uW = uW .

∼
We are then left the following question: Under which assumptions on V and
W there exists uW satisfying (2.2) for any given u ∈ V ? A partial answer to it is
provided by the following Theorem:
Theorem 2.22 (The Classical Theorem of the Orthogonal Projection) If

V is a Hilbert space and W is a closed subspace of V for every u ∈ V there exists
a unique uW ∈ W that minimizes ku − vk over W .
Proof. Let δ = inf ku − wk and {wn } ⊂ W be a minimizing sequence, that is:

w∈W
ku − wn k → δ as n → ∞.
2.6 Bounded Linear Functionals 31
By the Parallelogram law we have ∀m, n ∈ N:

° °2
¡ 2¢
° wn + wm °
2 °
2 ku − wn k + ku − wm k = 4 °u − ° + kwn − wm k2
2 °
Since W is a subspace of V , ku − (wn + wm ) /2k2 ≥ δ 2 and hence

£¡ ¢ ¡ ¢¤
kwn − wm k2 ≤ 2 ku − wn k2 − δ 2 + ku − wm k2 − δ 2 .
Now for a given ε > 0 it is possible to choose N (ε) such that ku − wn2 k − δ 2 <
ε2
ε, ∀n > N (ε). Hence for a given ε0 > 0 there exists N0 (ε0 ), say, N ( 40 ), such
that kwn − wm k < ε0 , ∀m, n > N0 (ε0 ). The minimizing sequence {wn } is hence
a Cauchy sequence of W and since this is a complete subspace of V because it
is closed and V is a Hilbert space, there exists an element uW ∈ W such that
wn → uW . Noticing that necessarily ku − wn k converges to ku − uW k, we have
δ = ku − uW k ≤ ku − wk , ∀w ∈ W as required.
The fact that uW is unique was already established. q.e.d. ¥
2.6 Bounded Linear Functionals
In many applications such as the deformation of a string described by the

Sturm-Liouville problem, the physical data appear in the form of a linear func-
tional F : V → R. where V is a space of functions representing unknown fields,
in which the actual solution must be searched for. In general V is equipped with
a given natural norm, say, k·k, and it is required that F be bounded, according
to the following:
Definition 2.23 A functional F : V → R, where V is a normed vector space is

bounded if
∃C > 0 such that |F (v)| ≤ C kvk , ∀v ∈ V
Typical examples of bounded functionals may be found precisely in the frame-

work of our model problem (P2 ), for which V was defined in Chapter 1, and we
considered two types of physical data (loads):
Z L
• A density of forces f per unit of length → F (v) = f vdx, f ∈ L2 (0, L);
0
• A force acting pointwise (e.g. at x = L2 ) → F (v) = P v( L2 ), P ∈ R
Notice that if we equip V with the norm k·k1,2 , in both cases F is bounded.
Indeed: ¯Z ¯
¯ L ¯
¯ f vdx¯¯ ≤ kf k0,2 kvk0,2 ≤ kf k0,2 kvk1,2 ⇒ C = kf k0,2 ;
¯
0
¯ µ ¶¯
¯ ¯
¯P v L ¯ ≤ |P | C kvk ⇒ C = P C, where C verifies (2.1).
¯ 2 ¯ 1,2
Now we recall that a given functional F applied to elements of a normed

vector space V is continuous at a given v0 ∈ V if ∀ε > 0, ∃δ(ε, v0 ) such that
|F (v) − F (v0 )| < ε, ∀v satisfying kv − v0 k < δ. Notice that this definition is
equivalent to the following one:
F is continuous at v0 ∈ V if every sequence {vn } converging to v0 in V satisfies
F (vn ) → F (v0 ) in R.
A strong result that can be established for linear functionals is the following:
Proposition 2.24 A linear functional F on a normed vector space V is contin-

uous at every v ∈ V if and only if it is bounded.
Proof. First of all we note that since F is linear, if it is continuous at ~0V it will
be continuous at every v ∈ V . Indeed if F is continuous at ~0V then ∀ε > 0, ∃δ
such that |F (v)| < ε if kvk < δ. Then ∀v0 ∈ V , |F (v) − F (v0 )| < ε provided
kv − v0 k < δ since F (v) − F (v0 ) = F (v − v0 ) by linearity.
ε
Now assuming that F is bounded, ∀v ∈ V and ∀ε > 0 it suffices to take δ = to
C
establish the continuity at ~0V .
Conversely, it F is continuous take a given ε > 0 and a corresponding δ > 0 such
that kvk < δ ⇒ |F (v)| < ε. Now for an arbitrary v 0 ∈ V let v = δv 0 /2 kv 0 k if
v 0 6= ~0V and v = ~0V otherwise. Clearly |F (v)| < ε and by the linearity of F we
have:
δ |F (v 0 )|
< ε for v 0 6= ~0V ,
2 kv 0 k
2ε
and in any case |F (v 0 )| ≤ C kv 0 k , ∀v 0 ∈ V with C = δ
. q.e.d. ¥
At this point it seems interesting to show an example of a linear functional which
is not bounded on our space V of test functions for Problem (P2 ): Take L = 1
xn
and set F (v) = v 0 (L). Now consider the sequence {vn } ⊂ V with vn (x) = . We
1
n
have F (vn ) = 1, ∀n and |vn |1,2 = (2n − 1)− 2 . Thus according to Proposition 2.1,
kvn k1,2 tends to zero as n → ∞ and hence this functional is unbounded (this is
not surprising since there are functions in V whose first order derivative are not
3
bounded in (0, 1), such as v(x) = (1 − x) 4 ).
In more physical terms, this means that “pinching” loads, or yet “dipoles”,
yielding a right hand side of (P2 ) of the form Qv 0 (x0 ), 0 ≤ x0 ≤ L, (where x0
is the point where the string is “pinched”), Q ∈ R are not admissible to this
problem. On the other hand, as already stated, loads applied pointwise, that is
those of the “Dirac” type, corresponding to functionals of the form P v(x0 ) are
bounded in V equipped with the k·k1,2 norm and therefore admissible to problem
(P2 ).
Definition 2.25 The kernel of a given functional F : V → R is the set

Ker(F ) = {v /v ∈ V, F (v) = 0 }.
Clearly if F is a linear functional then Ker(F ) is a subspace of V called the

nullspace of F . If V is a normed space it is easy to prove that Ker(F ) is a closed
subspace of V .
Now we conclude this Chapter by giving a classical and powerful result related
to bounded linear functionals on Hilbert spaces.
Theorem 2.26 (The Riesz Representation Theorem) Let V be a Hilbert

space for inner product (·| ·) and F : V → R be a bounded linear functional. Then
there exists a unique vF ∈ V such that F (v) = (vF | v) , ∀v ∈ V and moreover
F (v)
sup = kvF k .
v∈V kvk
v6=~0V
Proof. If F (v) = 0, ∀v ∈ V then clearly vF = ~0V fulfills all the required properties.
Otherwise Ker(F ) is not the whole V , and we may select a w ∈ V such that
w ∈Ker(F
/ ).
Since Ker(F ) is closed, from the Theorem of the Orthogonal Projection there
exists w0 ∈ Ker(F ) such that z = w − w0 ∈ Ker(F )⊥ with z 6= ~0V . Hence, noting
z
that F (z) = F (w) 6= 0 we can define z 0 = , z 0 ∈ Ker(F )⊥ .
F (z)
Let now v be an arbitrary element of V . By linearity we have v −F (v)z 0 ∈ Ker(F )
and on the other hand,
(v − F (v)z 0 | z 0 ) = 0 since z 0 ∈ Ker(F )⊥ .
Therefore (v| z 0 ) = F (v) (z 0 | z 0 ), and since z 0 6= ~0V we have
z0
F (v) = (vF | v) , ∀v ∈ V , with vF = .
kz 0 k2
Furthermore we have:
(vF | vF ) (vF | v) F (v) kvF k kvk

kvF k = ≤ sup = sup ≤ sup = kvF k .
kvF k v∈V kvk v∈V kvk v∈V kvk
v6=~0V v6=~0V v6=~0V
Finally if v̄F ∈ V is such that F (v) = (v̄F | v) , ∀v ∈ V , then ∀v ∈ V (vF − vF0 | v) =

0, and hence vF = vF0 . q.e.d. ¥
Remark 2.27 Notice that for a bounded linear functional F with
F (v) ≤ C kvk , ∀v ∈ V ,
we have
F (v)
CF = sup ≤ C,
v∈V kvk
v6=~0v
and also F (v) ≤ CF kvk , ∀v ∈ V . Otherwise stated
CF = inf {C ∈ R /F (v) ≤ C kvk , ∀v ∈ V } ,

and this infimum is necessarily a minimum given by kvF k. CF is actually a norm

of F in the vector space V 0 consisting of all bounded linear functionals defined on
V (cf. [6],’69), the basic operations on it being defined as follows:
For F1 , F2 ∈ V 0 , F1 + F2 is given by (F1 + F2 )(v) = F1 (v) + F2 (v), ∀v ∈ V ;

For F ∈ V 0 and α ∈ R, αF is given by (αF )(v) = αF (v), ∀v ∈ V ;
The thus normed space V 0 is called the dual space1 of V . The Riesz Representa-
tion Theorem then states that for every Hilbert space V there exists an isometry
between itself and its dual space. ¥
The reader may practice the above concepts by searching for the function vF ∈
H1 (0, L) such that
(vF | v)1 = F (v), ∀v ∈ H1 (0, L),
where
Z (
L L
L x for 0 ≤ x < 2
F (v) = v( ) + f vdx, with f (x) = .
2 0
L
for L
≤x<L
2 2
1 Some authors prefer to call it the topological dual space of V thereby stressing the fact that it is associated
with a given metrics.
3
The Finite Element Method in One Dimension
Space
In this Chapter we shall introduce the classical finite element method to solve
ordinary differential equations with given boundary values, taking the Sturm-
Liouville problem as a model. Since this is a numerical solution procedure we will
also be concerned about the evaluation of the underlying approximation error,
together with the convergence of successively computed solutions to the exact
one as one goes into an ideal limiting process.
Before doing so we will study a general functional framework for boundary
value differential equations among others problems.
3.1 An Abstract Linear Variational Problem
Let V be a vector space equipped with inner product (·| ·) and a : V × V → R

and F : V → R be two mappings satisfying the following assumptions, in which
1
k·k denotes the norm given by (·| ·) 2 .
38 3. The Finite Element Method in One Dimension Space
Assumption 1
(a). a is bilinear, i.e., linear with respect to each argument;
(b). a is bounded: ∃M > 0 such that a(u, v) ≤ M kuk kvk , ∀u, v ∈ V ;
(c). a is symmetric: a(u, v) = a(v, u), ∀u, v ∈ V ;
(d). a is coercive: ∃µ > 0 such that a(v, v) ≥ µ kvk2 , ∀v ∈ V .
Assumption 2
(e). F is linear;
(f ). F is bounded.
Now we consider the general linear variational problem, of the form

(
Find u ∈ V such that
(3.1)
a(u, v) = F (v), ∀ ∈ V
together with the following “approximate” one, in the sense that its solution
provides an approximation of u: Letting W ⊂ V be a given subspace of V , we
wish to solve (
Find uW ∈ W such that
(3.2)
a(uW , w) = F (w), ∀w ∈ W
For problems (3.1) and (3.2) we have the following:
Theorem 3.1 Under the Assumptions 1 and 2 both problem (3.1) and (3.2) have
a unique solution provided V is a Hilbert space equipped with inner product (·| ·)
and W is closed. Moreover in this case the following “error” bound applies:
s
M
ku − uW k ≤ inf ku − wk . (3.3)
µ w∈W
Proof. First we note that bilinear form a defines an inner product on V , according
1
to Assumption 1. The underlying norm denoted by k·ka = a(·, ·) 2 is equivalent to
the standard one k·k, since
µ kvk2 ≤ kvk2a ≤ M kvk2 , ∀v ∈ V

3.1 An Abstract Linear Variational Problem 39
Therefore V is also complete when equipped with norm k·ka , and hence it is also
a Hilbert space for the inner product given by a(·, ·).
Now from the Riesz Representation Theorem we know that there exists a unique
v ∈ V such that F (v) = a(vFa , v), ∀v ∈ V . Hence problem (3.1) is equivalent to
(
a(u, v) = a(vFa , v), ∀v ∈ V
This problem has clearly a unique solution, namely, u = vFa .

Now we consider problem (3.2). Since W is a closed subspace of Hilbert space
V for the inner product (·| ·), it is also a closed subspace for the inner product
a(·, ·) due to the equivalence of the corresponding norms. Hence W is also a
Hilbert space for inner product a(·, ·) and here again by the Riesz Representation
Theorem restricted to the closed subspace W, problem (3.2) also has a unique
solution, since W is necessarily complete.
Actually noticing that
F (w) = a(u, w) = a(uW , w), ∀w ∈ W
we infer that uW is nothing but the orthogonal projection of u onto the closed
subspace W for the inner product a(·, ·). Therefore we have:
a(u − uW , u − uW ) ≤ a(u − w, u − w), ∀w ∈ W ,
which yields
µ ku − uW k2 ≤ M ku − wk2 , ∀w ∈ W ,
and finally (3.3). q.e.d. ¥

Now we introduce an energy functional related to problem (3.1),
namely E : V → R, with
1
E(v) = a(v, v) − F (v).
2
The above definition gives rise to the following result:
Theorem 3.2 If V is a Hilbert space for inner product (·| ·), under Assumptions
1 and 2 there exists a unique u ∈ V that minimizes E over V , that is, the solution
of (
(3.4)
E(u) ≤ E(v), ∀v ∈ V
Moreover u is precisely the solution of (3.1).
Proof. Let u be the solution of (3.1). We have
a(u, u − v) = F (u − v), ∀v ∈ V
and from (a).(c),(d) and (e) we derive successively:

1
a(u, u − v) ≤ F (u − v) + a(u − v, u − v), ∀v ∈ V
2
1 1
a(u, u) − a(u, v) ≤ F (u) − F (v) + a(u, u) − a(u, v) + a(v, v), ∀v ∈ V
2 2
Hence
E(u) ≤ E(v), ∀v ∈ V .
The existence of u that solves (3.4) having been established all that is left to do is
proving that this is the only solution of this problem. Let then ū ∈ V be another
solution of (3.4). We have simultaneously
u + ū u + ū
E(u) ≤ E( ) and E(ū) ≤ E( )
2 2
Adding up and developing both inequalities we easily obtain:
1 1 1
a(u, u) + a(ū, ū) ≤ [a(u, u) + a(ū, ū) + a(u, ū) + a(ū, u)]
2 2 4
which yields
1
a(u − ū, u − ū) ≤ 0.
4
Finally from (d) we derive u = ū. q.e.d. ¥
3.1 An Abstract Linear Variational Problem 41
Remarks:
1. We have actually established the equivalence of (3.1) and (3.4) under

Assumptions 1 and 2, without taking into account the fact that V is a
Hilbert space. However this assumption is essential in order to ensure
that the solution to (3.1), and hence to (3.4), does exist.
2. (c) is not essential for the existence and uniqueness of a solution to

(3.1). Actually the case where a is not symmetric is addressed by the
celebrated Lax-Milgram Theorem (cf. [2] ’76). However in this case
problem (3.4) does not make sense, and there is no equivalence between
(3.1) and the minimization of an energy functional in general.
3. If a does not satisfy (c) we can still establish the following error bound
M
for problem (3.2): ku − uW k ≤ inf ku − wk. ¥
µ w∈W
Let us now apply the results of theorems (3.1) and (3.2) to the Sturm-Liouville
problem. First we recall that problem (P2 ) fits into general form (3.1) if we set
Z L Z L
0 0
a(u, v) = pu v dx + quvdx (3.5)
0 0
where the space V is the one defined in Chapter 1. On the other hand, the linear
functional F may take any form for which there exists a constant C such that
F (v) ≤ C kvk1,2 , ∀v ∈ V (3.6)
We recall that V is a closed subspace of H1 (0, L) equipped with the k·k1,2 norm,
and is hence a Hilbert space. Note also that the form a given by (3.5) fulfills all
the four conditions specified in Assumption 1, provided p is bounded above and
below by strictly positive constants A and α, and that B ≥ q ≥ 0, ∀x ∈ [0, L˙] In
particular in this case we have
a(u, v) ≤ M kuk1,2 kvk1,2 , ∀u, v ∈ V

and
a(v, v) ≥ µ kvk21,2 , ∀v ∈ V
with M = max(A, B) and µ = αC12 , where C1 is one of the two constants that
appear in Proposition 2.1 of Chapter 2.
Taking all these facts into account, according to Theorem 3.1, problem (P2 )
has a unique solution u ∈ V . Notice that this can eventually belong to one of
the spaces S defined in Chapter 1 or even to a smaller space, depending on the
regularity of the data f, p and q (for instance, if f ∈ H1 (0, L), p0 ∈ C 0 [0, L] and q ∈
H1 (0, L) we have u ∈ S∩H3 (0, L), where H3 (0, L) = {v /v, v 0 , v 00 , v 000 ∈ L2 (0, L)}).
However we gain nothing by searching u in such spaces, which incidentally are
not Hilbert spaces for the natural inner product (·| ·)1 . This is because one might
be facing the non existence of a solution in doubtful cases. Moreover in any case
if the solution is more regular than functions belonging “only” to H1 (0, L) it will
find itself in the right space naturally.
Let us conclude this section by applying the result of Theorem 3.2: The solution
u ∈ V is the one that minimizes the total energy of the string under the action
of the forces corresponding to functional F that is
E(u) ≤ E(v), ∀v ∈ V
with ·Z Z ¸
L L
1 0 2 2
E(v) = p(v ) dx + qv dx − F (v).
2 0 0
3.2 The Ritz Approximation Method
Now we are ready to go into the main purpose of these lectures by considering
first the simplest possible finite element method in use to solve second order
ordinary differential equations corresponding to a minimization of a (convex)
functional such as (P1 ), namely, piecewise linear approximation methods. The
main principle stems from the celebrated Ritz method, which is based on the
3.2 The Ritz Approximation Method 43
definition of a finite dimensional subspace of V , say W = VN , where N is the

dimension of VN . This gives rise to the following approximate problem:
(
Find uN ∈ VN such that
(3.7)
a(uN , v) = F (v), ∀v ∈ VN
According to Theorem 3.1, (3.7) has a unique solution, since any finite di-
mensional subspace is closed in any norm one may equip the whole space V (1) .
Moreover, if {ϕi }N i=1 is a basis of VN then there exists a unique set of N constants
Xn
c1 , c2 , . . . , cN such that uN = cj ϕj .
j=1
Since we may take as test function v ∈ VN either one of the functions ϕi ,
problem (3.7) yields the following system of equations for the unknowns cj , j =
1, 2, . . . , N : Ã N !
X
a cj ϕj, ϕi = F (ϕi ) for i = 1, 2, . . . , N,
j=1
or yet by linearity:
N
X
cj a(ϕj, ϕi ) = F (ϕi ) for i = 1, 2, . . . , N, (3.8)
j=1
This is nothing but the linear system of equations
A~c = f~ (3.9)
¡ ¢
with ~c = (c1 , c2 , . . . , cN ), where A = {aij } is an n × n matrix with aij = a ϕj, ϕi
and f~ ∈ RN is given by:
f~ = (F (ϕ1 ), F (ϕ2 ) . . . , F (ϕN )).
The other way around, one may “multiply” each equation of (3.8) with an
arbitrary scalar vi ∈ R and add them up from i = 1 up to i = N , thereby
1 This is an easy-to-prove consequence of the completeness of RN , which is in turn a consequence of the

completeness of R.
obtaining again by linearity:

Ã N
! N
X X
a uN, vi ϕi = F( vi ϕi ), ∀~v = (v1 , v2 , . . . , vN ) ∈ RN ,
i=1 i=1
which brings us back to problem (3.7) since every element v in VN may be written
XN
in the form v = vi ϕi for suitable scalars vi . Otherwise stated problem (3.7) is
i=1
equivalent to system (3.8) or yet to (3.9).
Remark 3.3 We do not need to require the coerciveness of bilinear form a, in

order to apply the principles of Ritz’s method. As a matter of fact, any linear
differential equation of the form D(u) = f , where D is a differential operator and
f is a given function, may be written in an equivalent variational form, that is, a
weak form, as follows:
(D(u)| v) = (f | v) , ∀v ∈ V
for a suitable inner product (·| ·) and a suitable space V .

If the expression (D(u)| v) gives rise to an appropriate bilinear form, say, a(u, v) =
(D(u)| v) , ∀u, v ∈ V , which has the properties (a) and (b), but not necessarily
(c) and (d), and F (v) = (f | v) , ∀v ∈ V has the properties (f ), we may still ap-
proximate the corresponding problem of the form (3.1) by (3.7). In so doing we are
actually employing the so-called Galerkin method or yet the Ritz-Galerkin
method. ¥
Let us now turn our attention to the practical solution of (3.7). The equivalent
linear system (3.8) or (3.9) has an invertible matrix, since problem (3.7) is well-
posed. As a matter of fact matrix A is symmetric and positive-definite, and hence
there should be no serious problem to solve it. However this assertion somewhat
masks the fact that the choice of the basis {ϕi } plays certainly a crucial role, as
far as computational complexity and round-off error control are concerned, among
other practical aspects such as storage requirements.
3.3 Piecewise Linear Finite Element Approximations 45
Just to illustrate the latter point, assume that we are able to construct an
orthogonal basis {ϕi } with respect to the energy inner product a(·, ·)˙ Then the
fi
solution ~c to (3.9) will appear explicitly, that is: ci = , i = 1, 2, . . . , N .
a(ϕi , ϕi )
However the construction of such basis is in most cases much too complex to
justify its use. Instead one could be satisfied with the use of an easy-to-handle
basis for which a(ϕj , ϕi ) vanishes for a very large number of pairs of indices (i, j).
This is precisely the case of the finite element method, whose basic version is
presented in the following section.
3.3 Piecewise Linear Finite Element Approximations
In its most classical version, the piecewise linear finite element method corre-
sponds to the following choice of a basis in the Ritz method. First we subdivide
L
the interval [0, L] into N equal intervals (or “elements”) of length h = . Then
N
we associate with every interval end (or “node”) xi = ih, a function ϕi having
the following properties for i = 0, 1, . . . , N :
1. ϕi restricted to any interval (xj−1 , xj ) is a polynomial of degree less than or

equal to one, j = 1, 2, . . . , N .
2. ϕi (xj ) = δ ij .
The figure below illustrates the aspect of a generic ϕi :

It is an easy matter to prove that the so-defined functions ϕ0 , ϕ1 , . . . , ϕN are

N
X XN
linearly independent. Indeed if αj ϕj ≡ 0 then αj ϕj (xi ) = 0, ∀i, 0 ≤ i ≤ N ,
j=0 j=0
and hence αi = 0, ∀i. Actually the set {ϕi }N
i=1 forms a basis of the following space:
n . o
Vh = v v ∈ C 0 [0, L], v(0) = 0, v|[xi−1 ,xi ] ∈ P1 , for i = 1, 2, . . . , N ,
where Pk denotes the space of polynomials (in one variable) of degree less than or
equal to k. Notice that Vh is a subspace of V , for in the case of spaces of piecewise
polynomial functions v, we have v ∈ C 0 [0, L] ⇔ v ∈ H1 (0, L), as one may easily
check.
Remark 3.4 Since ϕi (0) = 0, ∀i ≥ 1 and every v ∈ Vh is easily found to be equal

N
X
to v(xi )ϕi , the set {ϕi }N
i=1 is indeed a basis of Vh . ¥
i=1
Now we consider the application of the Ritz-method to the case where VN is the
particular space Vh defined as above. This gives rise to the following approximation
of (P2 ):
(
Find uh ∈ Vh such that
(3.10)
a(uh , v) = F (v), ∀v ∈ Vh
where a is the form given by (3.5) and F is a linear functional satisfying (f) or
yet (3.5).
Clearly problem (3.10) is of the form (3.2) with W = Vh . Hence according to
Theorem 3.1 it has a unique solution, and moreover we have:
s
M
ku − uh k1,2 ≤ inf ku − vk1,2 . (3.11)
µ v∈Vh
We recall that M and µ are given by M = max(A, B) and µ = αC12 where C1 is

defined in Proposition 2.1, Chapter 2, A = kpk0,∞ , B = kqk0,∞ .
Before turning our attention to the convergence analysis of the finite element
method, we find it instructive to give an idea of the aspect matrix A and vector
f~ take in practical situations. For this purpose we consider, the particular case
where p, q and f are constant. Taking N = 4, for instance, we have:
 2p 2hq 
+ 3 − hp + qh 0 0
 p qh 2p 2qh6
h
p qh 
 −h + 6 h + 3 −h + 6 0 
A= 


 0 − h + 6 h + 3 − h + qh
p qh 2p 2qh p
6 
p qh p qh
0 0 −h + 6 h
+ 3
and h iT
f~ = fh fh fh fh
2
.
The reader may easily find out that for any N, N ≥ 2, we have:
2p 2qh
aij = 0 if |i − j| > 1, and aii = + for 1 ≤ i ≤ N − 1,
h 3
p qh p qh
aN N = + , ai,i+1 = − + for 1 ≤ i ≤ N − 1, and
h 3 h 6
fh
aij = aji , ∀i, j, while fi = f h for 1 ≤ i ≤ N − 1 and fN = .
2
Notice that A is indeed a sparse matrix (at least for N ≥ 4), more precisely a
tridiagonal matrix.
Now in order to establish the convergence of uh to u in V as h goes to zero, we
address the basic issue in the mathematical analysis of the finite element method,
namely, deriving a possibly optimal upper bound for the term inf ku − vk1,2 .
v∈Vh
Clearly, according to the Theorem of the Orthogonal Projection, this infimum is
actually a minimum attained for v = Πh (u), namely, the orthogonal projection
onto Vh of the exact solution u of (P2 ), with respect to the inner product (· |·)1 .
However it turns out that the best we can hope for, at least qualitatively in terms
of estimates in the Sobolev norms, is attained if we just take the upper bound
of this minimum given by ku − Ih (u)k1,2 , where Ih (u) is the function of Vh that
interpolates the (continuous) solution of (P2 ) at the nodes xi (of the “mesh”
{[xi−1 , xi ]}N
i=1 ), i = 0, 1, . . . , N .
Let us then derive an approximate estimate for the term ku − Ih (u)k1,2 . We
N h
X i
2
have: ku − Ih (u)k1,2 = ku − Ih,i (u)k20,2,ki + ku0 − Ih.i
0
(u)k20,2,ki , where Ih,i (u)
i=1
is the restriction of Ih (u) to the “element” Ki = [xi−1 , xi ], i = 1, 2, . . . , N , and

k·k0,2,Ki denotes the norm of L2 (Ki ), i.e.,
·Z xi ¸ 12
k·k0,2,Ki = (·)
xi−1
At this point we first assume that the solution u of (P2 ) belongs to H2 (0, L).
In so doing we define ∆i = u − Ih,i (u) and we note that ∆i (xi ) = ∆i (xi−1 ) = 0.
Hence we may legitimately consider Fourier’s sinus series to expand ∆i , that is:
∞
X mπ(x − xi−1 ) i
∆i (x) = cim sin , cm ∈ R, ∀m.
m=1
h
˜ i (y) = ∆i (x), we
Performing the change of variable y = x − xi−1 and setting ∆
have: Z Z ∞
xi h
˜ h X ¡ i ¢2
∆2i dx = ∆i (y)dy = c ,
xi−1 0 2 m=1 m
since according to well-known results,
Z h Z h
mπy nπy mπy nπy h
cos cos dy = sin sin dy = δ mn , ∀m, n ∈ Z.
0 h h 0 h h 2
For this reason we also have:

∞
X Z ∞
˜ 0i (y) mπ mπy xi
2 h X ¡ i ¢2 m2 π 2
∆0i (x) = ∆ = cim cos ⇒ (∆0i ) dx = c
m=1
h h xi−1 2 m=1 m h2
and
∞
X Z ∞
˜ 00i (y) = − m2 π 2 mπy xi
2 h X ¡ i ¢2 m4 π 4
∆00i (x) =∆ cim 2 sin ⇒ (∆00i ) dx = c .
m=1
h h xi−1 2 m=1 m h4
Notice that since uh is linear in [xi−1 , xi ] we have ∆00i (x) = u00 (x), ∀x ∈ Ki , and
moreover from the fact that m ≥ 1 we have
∞
X ¡ ¢2 m 4 π 2 h2 00 2
k∆0i k0,2,Ki ≤ cim = ku k0,2,ki , ∀i, i = 1, 2, . . . , N.
m=1
h2 π2
Therefore summing up from i = 1 up to i = N we obtain

2 h2 00 2
ku0 − Ih0 (u)k0,2 ≤ ku k0,2
π2
A similar calculation leads to
h4 00 2
ku − Ih (u)k20,2 ≤ 4 ku k0,2
π
which yields
· ¸1
h h2 2
ku − Ih (u)k1,2 ≤ 1 + 2 |u|2,2 , (3.12)
π π
where |·|2,2 is the seminorm of H2 (0, L) given by |v|2,2 = kv 00 k0,2 .
This estimate immediately allows us to state that both error terms ku − uh k0,2
and ku0 − u0h k0,2 are at least an O(h) term. However as far as the former is con-
cerned, this is unsatisfactory, since one should expect a better estimate for this
term than the one that holds for first order derivatives. That is why we further
refine this estimate of ku − uh k0,2 by using a duality argument known as Nitsche’s
trick. This can be applied under the assumption that p0 is bounded, for we know
that in this case the solution v of the Sturm-Liouville problem of type (P1 ) be-
longs to H2 (0, L) for any right hand side g in L2 (0, L). Furthermore it can also
be established quite easily, that there exists a constant C̃ such that
|v|2,2 ≤ C̃ kgk0,2 , ∀g ∈ L2 (0, L) (3.13)

√ £ ¡ ¢ ¤1
(where C̃ may be taken equal to 3 1 + CCp2 + BCp4 /α2 2 /α, Cp being the
constant of the Friedrichs-Poincaré inequality and C being the upper bound of
|p0 |).
(u − uh |g)0
Now we first note that ku − uh k0,2 = sup .
g∈L2 (0,L) kgk0,2
g6=0
Hence if for a given g ∈ L2 (0, L), v ∈ H2 (0, L) satisfies
−(pv 0 )0 + qv = g, v(0) = v 0 (L) = 0
then
(u − uh |−(pv 0 )0 + qv)0
ku − uh k0,2 ≤ C̃ sup
v∈S |v|2,2
v6=0
where we recall that S = {v /v ∈ H2 (0, L), v(0) = v 0 (L) = 0 }.

Noting that (u − uh |−(pv 0 )0 + qv)0 = a(u − uh , v) and that a(u − uh , vh ) =
0, ∀vh ∈ Vh we have:
a(u − uh , v − vh )
ku − uh k0,2 ≤ C̃ sup , ∀vh ∈ Vh ,
v∈S |v|2,2
v6=0
or yet
ku − uh k1,2 kv − vh k1,2
ku − uh k0,2 ≤ C̃M sup , ∀vh ∈ Vh .
v∈S |v|2,2
v6=0
Taking vh = Ih (v) and recalling (3.12) we obtain

( · ¸1 )
h h2 2
ku − uh k0,2 ≤ ku − uh k1,2 C̃M 1+ 2 (3.14)
π π
Summarizing all the above estimates we have the following theorems:
Theorem 3.5 Assume that the solution of (P2 ) belongs to H2 (0, L). Then we
have s · ¸1
Mh h2 2
ku − uh k1,2 ≤ 1 + 2 |u|2,2 .
µ π π
Proof. This is an immediate consequence of the estimates (3.11) and (3.12). q.e.d.
¥
Theorem 3.6 Assume that the solution of (P2 ) belongs to H2 (0, L) for any right
hand side in L2 (0, L). Then if f ∈ L2 (0, L) the solutions u of (P1 ) and uh of
(3.10) satisfy s µ ¶
M h2 h2
ku − uh k0,2 ≤ C̃M 1+ 2 |u|2,2
µ π2 π
where C̃ is the constant of inequality (3.13).
Proof. This is a direct consequence of Theorem 3.5, and of the estimate (3.14).
q.e.d. ¥
Let us now consider the case where the right hand side of (P2 ) is given by an
arbitrary bounded linear functional F over V , or even that p is just a bounded
function, i.e. 0 < α ≤ p(x) ≤ A, ∀x ∈ [0, L]. Under such conditions as we know,
it is not possible to assert that u ∈ H2 (0, L). However we still can prove that
uh → u in the norm k·k1,2 as h goes to zero, according to
Theorem 3.7 The solution uh of (3.10) converges to the solution u of (P1 ) as h

goes to zero, under Assumptions 1 and 2.
Proof. We know that S is dense in V . Hence ∀ε > 0 there exists uε ∈ S such

that ku − uε k1,2 < ε/2.
Now we may use the interpolate Ih (uε ) ∈ Vh of uε , for which we have:
1
(1 + L2 /π 2 ) 2
kuε − Ih (uε )k1,2 ≤ Čh |uε |2,2 , Č = .
π
Hence for the given ε we may choose h small enough to have
ε
kuε − Ih (uε )k1,2 < .
2
Therefore inf ku − vk1,2 ≤ ku − Ih (uε )k1,2 < ε, provided h is small enough, for
v∈Vh
any ε > 0, and the result follows from bound (3.11). q.e.d. ¥
Theorem 3.8 Assume that |p0 | is bounded. Then there exists a constant C0 in-
dependent of h such that the solution of (P2 ) and (3.10) satisfy
ku − uh k0,2 ≤ C0 h |u|1,2 . (3.15)
Proof. First we note that since |p0 | is bounded the solution of (P1 ) belongs
H2 (0, L) for every right hand side in L2 (0, L). Hence we may apply Nitsche’s
trick and (3.14) holds.
On the other hand we have
M Mh i
ku − uh k21,2 ≤ 2
ku − Πh (u)k1,2 ≤ 2 2
ku − Πh (u)k1,2 + kΠh (u)k1,2 .
µ µ
Thus from the Pythagorean Theorem we obtain
M M
ku − uh k21,2 ≤ kuk21,2 ≤ 2
|u|21,2 , (3.16)
µ µC1
where C1 is the constant specified in Proposition 2.1. Combining (3.14) and (3.16)
we establish (3.15) with
s µ ¶
C̃M M L2
C0 = 1+ 2 .
πC1 µ π
q.e.d. ¥
Remark 3.9 Of course if the right hand side of (P1 ) belongs to L2 (0, L), under
the assumptions of Theorem 3.8 estimate (3.15) is suboptimal, since in this case
we may rather apply Theorem 3.6. ¥
3.4 More General Cases and Higher Order Methods
We conclude this chapter by considering other types of finite element methods

applied not only to model problem (P1 ), but also to some important variants of
this equation, such as:
For 0 < α ≤ p ≤ A and 0 ≤ q ≤ B:
(
−(pu0 )0 + qu = f
(The Dirichlet problem) (3.17)
u(0) = u(L) = 0
For 0 < α ≤ p ≤ A and 0 ≤ β ≤ q ≤ B:

(
−(pu0 )0 + qu = f
(The Neumann problem) (3.18)
(pu0 )(0) = (pu0 )(L) = 0
and

 0 0
 −(pu ) + qu = f

)
γu(0) − (pu0 )(0) = 0 (The Fourier problem) (3.19)

 γ>0
 γu(L) + (pu0 )(L) = 0
3.4 More General Cases and Higher Order Methods 53
The reader may easily establish the equivalence between these three problems
under the assumption f ∈ L2 (0, L), respectively with


 Find u ∈ H01 (0, L) such that

 Z L Z L Z L
0 0
pu v dx + quvdx = f vdx (3.20)




0 0 0
where H01 (0, L) = {v /v ∈ H1 (0, L), v(0) = v(L) = 0 }


 Find u ∈ H1 (0, L) such that
Z L Z L Z L
(3.21)


0 0
pu v dx + quvdx = f vdx, ∀v ∈ H1 (0, L)
0 0 0


 Find u ∈ H1 (0, L) such that
Z L Z L Z L


0 0
pu v dx + quvdx + γ [u(0)v(0) + u(L)v(L)] = f vdx, ∀v ∈ H1 (0, L)
0 0 0
(3.22)
He or she may also demonstrate that all those problems have a unique solution
under the respective assumptions on p, q and γ.
Whatever the case one may approximate u by means of the Ritz method, and in
particular by finite element methods such as the already presented piecewise linear
method. Actually in this particular case Vh is defined in the following manner:
Vh = span {ϕi }N −1
i=1 for (3.20)
Vh = span {ϕi }N
i=0 for (3.21) and (3.22).
Furthermore there is no essential difficulty to establish that like in the case of

(P2 ) the approximation uh ∈ Vh of u determined by this finite element method
satisfies, in all the three cases
ku − uh k1,2 ≤ Ĉ1 h |u|2,2 ,
and if |p0 | is bounded

ku − uh k0,2 ≤ C̃0 h2 |u|2,2
where C̃1 and C̃0 are constants independent of u and h.

Let us now consider higher order piecewise polynomial approximations of second

order linear differential equations of the type −(pu0 )0 + qu = f in (0, L), with
different boundary conditions leading to well-posed variational problems such as
those given in (P2 ),(3.20),(3.21) and (3.22).
Denoting by V the corresponding test function space, we define its approxima-
tion space Vh as follows.
For a given integer k > 1:
© ± ª
Vh = v v ∈ V, v|Ki ∈ Pk , ∀i = 1, 2, . . . , N (3.23)
Notice that in all the cases being considered v ∈ Vh requires that v ∈ C 0 [0, L],
and the fact that v satisfies the boundary conditions imposed to any function in
V . For instance in approximating (3.20) we must have v(0) = v(L) = 0, ∀v ∈ Vh ,
while v does not need to satisfy any boundary condition in the case of (3.21) and
(3.22).
If we take for instance k = 2, a convenient basis for Vh in the latter case is
{ϕi }2N
i=0 , where:
ϕ2j has the following form for 0 ≤ j ≤ N − 1
whith
and
ϕ2j−1 has the following form for 1 ≤ j ≤ N :
The reader may draw similar pictures for “natural” bases of Vh , constructed
upon similar principles for some values of k greater than or equal to 3.
Once we have defined the basis of Vh we may compute uh ∈ Vh , namely, the
corresponding Ritz approximation of u, thereby obtaining a system of linear equa-
tions A~c = f~ in all similar to the one generated with piecewise linear elements.
For instance, for k = 2, in the case of problem (3.20), A is a (2N − 1) × (2N − 1)
Z L
¡ 0 0 ¢
matrix whose coefficients aij are given by aij = pϕj ϕi + qϕj ϕi dx and f~ is
Z 0L
the vector of R2N −1 whose i − th component is f ϕi dx, ~c being the vector of
0
coefficients in the expansion of uh with respect to the basis {ϕi }2N −1
i=1 .
Finally the approximation errors ku − uh k1,2 and ku − uh k0,2 may be estimated
by means of analyses entirely analogous to those carried out in detail for piecewise
linear elements applied to (P2 ). The following theorems given without proofs
summarize the main results:
Theorem 3.10 Assume that the solution of the problems (P2 ),(3.20) (3.21),(3.22)
© ± ª
belongs to u ∈ Hk+1 (0, L) where ∀k ∈ N, Hk (0, L) = v v, v 0 , . . . , v (k) ∈ L2 (0, L) .
Then if the problem under consideration is approximated by



 Find uh ∈ Vh such that
Z L
(3.24)

 a(uh , v) = f vdx, ∀v ∈ Vh
0
Z L Z L
0 0
where Vh is given by (3.23), a is the bilinear form pu v dx + quvdx for
0 0
problems (P2 ),(3.20) and (3.21) or the same expression modified by the addition
of γ[u(0)v(0) + u(L)v(L)] for problem (3.22), then there exists a constant Ck
independent of u and h such that
ku − uh k1,2 ≤ Ck hk |u|k+1,2
° °
where |u|k+1,2 = °u(k+1) °0,2 .
Theorem 3.11 If in addition to the assumption of Theorem 3.10 we further as-

sume that p is such that the solution to any problem among (P2 ),(3.20) (3.21),(3.22)
belongs to H2 (0, L) whenever f ∈ L2 (0, L), then there exists a constant Ck0 inde-
pendent of u and h such that ku − uh k0,2 ≤ Ck0 hk+1 |u|k+1,2 .
Remark 3.12 By an analysis completely analogous to the one carried and for
problem (P2 ) one establishes that the only assumption on p necessary for Theorem
3.11 to hold is the boundedness of |p0 |. ¥
To conclude this Chapter we add that both theorems 3.10 and 3.11 extend
with minor modifications to the approximation of non homogeneous problems by
finite element methods constructed upon piecewise polynomials in the manner
described in this section and in the previous one, such as:
(
−(pu0 )0 + qu = f in (0, L)
(Non homogeneous Dirichlet problem)
u(0) = a1 , u(L) = a2
(
−(pu0 )0 + qu = f in (0, L)
(Non homogeneous Neumann problem)
(pu0 )(0) = b1 , (pu0 )(L) = b2

 0 0
 −(pu ) + qu = f in (0, L)

γu(0) − (pu0 )(0) = c1 , (Non homogeneous Fourier problem)


 γu(L) − (pu0 )(0) = c
2
(
−(pu0 )0 + qu = f in (0, L)
(Non homogeneous mixed problem)
u(0) = a1 , (pu0 )(L) = b2
among several other possible combinations of the above boundary conditions.
The way to deal with different types of non homogeneous boundary conditions
in a variational context should be exploited by means of exercises proposed as a
complement to these notes.
4
Extension to Multi-dimensional Problems
To conclude these lectures we present some main aspects of the application of

the finite element method to solve boundary value partial differential equations
defined in a bounded domain Ω ∈ Rn , with n ≥ 2. We take as a model two second
order linear equations, which we set in variational form before discretizing them
by means of the Ritz method based on continuous functions, whose restriction to
each element of a suitable partition of Ω are constructed upon polynomial bases.
This approach gives rise to the so-called Lagrange finite element method, whose
one-dimensional version was introduced in Chapter 3.
4.1 Sobolev Spaces Hm (Ω)
Let Ω be a bounded open set"of Rn , #that is, ∃C > 0 such that ∀~x = (x1 , x2, . . . ,
n
X
xn ) ∈ Ω, |~x| ≤ C, where |~x| = x2i .
i=1
We consider the space R(Ω) of functions of ~x ∈ Ω, which are Riemann integrable
in Ω. Let R(Ω) be the space of equivalent classes of functions which only differ
60 4. Extension to Multi-dimensional Problems
from eachother in a subset of Ω having a measure equal to zero, that is, a nullset
of Ω(1) .
Recalling that every function of R(Ω) is bounded and continuous in Ω, except
eventually at points forming a null set, we may define the following inner product
for R(Ω): Z
([u]| [v]) = u(~x)v(~x)dΩ,
Ω
where [v] denotes the equivalence class of v, with corresponding norm k[v]k0,2 =
1
([v]| [v])02 .
From now on we drop the brackets for simplicity, by (abusively) identifying the
class [v] ∈ R(Ω) with v itself (in so doing we identity R(Ω) with R(Ω)). Now we
define the space L2 (Ω) as the completion of R(Ω) with respect to the k·k0,2 norm.
By definition L2 (Ω) is then a complete space for this norm, more precisely, the
space of function whose squares are Lebesgue-integrable functions in Ω (cf. [10],
Lebesgue Integration and Measure, ’73).
We further introduce the following notation for the partial derivatives of a
function v defined in Ω: let α = (α1 , α2 , . . . , αn ) be an integer multiindex, i.e.,
n
X
αi ∈ N, αi ≥ 0, ∀i ∈ {1, 2, . . . , n}, and |α| = αi . We denote by ∂ α v the partial
i=1
derivative2 of v of order |α| given by:
∂ |α| v
∂ αv =
∂xα1 1 ∂xα2 2 . . . ∂xαnn
By definition we set ∂ α v = v for α = (0, 0, . . . , 0).
Then for m ∈ N we define the Sobolev space Hm (Ω) for m ∈ N, by:
© ± ª
Hm (Ω) = v ∂ α v ∈ L2 (Ω), ∀α such that |α| ≤ m .
1 More precisely a nullset of Ω is a subset of Ω that does not contain any open ball of Ω, that is, a set of the
form B(~x, ε) = {~
y /|~
x−~y | < ε }, for certain ε ∈ R, ε > 0, and ~
x ∈ Ω (which incidentally is not a nullset).
2 Rigorously we should give a precise sense to such partial derivatives, since we do not necessarily mean usual
ones defined at every point of Ω, that is, strong partial derivatives. Actually when dealing with Sobolev spaces
we have to work with the so-called weak derivatives, i.e., derivatives in the sense of distributions. However it
turns out that the latter coincide with the former almost everywhere for functions in Hm (Ω) (cf. [1], Sobolev
Spaces,’75 )
4.1 Sobolev Spaces Hm (Ω) 61
One may easily establish that the mapping (·| ·)m : Hm (Ω) × Hm (Ω) → R given
by
X Z
(u| v)m = ∂ α u∂ α vdΩ
|α|≤m Ω
defines an inner product in Hm (Ω).
Remark 4.1 Unlike the case of R(Ω) condition (I) holds in the strict sense for
Hm (Ω), since ∂ α v = 0 a. e. in Ω,Z ∀α with |α| = 1 implies that v is constant and
this constant must be zero since v 2 dΩ = 0. ¥
Ω
We denote the norm associated with (·| ·)m by k·km,2 and by |·|m,2 the seminorm
2
X Z
given by |·|m,2 = |∂ α (·)|2 dΩ.
|α|≤m Ω
We also denote the usual Euclidian inner product of two vectors ~x and ~y of Rn
by ~x · ~y . In so doing we have in particular for H1 (Ω):
Z ³ ´
(u| v)1 = ~ ~
uv + grad u · grad v dΩ
Ω
·Z µ ¯ ¯2 ¶ ¸ 12
2 ¯ ~ ¯
kvk1,2 = |v| + ¯grad v ¯ dΩ
Ω
and
·Z µ¯ ¯2 ¶ ¸ 21
¯ ~ ¯
|v|1,2 = ¯grad v ¯ dΩ .
Ω
It is possible to prove that Hm (Ω) is a Hilbert space when equipped with inner
product (·| ·)m .
Let now Ω̄ be the closure of Ω and Γ be the boundary of Ω. If Γ is “sufficiently
smooth”3 , by extending functions of H1 (Ω) to Γ in the natural way, we may define
its closed subspace H01 (Ω), namely
© ± ª
H01 (Ω) = v v ∈ H1 (Ω), v = 0 on Γ .
3 Here this expression rules out any domain situated on both sides of a portion of its boundary such as the

set Ω = ~x ∈ R2 /|~
x| < 1, x1 6= x2 for x1 ≥ 0 .
Such “natural” extension is actually the so-called trace of the function v ∈

1
H (Ω) on Γ. It has an obvious definition in the case of bounded continuous func-
tions in Ω. However, unlike functions of H1 (0, L), those in H1 (Ω) for Ω ⊂ Rn with
n ≥ 2 need to be neither continuous nor bounded. To understand this, one may
© ± ª k
take for instance Ω = ~x ∈ R2 |~x| < 12 and v(~x) = |log |~x|| 2 with k ∈ (0, 1): v
does belong to H1 (Ω) but is singular at the origin.
Nevertheless spaces of smooth functions such as C 1 (Ω̄) are dense in H1 (Ω),
and we may then define the extension of a function v ∈ H1 (Ω) to Γ as the limit
in an appropriate sense of the restrictions to Γ of functions forming a sequence
contained in such spaces, that converges to v in the norm of H1 (Ω).
Similarly to the case of H01 (0, L), the following Friedrichs-Poincaré inequality
holds for H01 (Ω), that is:
∃Cp (Ω) > 0 such that kvk0,2 ≤ Cp (Ω) |v|1,2 , ∀v ∈ H01 (Ω) (4.1)
This implies in particular that |·|1,2 is a norm equivalent to the standard norm
k·k1,2 over H01 (Ω).
As a consequence, H01 (Ω) equipped with the inner product
Z
((u| v))1 = ~ u · grad
grad ~ vdΩ
Ω
is a Hilbert space.
4.2 Green’s Formulae
In order to study variational formulations of partial differential equations, the

following result, giving rise to a series of integration formulae known as Green’s
formulae, is fundamental:
Let ~x ∈ Γ and ~ν (~x) = (ν 1 , ν 2 , . . . , ν n ) be the unit vector normal to Γ and
directed outwards Ω..
4.2 Green’s Formulae 63
If Γ is sufficiently smooth we have

Z I
∂v
dΩ = vν i dΓ, i = 1, 2, . . . , n (4.2)
Ω ∂xi Γ
where the “trace” of v on Γ is defined in the sense

described in the previous section.
From relation (4.2) we easily derive the following
results (for sufficiently smooth Γ).
Xn
1 n ∂vi
♦ If ~v ∈ [H (Ω)] , for div ~v = we have:
i=1
∂xi
Z I
div ~v dΩ = ~v · ~ν dΓ. (4.3)
Ω Γ
n
♦ If ~u ∈ [H1 (Ω)] and q ∈ H1 (Ω) we have:
Z I Z
~
~u · grad qdΩ = q~u · ~v dΓ − qdiv ~udΩ (4.4)
Ω Γ Ω
∂u ~ u·~ν ,
♦ Denoting by the normal derivative of u ∈ H2 (Ω) on Γ given by grad
∂ν
for v ∈ H1 (Ω) the so-called First Green’s formula holds:
Z I Z
~ ~ ∂u
grad u · grad vdΩ = vdΓ − ∆uvdΩ (4.5)
Ω Γ ∂ν Ω
n
X ∂ 2u
where ∆u is the laplacian of u, namely, ∆u = .
i=1
∂x2i
While (4.3) is a trivial consequence of (4.2),(4.4) follows from (4.3) by taking

~ q + qdiv ~u. Finally (4.5) results from (4.4)
~v = q~u, which yields div ~v = ~u · grad
~ u, q = v, and noting that div grad
by taking ~u = grad ~ u = ∆u.
Remark 4.2 The fact that all the above integrals make sense under the assump-
tion that the different functions and fields involved belong to the given spaces,
follows from the Schwartz inequality. Indeed it implies that the product of two
square integrable functions in a bounded domain is integrable in this domain. Ob-

serve also the further assumption that all the functions involved in integrals on Γ
are square integrable (on Γ), which implies that all the corresponding integrands
are integrable on Γ, by the same reason. In particular the trace of a function of
∂v
H1 (Ω) is square integrable on Γ, which implies that the trace of on Γ is also
∂xi
∂v
square integrable, provided v ∈ H2 (Ω), since ∀i, ∈ H1 (Ω) in this case. For
∂xi
more details we refer to [1], Sobolev Spaces,’75. ¥
4.3 Model Variational Problems in N-dimensional Spaces
Let us now consider a model, two variational problems corresponding to linear

elliptic partial differential equations in an n-dimensional bounded domain Ω, for
n = 2 or n = 3, namely:
1. The homogeneous Dirichlet problem for the −∆ operator:

Here V = H01 (Ω) and f ∈ L2 (Ω). We wish to solve (3.1) with
Z
a(u, v) = ~ u · grad
grad ~ vdΩ
Ω
Z
F (v) = f vdΩ.
Ω
In this case Theorem 3.1 of the previous Chapter directly applies, since
a(·, ·) is nothing but the inner product ((·| ·))1 for which H01 (Ω) is a Hilbert
space. Moreover
|F (v)| ≤ kf k0,2 kvk0,2 ≤ CF |v|1,2 , ∀v ∈ H01 (Ω) with CF = Cp (Ω) kf k0,2 .
Let us now show that in this case problem (3.1) is equivalent to the partial
differential equation
(
−∆u = f a.e. in Ω
(4.6)
u = 0 a.e. on Γ
4.3 Model Variational Problems in N-dimensional Spaces 65
Although this is by no means essential, we will do it by assuming that the

solution u to the variational problem belongs to H2 (Ω), which is always true
if Ω is convex.
In this case we may apply (4.5), thereby obtaining:
I Z Z
∂u
a(u, v) = vdΓ − ∆uvdΩ = F (v) = f vdΩ, ∀v ∈ H01 (Ω).
Γ ∂ν Ω Ω
Since the integral on Γ vanishes, we have:

Z
(−∆u − f )vdΩ = 0, ∀v ∈ H01 (Ω).
Ω
Here we may use the following density result:
H01 (Ω) is dense in L2 (Ω) equipped with the norm k·k0,2 .
Although this assertion would not be surprising if H01 (Ω) were replaced with
H1 (Ω), this density result can be established by means of arguments much
similar to those employed to prove that S̄ = V in Chapter 2.
Now since ∆u + f ∈ L2 (Ω), ∀v ∈ L2 (Ω) we may take {vn } ⊂ H01 (Ω) such
that kvn − vk0,2 → 0. This yields (∀v ∈ L2 (Ω))
Z Z
(∆u + f )vdΩ = (∆u + f )(v − vn )dΩ ≤ k∆u + f k0,2 kv − vn k0,2
Ω Ω
and hence Z
(∆u + f )vdΩ = 0, ∀v ∈ L2 (Ω),
Ω
which implies that ∆u + f = 0 a.e. in Ω.

Conversely, assuming that the solution of (4.6) belongs to H2 (Ω), we may
rebuild the original variational problem the other way around, by multi-
plying both sides of the differential equation with an arbitrary v ∈ H01 (Ω),
integrating the result over Ω, and then applying first Green’s formula. In so
doing the former is seen to be equivalent to the latter, at least as long as
we can assert that u ∈ H2 (Ω).
2. The homogeneous Neumann problem for the operator −∆ + I (I being the

identity operator):
Let V = H1 (Ω), f ∈ L2 (Ω) and (3.1) be such that
Z h i
a(u, v) = ~ u · grad
grad ~ v + uv dΩ
Ω
Z
F (v) = f vdΩ.
Ω
Here again Theorem 3.1 of Chapter 3 applies directly, as a(·, ·) is nothing but
the standard inner product of H1 (Ω) and F (v) ≤ CF kvk1,2 , ∀v ∈ H1 (Ω),
with CF = kf k0,2 .
In the same manner as for the Dirichlet problem, we may show that, pro-
vided the solution to the so-defined variational problem belongs to H2 (Ω)
- which is true here again at least if Ω is convex - then u is the solution of
the partial differential equation

 −∆u + u = f a.e. in Ω
(4.7)
 ∂u = 0 a.e. on Γ
∂ν
The reader may verify that, conversely, any solution of (4.7) belonging to
H2 (Ω) is a solution of the variational problem
a(u, v) ≡ (u| v)1 = (f | v)0 ≡ F (v), ∀v ∈ H1 (Ω).
Remark 4.3 The equivalence of both equations (4.6) and (4.7) with the corre-
sponding problem (3.1) holds under assumption much weaker than u ∈ H2 (Ω).
However in order to establish it for much less smooth boundaries Γ, than for con-
vex domains, one must use some regularity results for elliptic partial differential
equations that lie beyond the scope of these notes. ¥
As a final point addressed in this section we recall that, according to Theorem

3.2 of the previous Chapter, the variational problem of the two examples we just
considered, are respectively equivalent to the following minimization problems:
4.4 Lagrange Finite Element Solution Methods 67
1. Find u ∈ H01 (Ω) such that ∀v ∈ H01 (Ω)

Z ¯ ¯2 Z Z ¯ ¯2 Z
1 ¯ ~ ¯ 1 ¯ ~ ¯
¯grad u¯ dΩ − f udΩ ≤ ¯grad v ¯ dΩ − f vdΩ;
2 Ω Ω 2 Ω Ω
2. Find u ∈ H1 (Ω) such that ∀v ∈ H1 (Ω)

·Z ¯ ¯2 ¸ Z Z ·¯ ¯2 ¸ Z
1 ¯ ~ ¯ 2 1 ¯ ~ ¯ 2
¯grad u¯ + u dΩ − f udΩ ≤ ¯grad v ¯ + v dΩ − f vdΩ.
2 Ω Ω 2 Ω Ω
4.4 Lagrange Finite Element Solution Methods
The Ritz method is again a suitable numerical approach to solve the model
problems considered in the previous sections, as it is for a very large spectrum of
similar boundary value problems for partial differential equations.
Recalling the main principles of this technique described in the previous Chap-
ter, we go directly into the description of the particular finite dimensional spaces
VN with corresponding bases, associated with the most classical and popular so-
lution method os this class: the Lagrange finite element method.
The Ritz method corresponds to the choice of particular spaces VN , namely,
N −dimensional spaces Vh consisting of continuous functions whose restriction
to every element of a partition of Ω into subset of a given shape fulfilling some
compatibility conditions at the interfaces, are functions of a given type as well.
Vh should possibly be a subspace of the space of test functions V , i.e., H01 (Ω) or
H1 (Ω). Since in the former case it is not so easy to construct Vh ⊂ V in general, in
order to simplify the presentation we shall assume henceforth that Ω is a polygon
for n = 2 and a polyhedron for n = 3.
In this case we consider only the following types of partitions:
• Th consists of closed sets T which are triangles if n = 2 and tetrahedrons if

n = 3;
• The intersection of two elements of Th is either empty, a common vertex, a

common edge, or a common face (for n = 3 only);
[
• = Ω̄.
T ∈Th
The above conditions imply that ∀T ∈ Th , T ∩ Γ is either the empty set or a

subset of the set of vertices, edges or faces (for n = 3 only) of T .
In so doing we define for the Neumann problem:
© ± ª
Vh = v v|T ∈ Pk , ∀T ∈ Th , v ∈ C 0 (Ω̄)
and for the Dirichlet problem we take the intersection of the above space with
H01 (Ω), whereby Pk denotes the space of polynomials in n variables of degree less
than or equal to k.
In order to illustrate how one may compute with such spaces, we consider the
case K = 1: letting {Si }K
i=1 be the set of vertices of the elements of Th numbered
in such a way that the last L vertices are those belonging to Γ a natural choice
for the basis B of Vh , according to the case, is:
√
For the Dirichlet problem: B = {ϕi }N
i=1 with N = K − L.
√
For the Neumann problem: B = {ϕi }N
i=1 with N = K,
Where ϕi (Sj ) = δ ij for 1 ≤ i, j ≤ N .

Once we have defined an appropriate basis B = {ϕi }N
i=1 for Vh , in the general
case, with N = dimVh , we are led to a system of equations for the components
c1 , c2 , , cN of uh ∈ Vh with respect to B, namely, the solution of:
(
Find uh ∈ Vh such that
(4.8)
a(uh , v) = F (v), ∀v ∈ Vh
This system equivalent to (4.8) is again A~c = f~, where aij = a(ϕj , ϕi ), ~c =
(c1 , c2 , . . . , cN ) and fi = F (ϕi ), 1 ≤ i, j ≤ N .
Now according to Theorem 3.1 of Chapter 3, the following error bound applies
to both model problems:
∃C > 0 such that ku − uh k1,2 ≤ C inf ku − vk1,2

v∈Vh
£ ¤− 1
where C = 1 in the case of the Neumann problem, and C = 1 + Cp2 (Ω) 2 in the
case of the Dirichlet problem.
Whenever u is continuous and bounded , one may conveniently bound above the
infimum term by ku − Ih (u)k1,2 , where Ih (u) is the function of Vh that interpolates
u at the nodes of the partition Th associated with the basis functions of the
Lagrange finite element method (for instance in the case of piecewise linear finite
elements, these are the vertices of the triangles or of the tetrahedrons of Th ).
This choice is not always possible for n ≥ 2, since a function of H1 (Ω) is not
necessarily bounded or continuous, and hence this interpolate might simply not
exist. However here we may apply the Sobolev Embedding Theorem (cf. [1], Sobolev
Spaces, ’75 ) which states that whenever u ∈ H2 (Ω), then u ∈ C 0 (Ω) ∩ B(Ω̄) for
n = 2 and n = 3, where B(Ω̄) is the space of bounded functions in Ω̄. Actually
since we are only considering polygonal or polyhedral domains we always have
u ∈ H2 (Ω) at least when Ω is convex, as previously stated.
Anyway, provided u ∈ H2 (Ω) we have
ku − uh k1,2 ≤ C ku − Ih (u)k1,2 .
Let now hT = max |~x − ~y |, and h = max hT . We also define ρT to be the

~
x,~
y ∈T T ∈T
radius of the circle or sphere inscribed in T ∈ Th (that is ρT = max {ρ}). We
B(~
x,ρ)⊂T
shall make the following assumption on an infinite family of partition{Th }h that
we supposedly employ to solve a corresponding sequence of problems (4.8) with
h ≤ h0 , ∀Th .
ρT
∃ a constant c > 0 such that ≥ c, ∀T ∈ Th and ∀h. (4.9)
hT
Such family of partitions is called quasiuniform.

Then we can prove the following error estimate for both model problems con-
sidered in this Chapter: If u ∈ Hk+1 (Ω) there exists a constant CM independent
of u and h such that:
ku − uh k1,2 ≤ CM hk |u|k+1,2 . (4.10)
Estimate implies in particular that if the quasiuniform family of partitions

{Th }h is such that h → 0, then uh converges to u in H1 (Ω) as h goes to zero.
Moreover (4.10) also implies that uh is the exact solution u, ∀h, whenever u is a
polynomial of degree less than or equal to k in Ω.
At this point we illustrate the importance of the assumption that {Th }h be a
quasiuniform through the above example given by [9] ’73:
Take n = 2 and let the triangle T with vertices (0, h2 ), (0, − h2 ) and (λ, 0) be
an element of Th . Letting λ = h2 we have hT = h and (4.9) does not hold, for:
¡√ ¢
ρT = 41 1 + 4h2 − 1 = h2 /2 + O(h3 ).
Now consider the function v(x1 , x2 ) = x22 and piecewise linear elements. The re-
striction of Ih (v) to T is the linear function IT v such that IT v(0, h2 ) = IT v(0, − h2 ) =
h2 /4 and IT v(λ, 0) = 0.
∂(IT v) ∂(IT v) ∂v ∂v
We have = −1/4 and = 0, whereas = 0 and = 2x2 .
∂x1 ∂x2 ∂x1 ∂x1
·Z ¯ ¯2 ¸ 1 · Z ¸ 21
def ¯ ~ ¯ 2 1 2
This yields |v − IT v|1,2,T = ¯grad (v − IT v)¯ dx1 dx2 ≥ (− ) dx1 dx2 =
T T 4
p   21   21
X Z X Z
area(T ) def
, while |v|2,2,T =  |∂ α v|2 dx1 dx2  =  22 dx1 dx2  ≤
4 T T
|α|=2 |α|=2
8 |v − IT v|1,2,T . This indicates that
" # 12
X
ku − Ih (u)k1,2 ≥ |u − IT u|21,2,T
T ∈T
can be no means be bounded above by a term of the form CM h |u|2,2 with CM

independent of u and h!
To conclude we give an estimate of the error measured in the k·k0,2 norm, in the
case where Ω is convex, so that the solutions of the partial differential equations
(4.6) and (4.7) belong to H2 (Ω) for every f ∈ L2 (Ω).
Here again we may use Nitsche’s trick and establish that there exists a constant
0
CM independent of u and h such that
0
ku − uh k0,2 ≤ CM hk+1 |u|k+1,2 (4.11)
provided u ∈ Hk+1 (Ω).

The proof of error estimate (4.10) requires in general fairly more sophisticated
tools than those employed for one dimensional problems. Nevertheless they are
definitively analogous and qualitatively comparable to those given in Chapter 3.
For further details we refer to several available works on the subject, among which
CIARLET [2] certainly lies among the most celebrated ones.
References
[1] ADAMS, R. A., Sobolev Spaces, Academic Press, New York, 1975.
[2] CIARLET, P. G., The Finite Element Method for Elliptic Problems, North
Holland, Amsterdam, 1976.
[3] COLLATZ, L., Funktionalanalysis und Numerische Mathematik, Springer

Verlag, Berlim, 1964.
[4] COURANT, R. & HILBERT, D., Methods of Mathematical Physics, John

Wiley, N. Y., 1962.
[5] DAUTRAY, R. & LIONS, J. L., Mathematical Analysis and Numerical Meth-
ods for Science and Technology, Springer Verlag, 1993.
[6] LUENBERGER, D. G., Optimization by Vector Space Methods, John Wiley

& Sons Inc., New York, 1969.
[7] MIKHLIN, S. G., Variational Methods in Mathematical Physics, Moscow,

1957.
74 References
[8] SCHWARTZ, L., Théorie des Distributions, Hermann et Cie. editions, Paris,
1950.
[9] STRANG, G. & FIX, G. J., An Analysis of the Finite Element Method,
Prentice Hall, Englewood Cliffs, N. J., 1973.
[10] WEIR,A. J., Lebesgue Integration and Measure, Cambridge University Press,
Cambridge, 1974
[11] Ruas, V., Uma Introdução aos Problemas Variacionais, Editora Guanabara
Dois S. A., Rio de Janeiro, 1979.

An Introduction To The Mathematical Foundations of The Finite Element Method

Uploaded by

Copyright:

Available Formats

You might also like

An Introduction To The Mathematical Foundations of The Finite Element Method

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

An Introduction To The Mathematical Foundations of The Finite Element Method

Uploaded by

Copyright:

Available Formats

An Introduction to the Mathematical

Foundations of the Finite Element Method

October 23, 2002

Aos meus bons amigos da

2 Some Concepts of Functional Analysis 11

3 The Finite Element Method in One Dimension Space 37

4 Extension to Multi-dimensional Problems 59

The finite element method is nowadays a widespread tool to solve Engineering

As it is well-known, the finite element method is employed in most cases, to

A ≥ p(x) ≥ α > 0 and B ≥ q(x) ≥ β ≥ 0, ∀x ∈ [0, L]

Moreover temporarily we shall further assume that p is continuous in (0, L)

where L2 (0, L) = {f /f 2 has a finite (Lebesgue) integral in (0, L) }

Remark 1.1 As proven later, every v such that v 0 ∈ L2 (0, L) is continuous.

−(pu0 )0 + qu = f almost everywhere in (0, L)

From this result (1.1) simply becomes

p(L)u0 (L)v(L) − p(0)u0 (0)v(0) = 0, ∀v ∈ V

equal to P applied at the point given by x = L2 .2 Of course if we stick to the first

corresponding solution, the latter writes:

the function u00 must be defined as an element of L2 (0, L).

3 Now take f = (pu0 )0 and g ≡ 1.

As a matter of fact S is a subspace of the classical Sobolev space H2 (0, L)

2.1 Normed Functions Spaces

(i). kvk ≥ 0, ∀v ∈ V and kvk = 0 if and only if v = ~0V ;

(ii). kαvk = |α| kvk , ∀α ∈ R,∀v ∈ V ;

(iii) kv1 + v2 k ≤ kv1 k + kv2 k , ∀v1 , v2 ∈ V ;

the closed interval [0, L]:

kf k0,2 = 0 ⇒ f ∈ N (0, L), or yet [f ] = [O]

2.2 Inner Product Spaces

We shall be particularly concerned about a class of normed spaces whose norm

(I). (v|v) ≥ 0, ∀v ∈ V and (v|v) = 0 if and only if v = ~0V ;

(II). (v|u) = (u|v), ∀u, v ∈ V ;

Indeed, owing to properties (i),(ii) and (iii) we have ∀u, v ∈ V and ∀α ∈ R:

(u|u) + 2α(u|v) + α2 (v|v) ≥ 0

the result follows.

both belonging to C 0 [0, L].

corresponds to no norm. Indeed, ((v|v))1 = 0 implies that v is an arbitrary con-

C1 kvk1,2 ≤ |v|1,2 ≤ C2 kvk1,2 , ∀v ∈ V

Lemma 2.2 (The Friedrichs-Poincaré inequality) There exists a constant

Proof. ∀x ∈ [0, L] we have:

which implies that

from which we obtain

This proves the lemma with Cp = 2L. q.e.d. ¥

Thus taking v = u we readily conclude that u = ~0V . Therefore ~0V ∈ S ⊥ whatever

Lemma 2.6 (Pythagorean Theorem) If u⊥v then

ku + vk2 = kuk2 + kvk2

Proof. Straightforward using the properties of the norm k·k = (˙ · |·) 2 .

Proposition 2.7 Let v0 be a given element of V , v0 6= ~0V . Then ∀v ∈ V there

2.3 Closed Sets in Normed Spaces

Let S : N → V be a given sequence (v1 , v2 , ..., vn , ...) of elements of V also

Definition 2.8 A sequence {vn } contained in a normed real space V is said to

Definition 2.9 A subset S of a normed real vector space V is said to be a closed

Clearly enough V itself is a closed set, as is the empty set ∅ by definition. A

Proposition 2.10 ∀S ⊂ V, S ⊥ is a closed subspace of V .

Proof. Let {vn } ⊂ S ⊥ be a sequence converging to v0 ∈ V . We have (vn |v) = 0,

|(v0 |v)| ≤ kvn − vk kvk , ∀v ∈ S, and ∀n ∈ N

Since by assumption kvn − v0 k → 0 we must necessarily have (v0 |v) = 0, ∀v ∈ V .

Clearly S ⊂ S̄ and if S is a closed set S̄ = S. However in general S̄ 6= S and

Moreover we have ∀x ∈ [0, L]: