
Adaptive Finite Element Methods

Lecture Notes Winter Term 2011/12


R. Verfürth
Fakultät für Mathematik, Ruhr-Universität Bochum
Contents
Chapter I. Introduction 7
I.1. Motivation 7
I.2. One-dimensional Sobolev and finite element spaces 11
I.2.1. Sturm-Liouville problems 11
I.2.2. Idea of the variational formulation 11
I.2.3. Weak derivatives 12
I.2.4. Sobolev spaces and norms 12
I.2.5. Variational formulation of the Sturm-Liouville problem 13
I.2.6. Finite element spaces 14
I.2.7. Finite element discretization of the Sturm-Liouville problem 14
I.2.8. Nodal basis functions 15
I.3. Multi-dimensional Sobolev and finite element spaces 15
I.3.1. Domains and functions 15
I.3.2. Differentiation of products 16
I.3.3. Integration by parts formulae 16
I.3.4. Weak derivatives 16
I.3.5. Sobolev spaces and norms 17
I.3.6. Friedrichs and Poincaré inequalities 18
I.3.7. Finite element partitions 19
I.3.8. Finite element spaces 21
I.3.9. Approximation properties 22
I.3.10. Nodal shape functions 23
I.3.11. A quasi-interpolation operator 25
I.3.12. Bubble functions 26
Chapter II. A posteriori error estimates 29
II.1. A residual error estimator for the model problem 30
II.1.1. The model problem 30
II.1.2. Variational formulation 30
II.1.3. Finite element discretization 30
II.1.4. Equivalence of error and residual 30
II.1.5. Galerkin orthogonality 31
II.1.6. $L^2$-representation of the residual 32
II.1.7. Upper error bound 33
II.1.8. Lower error bound 35
II.1.9. Residual a posteriori error estimate 38
II.2. A catalogue of error estimators for the model problem 40
II.2.1. Solution of auxiliary local discrete problems 40
II.2.2. Hierarchical error estimates 46
II.2.3. Averaging techniques 51
II.2.4. Equilibrated residuals 53
II.2.5. H(div)-lifting 57
II.2.6. Asymptotic exactness 60
II.2.7. Convergence 62
II.3. Elliptic problems 62
II.3.1. Scalar linear elliptic equations 62
II.3.2. Mixed formulation of the Poisson equation 65
II.3.3. Displacement form of the equations of linearized elasticity 68
II.3.4. Mixed formulation of the equations of linearized elasticity 70
II.3.5. Non-linear problems 76
II.4. Parabolic problems 77
II.4.1. Scalar linear parabolic equations 77
II.4.2. Variational formulation 79
II.4.3. An overview of discretization methods for parabolic equations 79
II.4.4. Space-time finite elements 80
II.4.5. Finite element discretization 81
II.4.6. A preliminary residual error estimator 82
II.4.7. A residual error estimator for the case of small convection 84
II.4.8. A residual error estimator for the case of large convection 84
II.4.9. Space-time adaptivity 85
II.4.10. The method of characteristics 87
II.4.11. Finite volume methods 89
II.4.12. Discontinuous Galerkin methods 95
Chapter III. Implementation 97
III.1. Mesh-refinement techniques 97
III.1.1. Marking strategies 97
III.1.2. Regular refinement 100
III.1.3. Additional refinement 101
III.1.4. Marked edge bisection 103
III.1.5. Mesh-coarsening 103
III.1.6. Mesh-smoothing 105
III.2. Data structures 108
III.2.1. Nodes 109
III.2.2. Elements 109
III.2.3. Grid hierarchy 110
III.3. Numerical examples 110
Chapter IV. Solution of the discrete problems 119
IV.1. Overview 119
IV.2. Classical iterative solvers 122
IV.3. Conjugate gradient algorithms 123
IV.3.1. The conjugate gradient algorithm 123
IV.3.2. The preconditioned conjugate gradient algorithm 126
IV.3.3. Non-symmetric and indefinite problems 129
IV.4. Multigrid algorithms 131
IV.4.1. The multigrid algorithm 131
IV.4.2. Smoothing 134
IV.4.3. Prolongation 135
IV.4.4. Restriction 136
Bibliography 139
Index 141
CHAPTER I
Introduction
I.1. Motivation
In the numerical solution of practical problems of physics or engineering such as, e.g., computational fluid dynamics, elasticity, or semiconductor device simulation one often encounters the difficulty that the overall accuracy of the numerical approximation is deteriorated by local singularities such as, e.g., singularities arising from re-entrant corners, interior or boundary layers, or sharp shock-like fronts. An obvious remedy is to refine the discretization near the critical regions, i.e., to place more grid-points where the solution is less regular. The question then is how to identify those regions and how to obtain a good balance between the refined and unrefined regions such that the overall accuracy is optimal.

Another closely related problem is to obtain reliable estimates of the accuracy of the computed numerical solution. A priori error estimates, as provided, e.g., by the standard error analysis for finite element or finite difference methods, are often insufficient since they only yield information on the asymptotic error behaviour and require regularity conditions of the solution which are not satisfied in the presence of singularities as described above.

These considerations clearly show the need for an error estimator which can a posteriori be extracted from the computed numerical solution and the given data of the problem. Of course, the calculation of the a posteriori error estimate should be far less expensive than the computation of the numerical solution. Moreover, the error estimator should be local and should yield reliable upper and lower bounds for the true error in a user-specified norm. In this context one should note that global upper bounds are sufficient to obtain a numerical solution with an accuracy below a prescribed tolerance. Local lower bounds, however, are necessary to ensure that the grid is correctly refined so that one obtains a numerical solution with a prescribed tolerance using a (nearly) minimal number of grid-points.
Disposing of an a posteriori error estimator, an adaptive mesh-refinement process has the following general structure:

Algorithm I.1.1. (General adaptive algorithm)
(0) Given: The data of a partial differential equation and a tolerance $\varepsilon$.
    Sought: A numerical solution with an error less than $\varepsilon$.
(1) Construct an initial coarse mesh $\mathcal{T}_0$ representing sufficiently well the geometry and data of the problem. Set $k = 0$.
(2) Solve the discrete problem on $\mathcal{T}_k$.
(3) For each element $K$ in $\mathcal{T}_k$ compute an a posteriori error estimate.
(4) If the estimated global error is less than $\varepsilon$ then stop. Otherwise decide which elements have to be refined and construct the next mesh $\mathcal{T}_{k+1}$. Replace $k$ by $k+1$ and return to step (2).
The above algorithm is best suited for stationary problems. For transient calculations, some changes have to be made:
- The accuracy of the computed numerical solution has to be estimated every few time-steps.
- The refinement process in space should be coupled with a time-step control.
- A partial coarsening of the mesh might be necessary.
- Occasionally, a complete re-meshing could be desirable.

In both stationary and transient problems, the refinement and unrefinement process may also be coupled with or replaced by a moving-point technique, which keeps the number of grid-points constant but changes their relative location.

In order to make Algorithm I.1.1 operative we must specify
- a discretization method,
- a solver for the discrete problems,
- an error estimator which furnishes the a posteriori error estimate,
- a refinement strategy which determines which elements have to be refined or coarsened and how this has to be done.

The first point is a standard one and is not the objective of these lecture notes. The second point will be addressed in chapter IV (p. 119). The third point is the objective of chapter II (p. 29). The last point will be addressed in chapter III (p. 97).
In order to get a first impression of the capabilities of such an adaptive refinement strategy, we consider a simple, but typical example. We are looking for a function $u$ which is harmonic, i.e. satisfies
$$\Delta u = 0,$$
in the interior of a circular segment centered at the origin with radius 1 and angle $\frac{3\pi}{2}$, which vanishes on the straight parts $\Gamma_D$ of the boundary $\Gamma$, and which has normal derivative $\frac{2}{3}\sin(\frac{2}{3}\varphi)$ on the curved part $\Gamma_N$ of $\Gamma$. Using polar co-ordinates, one easily checks that
$$u = r^{2/3}\sin\bigl(\tfrac{2}{3}\varphi\bigr).$$
Figure I.1.1. Triangulation obtained by uniform refinement
We compute the Ritz projections $u_{\mathcal{T}}$ of $u$ onto the spaces of continuous piecewise linear finite elements corresponding to the two triangulations shown in figures I.1.1 and I.1.2, i.e., solve the problem:

Find a continuous piecewise linear function $u_{\mathcal{T}}$ such that
$$\int_{\Omega} \nabla u_{\mathcal{T}} \cdot \nabla v_{\mathcal{T}} = \int_{\Gamma_N} \frac{2}{3}\sin\bigl(\tfrac{2}{3}\varphi\bigr)\, v_{\mathcal{T}}$$
holds for all continuous piecewise linear functions $v_{\mathcal{T}}$.
The triangulation of figure I.1.1 is obtained by five uniform refinements of an initial triangulation $\mathcal{T}_0$ which consists of three right-angled isosceles triangles with short sides of unit length. In each refinement step every triangle is cut into four new ones by connecting the midpoints of its edges. Moreover, the midpoint of an edge having its two endpoints on $\Gamma$ is projected onto $\Gamma$. The triangulation in figure I.1.2 is obtained from $\mathcal{T}_0$ by applying six steps of the adaptive refinement strategy described above using the error estimator $\eta_{R,K}$ of section II.1.9 (p. 38). A triangle $K \in \mathcal{T}_k$ is divided into four new ones if
$$\eta_{R,K} \ge 0.5 \max_{K' \in \mathcal{T}_k} \eta_{R,K'}$$
(cf. algorithm III.1.1 (p. 97)). Midpoints of edges having their two endpoints on $\Gamma$ are again projected onto $\Gamma$. For both meshes we list in table I.1.1 the number NT of triangles, the number NN of unknowns, and the relative error
$$\varepsilon = \frac{\|\nabla(u - u_{\mathcal{T}})\|}{\|\nabla u\|}.$$
It clearly shows the advantages of the adaptive renement strategy.
Table I.1.1. Number of triangles NT and of unknowns NN and relative error $\varepsilon$ for uniform and adaptive refinement

refinement    NT     NN     $\varepsilon$
uniform      6144   2945   0.8%
adaptive     3296   1597   0.9%
Figure I.1.2. Triangulation obtained by adaptive refinement
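The criterion $\eta_{R,K} \ge 0.5\,\max_{K'} \eta_{R,K'}$ used in this example is the maximum marking strategy of algorithm III.1.1 (p. 97). A minimal sketch; the element names and estimator values in the usage example are made-up data, not taken from the notes:

```python
def mark_maximum_strategy(eta, theta=0.5):
    """Return the elements K whose local estimate satisfies
    eta[K] >= theta * max_{K'} eta[K'] (maximum strategy)."""
    threshold = theta * max(eta.values())
    return {K for K, val in eta.items() if val >= threshold}
```

For instance, with `eta = {'K1': 1.0, 'K2': 0.6, 'K3': 0.1}` and the default `theta = 0.5`, only `K1` and `K2` are marked for refinement.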
Some of the methods which are presented in these lecture notes are demonstrated in the Java applet ALF (Adaptive Linear Finite elements). It is available under the address
http://www.rub.de/num1/files/ALF.html .
A user guide can be found in pdf-form at
http://www.rub.de/num1/files/ALFUserGuide.pdf .

ALF in particular offers the following options:
- various domains including those with curved boundaries,
- various coarsest meshes,
- various differential equations, in particular
  - the Poisson equation with smooth and singular solutions,
  - reaction-diffusion equations, in particular those having solutions with interior layers,
  - convection-diffusion equations, in particular those with solutions having interior and boundary layers,
- various options for building the stiffness matrix and the right-hand side,
- various solvers, in particular
  - CG- and PCG-algorithms,
  - several variants of multigrid algorithms with various cycles and smoothers,
- the option to choose among uniform and adaptive refinement based on various a posteriori error estimators.
I.2. One-dimensional Sobolev and finite element spaces
I.2.1. Sturm-Liouville problems. Variational formulations and associated Sobolev and finite element spaces are fundamental concepts throughout these notes. To gain a better understanding of these concepts we first briefly consider the following one-dimensional boundary value problem (Sturm-Liouville problem)
$$-(pu')' + qu = f \quad\text{in } (0,1),$$
$$u(0) = 0, \quad u(1) = 0.$$
Here $p$ is a continuously differentiable function with $p(x) \ge \underline{p} > 0$ for all $x \in (0,1)$ and $q$ is a non-negative continuous function.
I.2.2. Idea of the variational formulation. The basic idea of the variational formulation of the above Sturm-Liouville problem can be described as follows:

- Multiply the differential equation with a continuously differentiable function $v$ with $v(0) = 0$ and $v(1) = 0$:
$$\bigl(-(pu')'(x) + q(x)u(x)\bigr)v(x) = f(x)v(x) \quad\text{for } 0 \le x \le 1.$$
- Integrate the result from 0 to 1:
$$\int_0^1 \bigl(-(pu')'(x) + q(x)u(x)\bigr)v(x)\,dx = \int_0^1 f(x)v(x)\,dx.$$
- Use integration by parts for the term containing derivatives:
$$-\int_0^1 (pu')'(x)v(x)\,dx = p(0)u'(0)\underbrace{v(0)}_{=0} - p(1)u'(1)\underbrace{v(1)}_{=0} + \int_0^1 p(x)u'(x)v'(x)\,dx = \int_0^1 p(x)u'(x)v'(x)\,dx.$$
To put these ideas on a profound basis we must better specify the properties of the functions $u$ and $v$. Classical properties such as continuous differentiability are too restrictive; the notion of derivative must be generalised in a suitable way. In view of the discretization the new notion should in particular cover piecewise differentiable functions.
I.2.3. Weak derivatives. The above considerations lead to the notion of a weak derivative. It is motivated by the following observation: Integration by parts yields for all continuously differentiable functions $u$ and $v$ satisfying $v(0) = 0$ and $v(1) = 0$
$$\int_0^1 u'(x)v(x)\,dx = u(1)\underbrace{v(1)}_{=0} - u(0)\underbrace{v(0)}_{=0} - \int_0^1 u(x)v'(x)\,dx = -\int_0^1 u(x)v'(x)\,dx.$$

A function $u$ is called weakly differentiable with weak derivative $w$, if every continuously differentiable function $v$ with $v(0) = 0$ and $v(1) = 0$ satisfies
$$\int_0^1 w(x)v(x)\,dx = -\int_0^1 u(x)v'(x)\,dx.$$

Example I.2.1. Every continuously differentiable function is weakly differentiable and the weak derivative equals the classical derivative.
Every continuous, piecewise continuously differentiable function is weakly differentiable and the weak derivative equals the classical derivative.
The function $u(x) = 1 - |2x - 1|$ is weakly differentiable with weak derivative
$$w(x) = \begin{cases} 2 & \text{for } 0 < x < \frac12, \\ -2 & \text{for } \frac12 < x < 1 \end{cases}$$
(cf. fig. I.2.1). Notice that the value $w(\frac12)$ is arbitrary.
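The defining identity can be checked numerically for $u(x) = 1 - |2x-1|$: for any smooth test function $v$ vanishing at 0 and 1, a quadrature approximation of $\int_0^1 w\,v\,dx$ should match $-\int_0^1 u\,v'\,dx$ up to quadrature error. A sketch; the particular test function $v$ is an arbitrary choice, not from the notes:

```python
def midpoint(f, a, b, n=4000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

u = lambda x: 1.0 - abs(2.0 * x - 1.0)
w = lambda x: 2.0 if x < 0.5 else -2.0           # weak derivative of u
v = lambda x: x * (1.0 - x) ** 2                 # test function, v(0) = v(1) = 0
dv = lambda x: (1.0 - x) ** 2 - 2.0 * x * (1.0 - x)

lhs = midpoint(lambda x: w(x) * v(x), 0.0, 1.0)   # integral of w * v
rhs = -midpoint(lambda x: u(x) * dv(x), 0.0, 1.0) # minus integral of u * v'
```

With `n` even the kink of $u$ at $x = \frac12$ falls on a grid point, so the midpoint rule converges at its usual rate on each half and `lhs` and `rhs` agree to high accuracy.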
I.2.4. Sobolev spaces and norms. Variational formulations and finite element methods are based on Sobolev spaces.

The $L^2$-norm is defined by
$$\|v\| = \Bigl(\int_0^1 |v(x)|^2\,dx\Bigr)^{\frac12}.$$
$L^2(0,1)$ denotes the Lebesgue space of all functions $v$ with finite $L^2$-norm $\|v\|$.
$H^1(0,1)$ is the Sobolev space of all functions $v$ in $L^2(0,1)$, whose weak derivative exists and is contained in $L^2(0,1)$ too.
$H^1_0(0,1)$ denotes the Sobolev space of all functions $v$ in $H^1(0,1)$ which satisfy $v(0) = 0$ and $v(1) = 0$.

Figure I.2.1. Function $u(x) = 1 - |2x - 1|$ (magenta) with its weak derivative (red)
Example I.2.2. Every bounded function is contained in $L^2(0,1)$. The function $v(x) = \frac{1}{\sqrt{x}}$ is not contained in $L^2(0,1)$, since the integral of $\frac{1}{x} = v(x)^2$ is not finite.
Every continuously differentiable function is contained in $H^1(0,1)$. Every continuous, piecewise continuously differentiable function is contained in $H^1(0,1)$.
The function $v(x) = 1 - |2x - 1|$ is contained in $H^1_0(0,1)$ (cf. fig. I.2.1). The function $v(x) = 2\sqrt{x}$ is not contained in $H^1(0,1)$, since the integral of $\frac{1}{x} = (v'(x))^2$ is not finite.

Notice that, in contrast to several dimensions, all functions in $H^1(0,1)$ are continuous.
I.2.5. Variational formulation of the Sturm-Liouville problem. The variational formulation of the Sturm-Liouville problem is given by:

Find $u \in H^1_0(0,1)$ such that for all $v \in H^1_0(0,1)$ there holds
$$\int_0^1 \bigl(p(x)u'(x)v'(x) + q(x)u(x)v(x)\bigr)\,dx = \int_0^1 f(x)v(x)\,dx.$$

It has the following properties:
- It admits a unique solution.
- Its solution is the unique minimum in $H^1_0(0,1)$ of the energy function
$$\frac12\int_0^1 \bigl(p(x)u'(x)^2 + q(x)u(x)^2\bigr)\,dx - \int_0^1 f(x)u(x)\,dx.$$
I.2.6. Finite element spaces. The discretization of the above variational problem is based on finite element spaces. For their definition denote by $\mathcal{T}$ an arbitrary partition of the interval $(0,1)$ into non-overlapping sub-intervals and by $k \ge 1$ an arbitrary polynomial degree.

$S^{k,0}(\mathcal{T})$ denotes the finite element space of all continuous functions which are piecewise polynomials of degree $k$ on the intervals of $\mathcal{T}$.
$S^{k,0}_0(\mathcal{T})$ is the finite element space of all functions $v$ in $S^{k,0}(\mathcal{T})$ which satisfy $v(0) = 0$ and $v(1) = 0$.
I.2.7. Finite element discretization of the Sturm-Liouville problem. The finite element discretization of the Sturm-Liouville problem is given by:

Find a trial function $u_{\mathcal{T}} \in S^{k,0}_0(\mathcal{T})$ such that every test function $v_{\mathcal{T}} \in S^{k,0}_0(\mathcal{T})$ satisfies
$$\int_0^1 \bigl(p(x)u_{\mathcal{T}}'(x)v_{\mathcal{T}}'(x) + q(x)u_{\mathcal{T}}(x)v_{\mathcal{T}}(x)\bigr)\,dx = \int_0^1 f(x)v_{\mathcal{T}}(x)\,dx.$$

It has the following properties:
- It admits a unique solution.
- Its solution is the unique minimum in $S^{k,0}_0(\mathcal{T})$ of the energy function.
- After choosing a basis for $S^{k,0}_0(\mathcal{T})$ it amounts to a linear system of equations with $k\,\#\mathcal{T} - 1$ unknowns and a tridiagonal symmetric positive definite matrix, the so-called stiffness matrix.
- Integrals are usually approximately evaluated using a quadrature formula.
- In most cases one chooses $k = 1$ (linear elements) or $k = 2$ (quadratic elements).
- One usually chooses a nodal basis for $S^{k,0}_0(\mathcal{T})$ (cf. fig. I.2.2).
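For linear elements ($k = 1$) with $p \equiv 1$ and $q \equiv 0$ on a uniform mesh, the stiffness matrix has diagonal entries $2/h$ and off-diagonal entries $-1/h$. The sketch below, which assumes a lumped (one-point) quadrature for the right-hand side, assembles and solves this tridiagonal system; it illustrates the structure and is not code from the notes:

```python
def solve_sturm_liouville_p1(f, n):
    """Linear finite elements for -u'' = f on (0,1), u(0) = u(1) = 0,
    on a uniform mesh with n intervals.  Returns the interior nodal values."""
    h = 1.0 / n
    # tridiagonal stiffness matrix: diagonal 2/h, off-diagonals -1/h
    diag = [2.0 / h] * (n - 1)
    off = -1.0 / h
    # lumped load vector: b_i ~ h * f(x_i) at the interior nodes x_i
    b = [h * f((i + 1) * h) for i in range(n - 1)]
    # Thomas algorithm: forward elimination ...
    for i in range(1, n - 1):
        m = off / diag[i - 1]
        diag[i] -= m * off
        b[i] -= m * b[i - 1]
    # ... and back substitution
    u = [0.0] * (n - 1)
    u[-1] = b[-1] / diag[-1]
    for i in range(n - 3, -1, -1):
        u[i] = (b[i] - off * u[i + 1]) / diag[i]
    return u
```

For $f(x) = \pi^2 \sin(\pi x)$ the exact solution is $u(x) = \sin(\pi x)$, and the nodal error decreases like $O(h^2)$.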
Figure I.2.2. Nodal basis functions for linear elements (left, blue) and for quadratic elements (right, endpoints of intervals blue and midpoints of intervals magenta)

I.2.8. Nodal basis functions. The nodal basis functions for linear elements are those functions which take the value 1 at exactly one endpoint of an interval and which vanish at all other endpoints of intervals (cf. left part of fig. I.2.2).
The nodal basis functions for quadratic elements are those functions which take the value 1 at exactly one endpoint or midpoint of an interval and which vanish at all other endpoints and midpoints of intervals (right part of fig. I.2.2, endpoints of intervals blue and midpoints of intervals magenta).
I.3. Multi-dimensional Sobolev and finite element spaces

I.3.1. Domains and functions. The following notations concerning domains and functions will frequently be used:

$\Omega$: open, bounded, connected set in $\mathbb{R}^d$, $d \in \{2, 3\}$;
$\Gamma$: boundary of $\Omega$, supposed to be Lipschitz-continuous;
$\Gamma_D$: Dirichlet part of $\Gamma$, supposed to be non-empty;
$\Gamma_N$: Neumann part of $\Gamma$, may be empty;
$n$: exterior unit normal to $\Gamma$;
$p, q, r, \ldots$: scalar functions with values in $\mathbb{R}$;
$u, v, w, \ldots$: vector-fields with values in $\mathbb{R}^d$;
$\mathbf{S}, \mathbf{T}, \ldots$: tensor-fields with values in $\mathbb{R}^{d\times d}$;
$\mathbf{I}$: unit tensor;
$\nabla$: gradient;
$\operatorname{div}$: divergence; $\operatorname{div} u = \sum_{i=1}^{d} \frac{\partial u_i}{\partial x_i}$; $\operatorname{div} \mathbf{T} = \Bigl(\sum_{i=1}^{d} \frac{\partial T_{ij}}{\partial x_i}\Bigr)_{1\le j\le d}$;
$\Delta = \operatorname{div}\nabla$: Laplace operator;
$D(u) = \frac12\Bigl(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\Bigr)_{1\le i,j\le d}$: deformation tensor;
$u \cdot v$: inner product;
$\mathbf{S} : \mathbf{T}$: dyadic product (inner product of tensors).
I.3.2. Differentiation of products. The product formula for differentiation yields the following formulae for the differentiation of products of scalar functions, vector-fields and tensor-fields:
$$\operatorname{div}(p u) = \nabla p \cdot u + p \operatorname{div} u,$$
$$\operatorname{div}(\mathbf{T} \cdot u) = (\operatorname{div} \mathbf{T}) \cdot u + \mathbf{T} : D(u).$$

I.3.3. Integration by parts formulae. The above product formulae and the Gauss theorem for integrals give rise to the following integration by parts formulae:
$$\int_{\Gamma} p\, u \cdot n \,dS = \int_{\Omega} \nabla p \cdot u \,dx + \int_{\Omega} p \operatorname{div} u \,dx,$$
$$\int_{\Gamma} n \cdot \mathbf{T} \cdot u \,dS = \int_{\Omega} (\operatorname{div} \mathbf{T}) \cdot u \,dx + \int_{\Omega} \mathbf{T} : D(u) \,dx.$$
I.3.4. Weak derivatives. Recall that $\overline{A}$ denotes the closure of a set $A \subset \mathbb{R}^d$.

Example I.3.1. For the sets
$$A = \{x \in \mathbb{R}^3 : x_1^2 + x_2^2 + x_3^2 < 1\} \quad\text{(open unit ball)}$$
$$B = \{x \in \mathbb{R}^3 : 0 < x_1^2 + x_2^2 + x_3^2 < 1\} \quad\text{(punctured open unit ball)}$$
$$C = \{x \in \mathbb{R}^3 : 1 < x_1^2 + x_2^2 + x_3^2 < 2\} \quad\text{(open annulus)}$$
we have
$$\overline{A} = \{x \in \mathbb{R}^3 : x_1^2 + x_2^2 + x_3^2 \le 1\} \quad\text{(closed unit ball)}$$
$$\overline{B} = \{x \in \mathbb{R}^3 : x_1^2 + x_2^2 + x_3^2 \le 1\} \quad\text{(closed unit ball)}$$
$$\overline{C} = \{x \in \mathbb{R}^3 : 1 \le x_1^2 + x_2^2 + x_3^2 \le 2\} \quad\text{(closed annulus)}.$$

Given a continuous function $\varphi : \mathbb{R}^d \to \mathbb{R}$, we denote its support by
$$\operatorname{supp}\varphi = \overline{\{x \in \mathbb{R}^d : \varphi(x) \ne 0\}}.$$
The set of all functions that are infinitely differentiable and have their support contained in $\Omega$ is denoted by $C_0^\infty(\Omega)$:
$$C_0^\infty(\Omega) = \{\varphi \in C^\infty(\Omega) : \operatorname{supp}\varphi \subset \Omega\}.$$

Remark I.3.2. The condition $\operatorname{supp}\varphi \subset \Omega$ is a non-trivial one, since $\operatorname{supp}\varphi$ is closed and $\Omega$ is open. Functions satisfying this condition vanish at the boundary of $\Omega$ together with all their derivatives.
Given a sufficiently smooth function $\varphi$ and a multi-index $\alpha \in \mathbb{N}^d$, we denote its partial derivatives by
$$D^\alpha\varphi = \frac{\partial^{\alpha_1 + \ldots + \alpha_d}\varphi}{\partial x_1^{\alpha_1}\ldots\partial x_d^{\alpha_d}}.$$
Given two functions $\varphi, \psi \in C_0^\infty(\Omega)$, the Gauss theorem for integrals yields for every multi-index $\alpha \in \mathbb{N}^d$ the identity
$$\int_{\Omega} D^\alpha\varphi\,\psi\,dx = (-1)^{\alpha_1 + \ldots + \alpha_d}\int_{\Omega} \varphi\,D^\alpha\psi\,dx.$$
This identity motivates the definition of the weak derivatives:

Given two integrable functions $\varphi, \psi \in L^1(\Omega)$ and a multi-index $\alpha \in \mathbb{N}^d$, $\psi$ is called the $\alpha$-th weak derivative of $\varphi$ if and only if the identity
$$\int_{\Omega} \psi\,\rho\,dx = (-1)^{\alpha_1 + \ldots + \alpha_d}\int_{\Omega} \varphi\,D^\alpha\rho\,dx$$
holds for all functions $\rho \in C_0^\infty(\Omega)$. In this case we write $\psi = D^\alpha\varphi$.

Remark I.3.3. For smooth functions, the notions of classical and weak derivatives coincide. However, there are functions which are not differentiable in the classical sense but which have a weak derivative (cf. Example I.3.4 below).

Example I.3.4. The function $|x|$ is not differentiable in $(-1,1)$, but it is differentiable in the weak sense. Its weak derivative is the piecewise constant function which equals $-1$ on $(-1,0)$ and $1$ on $(0,1)$.
I.3.5. Sobolev spaces and norms. We will frequently use the following Sobolev spaces and norms:
$$H^k(\Omega) = \{\varphi \in L^2(\Omega) : D^\alpha\varphi \in L^2(\Omega) \text{ for all } \alpha \in \mathbb{N}^d \text{ with } \alpha_1 + \ldots + \alpha_d \le k\},$$
$$|\varphi|_k = \Bigl(\sum_{\substack{\alpha \in \mathbb{N}^d \\ \alpha_1 + \ldots + \alpha_d = k}} \|D^\alpha\varphi\|_{L^2(\Omega)}^2\Bigr)^{\frac12},$$
$$\|\varphi\|_k = \Bigl(\sum_{\ell=0}^{k} |\varphi|_\ell^2\Bigr)^{\frac12} = \Bigl(\sum_{\substack{\alpha \in \mathbb{N}^d \\ \alpha_1 + \ldots + \alpha_d \le k}} \|D^\alpha\varphi\|_{L^2(\Omega)}^2\Bigr)^{\frac12},$$
$$H^1_0(\Omega) = \{\varphi \in H^1(\Omega) : \varphi = 0 \text{ on } \Gamma\},$$
$$H^1_D(\Omega) = \{\varphi \in H^1(\Omega) : \varphi = 0 \text{ on } \Gamma_D\},$$
$$H^{\frac12}(\Gamma) = \{\varphi \in L^2(\Gamma) : \varphi = \psi|_\Gamma \text{ for some } \psi \in H^1(\Omega)\},$$
$$\|\varphi\|_{\frac12,\Gamma} = \inf\{\|\psi\|_1 : \psi \in H^1(\Omega),\ \psi|_\Gamma = \varphi\}.$$
Note that all derivatives are to be understood in the weak sense.

Remark I.3.5. The space $H^{\frac12}(\Gamma)$ is called trace space of $H^1(\Omega)$, its elements are called traces of functions in $H^1(\Omega)$.
Remark I.3.6. Except in one dimension, $d = 1$, $H^1$-functions are in general not continuous and do not admit point values (cf. Example I.3.7 below). A function, however, which is piecewise differentiable is in $H^1(\Omega)$ if and only if it is globally continuous. This is crucial for finite element functions.

Example I.3.7. The function $|x|$ is not differentiable, but it is in $H^1((-1,1))$. In two dimensions, the function $\ln\bigl(\ln\bigl(\sqrt{x_1^2 + x_2^2}\bigr)\bigr)$ is an example of an $H^1$-function that is not continuous and which does not admit a point value in the origin. In three dimensions, a similar example is given by $\ln\bigl(\sqrt{x_1^2 + x_2^2 + x_3^2}\bigr)$.
Example I.3.8. Consider the open unit ball
$$\Omega = \{x \in \mathbb{R}^d : x_1^2 + \ldots + x_d^2 < 1\}$$
in $\mathbb{R}^d$ and the functions
$$\varphi_\alpha(x) = \bigl(x_1^2 + \ldots + x_d^2\bigr)^{\frac{\alpha}{2}}, \quad \alpha \in \mathbb{R}.$$
Then we have
$$\varphi_\alpha \in H^1(\Omega) \iff \begin{cases} \alpha \ge 0 & \text{if } d = 2, \\ \alpha > 1 - \frac{d}{2} & \text{if } d > 2. \end{cases}$$
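The criterion in Example I.3.8 follows from a short computation in polar respectively spherical coordinates. Writing $r = (x_1^2 + \ldots + x_d^2)^{1/2}$, so that $\varphi_\alpha = r^\alpha$:

```latex
|\nabla\varphi_\alpha(x)| = |\alpha|\, r^{\alpha-1}
\quad\Longrightarrow\quad
\int_\Omega |\nabla\varphi_\alpha|^2\,dx
  = c_d\,\alpha^2 \int_0^1 r^{2(\alpha-1)}\, r^{d-1}\,dr ,
```

and the radial integral is finite if and only if $2\alpha - 2 + d - 1 > -1$, i.e. $\alpha > 1 - \frac{d}{2}$. For $d = 2$ this gives $\alpha > 0$; the remaining case $\alpha = 0$ is the constant function 1, which trivially lies in $H^1(\Omega)$.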
I.3.6. Friedrichs and Poincaré inequalities. The following inequalities are fundamental:
$$\|\varphi\|_0 \le c_\Omega |\varphi|_1 \quad\text{for all } \varphi \in H^1_D(\Omega) \qquad\text{(Friedrichs inequality)},$$
$$\|\varphi\|_0 \le c'_\Omega |\varphi|_1 \quad\text{for all } \varphi \in H^1(\Omega) \text{ with } \int_\Omega \varphi = 0 \qquad\text{(Poincaré inequality)}.$$
The constants $c_\Omega$ and $c'_\Omega$ depend on the domain $\Omega$ and are proportional to its diameter.
I.3.7. Finite element partitions. The finite element discretizations are based on partitions of the domain $\Omega$ into non-overlapping simple subdomains. The collection of these subdomains is called a partition and is labeled $\mathcal{T}$. The members of $\mathcal{T}$, i.e. the subdomains, are called elements and are labeled $K$.

Any partition $\mathcal{T}$ has to satisfy the following conditions:
- $\overline\Omega$ is the union of all elements in $\mathcal{T}$.
- (Affine equivalence) Each $K \in \mathcal{T}$ is either a triangle or a parallelogram, if $d = 2$, or a tetrahedron or a parallelepiped, if $d = 3$.
- (Admissibility) Any two elements in $\mathcal{T}$ are either disjoint or share a vertex or a complete edge or, if $d = 3$, a complete face.
- (Shape-regularity) For any element $K$, the ratio of its diameter $h_K$ to the diameter $\rho_K$ of the largest ball inscribed into $K$ is bounded independently of $K$.

Remark I.3.9. In two dimensions, $d = 2$, shape regularity means that the smallest angles of all elements stay bounded away from zero.

In practice one usually not only considers a single partition $\mathcal{T}$, but complete families of partitions which are often obtained by successive local or global refinements. Then, the ratio $h_K/\rho_K$ must be bounded uniformly with respect to all elements and all partitions.

With every partition $\mathcal{T}$ we associate its shape parameter
$$C_{\mathcal{T}} = \max_{K \in \mathcal{T}} \frac{h_K}{\rho_K}.$$

Remark I.3.10. In two dimensions triangles and parallelograms may be mixed (cf. Figure I.3.1). In three dimensions tetrahedra and parallelepipeds can be mixed provided prismatic elements are also incorporated. The condition of affine equivalence may be dropped. It, however, considerably simplifies the analysis since it implies constant Jacobians for all element transformations.
With every partition $\mathcal{T}$ and its elements $K$ we associate the following sets:
$\mathcal{N}_K$: the vertices of $K$,
$\mathcal{E}_K$: the edges or faces of $K$,
$\mathcal{N}_{\mathcal{T}}$: the vertices of all elements in $\mathcal{T}$, i.e. $\mathcal{N}_{\mathcal{T}} = \bigcup_{K \in \mathcal{T}} \mathcal{N}_K$,
$\mathcal{E}_{\mathcal{T}}$: the edges or faces of all elements in $\mathcal{T}$, i.e. $\mathcal{E}_{\mathcal{T}} = \bigcup_{K \in \mathcal{T}} \mathcal{E}_K$,
$\mathcal{N}_E$: the vertices of an edge or face $E \in \mathcal{E}_{\mathcal{T}}$,
$\mathcal{N}_{\mathcal{T},\Gamma}$: the vertices on the boundary,
$\mathcal{N}_{\mathcal{T},\Gamma_D}$: the vertices on the Dirichlet boundary,
$\mathcal{N}_{\mathcal{T},\Gamma_N}$: the vertices on the Neumann boundary,
$\mathcal{N}_{\mathcal{T},\Omega}$: the vertices in the interior of $\Omega$,
$\mathcal{E}_{\mathcal{T},\Gamma}$: the edges or faces contained in the boundary,
$\mathcal{E}_{\mathcal{T},\Gamma_D}$: the edges or faces contained in the Dirichlet boundary,
$\mathcal{E}_{\mathcal{T},\Gamma_N}$: the edges or faces contained in the Neumann boundary,
$\mathcal{E}_{\mathcal{T},\Omega}$: the edges or faces having at least one endpoint in the interior of $\Omega$.

Figure I.3.1. Mixture of triangular and quadrilateral elements

For every element, face, or edge $S \in \mathcal{T} \cup \mathcal{E}_{\mathcal{T}}$ we denote by $h_S$ its diameter. Note that the shape regularity of $\mathcal{T}$ implies that for all elements $K$ and $K'$ and all edges $E$ and $E'$ that share at least one vertex the ratios $\frac{h_K}{h_{K'}}$, $\frac{h_E}{h_{E'}}$ and $\frac{h_K}{h_E}$ are bounded from below and from above by constants which only depend on the shape parameter $C_{\mathcal{T}}$ of $\mathcal{T}$.
With any element $K$, any edge or face $E$, and any vertex $x$ we associate the following sets (see figures I.3.2 and I.3.3):
$$\omega_K = \bigcup_{\mathcal{E}_K \cap \mathcal{E}_{K'} \ne \emptyset} K', \qquad \widetilde\omega_K = \bigcup_{\mathcal{N}_K \cap \mathcal{N}_{K'} \ne \emptyset} K',$$
$$\omega_E = \bigcup_{E \in \mathcal{E}_{K'}} K', \qquad \widetilde\omega_E = \bigcup_{\mathcal{N}_E \cap \mathcal{N}_{K'} \ne \emptyset} K',$$
$$\omega_x = \bigcup_{x \in \mathcal{N}_{K'}} K'.$$
Due to the shape-regularity of $\mathcal{T}$ the diameter of any of these sets can be bounded by a multiple of the diameter of any element or edge contained in that set. The constant only depends on the shape parameter $C_{\mathcal{T}}$ of $\mathcal{T}$.
Figure I.3.2. Some domains $\omega_K$, $\widetilde\omega_K$, $\omega_E$, $\widetilde\omega_E$, and $\omega_x$

Figure I.3.3. Some examples of domains $\omega_x$
I.3.8. Finite element spaces. For any multi-index $\alpha \in \mathbb{N}^d$ we set for abbreviation
$$|\alpha|_1 = \alpha_1 + \ldots + \alpha_d, \qquad |\alpha|_\infty = \max\{\alpha_i : 1 \le i \le d\}, \qquad x^\alpha = x_1^{\alpha_1}\ldots x_d^{\alpha_d}.$$
Denote by
$$\widehat K = \{x \in \mathbb{R}^d : x_1 + \ldots + x_d \le 1,\ x_i \ge 0,\ 1 \le i \le d\}$$
the reference simplex for a partition into triangles or tetrahedra and by
$$\widehat K = [0,1]^d$$
the reference cube for a partition into parallelograms or parallelepipeds. Then every element $K \in \mathcal{T}$ is the image of $\widehat K$ under an affine mapping $F_K$. For every integer number $k$ set
$$R_k(\widehat K) = \begin{cases} \operatorname{span}\{x^\alpha : |\alpha|_1 \le k\} & \text{if } \widehat K \text{ is the reference simplex}, \\ \operatorname{span}\{x^\alpha : |\alpha|_\infty \le k\} & \text{if } \widehat K \text{ is the reference cube} \end{cases}$$
and set
$$R_k(K) = \bigl\{\widehat p \circ F_K^{-1} : \widehat p \in R_k(\widehat K)\bigr\}.$$
With this notation we define finite element spaces by
$$S^{k,-1}(\mathcal{T}) = \{\varphi : \Omega \to \mathbb{R} : \varphi|_K \in R_k(K) \text{ for all } K \in \mathcal{T}\},$$
$$S^{k,0}(\mathcal{T}) = S^{k,-1}(\mathcal{T}) \cap C(\overline\Omega),$$
$$S^{k,0}_0(\mathcal{T}) = S^{k,0}(\mathcal{T}) \cap H^1_0(\Omega) = \{\varphi \in S^{k,0}(\mathcal{T}) : \varphi = 0 \text{ on } \Gamma\},$$
$$S^{k,0}_D(\mathcal{T}) = S^{k,0}(\mathcal{T}) \cap H^1_D(\Omega) = \{\varphi \in S^{k,0}(\mathcal{T}) : \varphi = 0 \text{ on } \Gamma_D\}.$$
Note that $k$ may be 0 for the first space, but must be at least 1 for the other spaces.
Example I.3.11. For the reference triangle, we have
$$R_1(\widehat K) = \operatorname{span}\{1, x_1, x_2\},$$
$$R_2(\widehat K) = \operatorname{span}\{1, x_1, x_2, x_1^2, x_1x_2, x_2^2\}.$$
For the reference square on the other hand, we have
$$R_1(\widehat K) = \operatorname{span}\{1, x_1, x_2, x_1x_2\},$$
$$R_2(\widehat K) = \operatorname{span}\{1, x_1, x_2, x_1x_2, x_1^2, x_1^2x_2, x_1^2x_2^2, x_1x_2^2, x_2^2\}.$$
I.3.9. Approximation properties. The finite element spaces defined above satisfy the following approximation properties:
$$\inf_{\varphi_{\mathcal{T}} \in S^{k,-1}(\mathcal{T})} \|\varphi - \varphi_{\mathcal{T}}\|_0 \le c\,h^{k+1} |\varphi|_{k+1} \qquad \varphi \in H^{k+1}(\Omega),\ k \in \mathbb{N},$$
$$\inf_{\varphi_{\mathcal{T}} \in S^{k,0}(\mathcal{T})} |\varphi - \varphi_{\mathcal{T}}|_j \le c\,h^{k+1-j} |\varphi|_{k+1} \qquad \varphi \in H^{k+1}(\Omega),\ j \in \{0, 1\},\ k \in \mathbb{N}^*,$$
$$\inf_{\varphi_{\mathcal{T}} \in S^{k,0}_0(\mathcal{T})} |\varphi - \varphi_{\mathcal{T}}|_j \le c\,h^{k+1-j} |\varphi|_{k+1} \qquad \varphi \in H^{k+1}(\Omega) \cap H^1_0(\Omega),\ j \in \{0, 1\},\ k \in \mathbb{N}^*.$$
I.3.10. Nodal shape functions. Recall that $\mathcal{N}_{\mathcal{T}}$ denotes the set of all element vertices.
For any vertex $x \in \mathcal{N}_{\mathcal{T}}$ the associated nodal shape function is denoted by $\lambda_x$. It is the unique function in $S^{1,0}(\mathcal{T})$ that equals 1 at vertex $x$ and that vanishes at all other vertices $y \in \mathcal{N}_{\mathcal{T}} \setminus \{x\}$.
The support of a nodal shape function $\lambda_x$ is the set $\omega_x$ and consists of all elements that share the vertex $x$ (cf. Figure I.3.3).
The nodal shape functions can easily be computed element-wise from the co-ordinates of the element's vertices.

Figure I.3.4. Enumeration of vertices of triangles and parallelograms
Example I.3.12. (1) Consider a triangle $K$ with vertices $a_0, \ldots, a_2$ numbered counterclockwise (cf. Figure I.3.4). Then the restrictions to $K$ of the nodal shape functions $\lambda_{a_0}, \ldots, \lambda_{a_2}$ are given by
$$\lambda_{a_i}(x) = \frac{\det(x - a_{i+1},\ a_{i+2} - a_{i+1})}{\det(a_i - a_{i+1},\ a_{i+2} - a_{i+1})} \qquad i = 0, \ldots, 2,$$
where all indices have to be taken modulo 3.
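The determinant formula for the triangle can be implemented verbatim. The sketch below checks the characteristic property $\lambda_{a_i}(a_j) = \delta_{ij}$; the sample vertex coordinates in the usage example are made up:

```python
def det2(u, v):
    """Determinant of the 2x2 matrix with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]

def shape_fn(a, i, x):
    """Nodal shape function lambda_{a_i} of the triangle a = [a0, a1, a2]
    evaluated at the point x; vertex indices are taken modulo 3."""
    p, q = a[(i + 1) % 3], a[(i + 2) % 3]
    num = det2((x[0] - p[0], x[1] - p[1]), (q[0] - p[0], q[1] - p[1]))
    den = det2((a[i][0] - p[0], a[i][1] - p[1]), (q[0] - p[0], q[1] - p[1]))
    return num / den
```

Each $\lambda_{a_i}$ is affine in $x$, equals 1 at $a_i$, and vanishes at the other two vertices; consequently the three shape functions sum to 1 everywhere.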
(2) Consider a parallelogram $K$ with vertices $a_0, \ldots, a_3$ numbered counterclockwise (cf. Figure I.3.4). Then the restrictions to $K$ of the nodal shape functions $\lambda_{a_0}, \ldots, \lambda_{a_3}$ are given by
$$\lambda_{a_i}(x) = \frac{\det(x - a_{i+2},\ a_{i+3} - a_{i+2})}{\det(a_i - a_{i+2},\ a_{i+3} - a_{i+2})} \cdot \frac{\det(x - a_{i+2},\ a_{i+1} - a_{i+2})}{\det(a_i - a_{i+2},\ a_{i+1} - a_{i+2})} \qquad i = 0, \ldots, 3,$$
where all indices have to be taken modulo 4.
(3) Consider a tetrahedron $K$ with vertices $a_0, \ldots, a_3$ enumerated as in Figure I.3.5. Then the restrictions to $K$ of the nodal shape functions $\lambda_{a_0}, \ldots, \lambda_{a_3}$ are given by
$$\lambda_{a_i}(x) = \frac{\det(x - a_{i+1},\ a_{i+2} - a_{i+1},\ a_{i+3} - a_{i+1})}{\det(a_i - a_{i+1},\ a_{i+2} - a_{i+1},\ a_{i+3} - a_{i+1})} \qquad i = 0, \ldots, 3,$$
where all indices have to be taken modulo 4.
(4) Consider a parallelepiped $K$ with vertices $a_0, \ldots, a_7$ enumerated as in Figure I.3.5. Then the restrictions to $K$ of the nodal shape functions $\lambda_{a_0}, \ldots, \lambda_{a_7}$ are given by
$$\lambda_{a_i}(x) = \frac{\det(x - a_{i+1},\ a_{i+3} - a_{i+1},\ a_{i+5} - a_{i+1})}{\det(a_i - a_{i+1},\ a_{i+3} - a_{i+1},\ a_{i+5} - a_{i+1})} \cdot \frac{\det(x - a_{i+2},\ a_{i+3} - a_{i+2},\ a_{i+6} - a_{i+2})}{\det(a_i - a_{i+2},\ a_{i+3} - a_{i+2},\ a_{i+6} - a_{i+2})} \cdot \frac{\det(x - a_{i+4},\ a_{i+5} - a_{i+4},\ a_{i+6} - a_{i+4})}{\det(a_i - a_{i+4},\ a_{i+5} - a_{i+4},\ a_{i+6} - a_{i+4})} \qquad i = 0, \ldots, 7,$$
where all indices have to be taken modulo 8.
Figure I.3.5. Enumeration of vertices of tetrahedra and parallelepipeds (the vertex $a_2$ of the parallelepiped is hidden)
Remark I.3.13. For every element (triangle, parallelogram, tetrahedron, or parallelepiped) the sum of all nodal shape functions corresponding to the element's vertices is identically equal to 1 on the element.

The functions $\lambda_x$, $x \in \mathcal{N}_{\mathcal{T}}$, form a basis of $S^{1,0}(\mathcal{T})$. The bases of higher-order spaces $S^{k,0}(\mathcal{T})$, $k \ge 2$, consist of suitable products of functions $\lambda_x$ corresponding to appropriate vertices $x$.
Example I.3.14. (1) Consider again a triangle $K$ with its vertices numbered as in example I.3.12 (1). Then the nodal basis of $S^{2,0}(\mathcal{T})|_K$ consists of the functions
$$\lambda_{a_i}\bigl[\lambda_{a_i} - \lambda_{a_{i+1}} - \lambda_{a_{i+2}}\bigr] \qquad i = 0, \ldots, 2,$$
$$4\lambda_{a_i}\lambda_{a_{i+1}} \qquad i = 0, \ldots, 2,$$
where the functions $\lambda_{a_i}$ are as in example I.3.12 (1) and where all indices have to be taken modulo 3. Another basis of $S^{2,0}(\mathcal{T})|_K$, called hierarchical basis, consists of the functions
$$\lambda_{a_i} \qquad i = 0, \ldots, 2,$$
$$4\lambda_{a_i}\lambda_{a_{i+1}} \qquad i = 0, \ldots, 2.$$
(2) Consider again a parallelogram $K$ with its vertices numbered as in example I.3.12 (2). Then the nodal basis of $S^{2,0}(\mathcal{T})|_K$ consists of the functions
$$\lambda_{a_i}\bigl[\lambda_{a_i} - \lambda_{a_{i+1}} + \lambda_{a_{i+2}} - \lambda_{a_{i+3}}\bigr] \qquad i = 0, \ldots, 3,$$
$$4\lambda_{a_i}\bigl[\lambda_{a_{i+1}} - \lambda_{a_{i+2}}\bigr] \qquad i = 0, \ldots, 3,$$
$$16\lambda_{a_0}\lambda_{a_2},$$
where the functions $\lambda_{a_i}$ are as in example I.3.12 (2) and where all indices have to be taken modulo 4. The hierarchical basis of $S^{2,0}(\mathcal{T})|_K$ consists of the functions
$$\lambda_{a_i} \qquad i = 0, \ldots, 3,$$
$$4\lambda_{a_i}\bigl[\lambda_{a_{i+1}} - \lambda_{a_{i+2}}\bigr] \qquad i = 0, \ldots, 3,$$
$$16\lambda_{a_0}\lambda_{a_2}.$$
(3) Consider again a tetrahedron $K$ with its vertices numbered as in example I.3.12 (3). Then the nodal basis of $S^{2,0}(\mathcal{T})|_K$ consists of the functions
$$\lambda_{a_i}\bigl[\lambda_{a_i} - \lambda_{a_{i+1}} - \lambda_{a_{i+2}} - \lambda_{a_{i+3}}\bigr] \qquad i = 0, \ldots, 3,$$
$$4\lambda_{a_i}\lambda_{a_j} \qquad 0 \le i < j \le 3,$$
where the functions $\lambda_{a_i}$ are as in example I.3.12 (3) and where all indices have to be taken modulo 4. The hierarchical basis consists of the functions
$$\lambda_{a_i} \qquad i = 0, \ldots, 3,$$
$$4\lambda_{a_i}\lambda_{a_j} \qquad 0 \le i < j \le 3.$$
I.3.11. A quasi-interpolation operator. We will frequently use the quasi-interpolation operator $I_{\mathcal{T}} : L^1(\Omega) \to S^{1,0}_D(\mathcal{T})$ which is defined by
\[ I_{\mathcal{T}} \varphi = \sum_{x \in \mathcal{N}_{\mathcal{T},\Omega} \cup \mathcal{N}_{\mathcal{T},\Gamma_N}} \lambda_x \, \frac{1}{|\omega_x|} \int_{\omega_x} \varphi \, dx. \]
Here, $|\omega_x|$ denotes the area, if $d = 2$, respectively volume, if $d = 3$, of the set $\omega_x$.
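The definition can be made concrete with a minimal sketch (not part of the notes): we assume a two-triangle mesh of the unit square and no Dirichlet boundary, so every vertex carries a degree of freedom, and the coefficient at a vertex $x$ is the mean value of $\varphi$ over the patch $\omega_x$:

```python
import numpy as np

verts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
tris = [(0, 1, 2), (0, 2, 3)]  # two triangles sharing the diagonal

def area(t):
    a, b, c = (verts[i] for i in t)
    return 0.5 * abs((b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0]))

def integral(phi, t):
    # centroid quadrature: exact for affine integrands
    a, b, c = (verts[i] for i in t)
    g = (a + b + c) / 3.0
    return area(t) * phi(g[0], g[1])

def quasi_interpolate(phi):
    # coefficient of I_T(phi) at vertex x: (1/|omega_x|) * int_{omega_x} phi
    coeff = np.zeros(len(verts))
    for x in range(len(verts)):
        patch = [t for t in tris if x in t]        # triangles forming omega_x
        patch_area = sum(area(t) for t in patch)   # |omega_x|
        coeff[x] = sum(integral(phi, t) for t in patch) / patch_area
    return coeff

# I_T reproduces constants: every patch mean of phi = 1 equals 1
print(quasi_interpolate(lambda x, y: 1.0))  # → [1. 1. 1. 1.]
```

Since the coefficients are patch averages rather than point values, $I_{\mathcal{T}}$ is well defined for functions that merely lie in $L^1(\Omega)$.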
The operator $I_{\mathcal{T}}$ satisfies the following local error estimates for all $\varphi \in H^1_D(\Omega)$ and all elements $K \in \mathcal{T}$:
\[ \|\varphi - I_{\mathcal{T}} \varphi\|_{L^2(K)} \le c_{A1} \, h_K \, \|\varphi\|_{H^1(\widetilde{\omega}_K)}, \]
\[ \|\varphi - I_{\mathcal{T}} \varphi\|_{L^2(\partial K)} \le c_{A2} \, h_K^{\frac{1}{2}} \, \|\varphi\|_{H^1(\widetilde{\omega}_K)}. \]
Here, $\widetilde{\omega}_K$ denotes the set of all elements that share at least a vertex with $K$ (cf. Figure I.3.6). The constants $c_{A1}$ and $c_{A2}$ only depend on the shape parameter $C_{\mathcal{T}}$ of $\mathcal{T}$.
[Figure I.3.6. Examples of domains $\widetilde{\omega}_K$]
Remark I.3.15. The operator $I_{\mathcal{T}}$ is called a quasi-interpolation operator since it does not interpolate a given function $\varphi$ at the vertices $x \in \mathcal{N}_{\mathcal{T}}$. In fact, point values are not defined for $H^1$-functions. For functions with more regularity which are at least in $H^2(\Omega)$, the situation is different. For those functions point values do exist and the classical nodal interpolation operator $J_{\mathcal{T}} : H^2(\Omega) \cap H^1_D(\Omega) \to S^{1,0}_D(\mathcal{T})$ can be defined by the relation $(J_{\mathcal{T}} \varphi)(x) = \varphi(x)$ for all vertices $x \in \mathcal{N}_{\mathcal{T}}$.
I.3.12. Bubble functions. For any element $K \in \mathcal{T}$ we define an element bubble function by
\[ \psi_K = \alpha_K \prod_{x \in K \cap \mathcal{N}_{\mathcal{T}}} \lambda_x, \qquad \alpha_K = \begin{cases} 27 & \text{if } K \text{ is a triangle}, \\ 256 & \text{if } K \text{ is a tetrahedron}, \\ 16 & \text{if } K \text{ is a parallelogram}, \\ 64 & \text{if } K \text{ is a parallelepiped}. \end{cases} \]
It has the following properties:
\[ 0 \le \psi_K(x) \le 1 \ \text{for all } x \in K, \qquad \psi_K(x) = 0 \ \text{for all } x \notin K, \qquad \max_{x \in K} \psi_K(x) = 1. \]
For every polynomial degree $k$ there are constants $c_{I1,k}$ and $c_{I2,k}$, which only depend on the degree $k$ and the shape parameter $C_{\mathcal{T}}$ of $\mathcal{T}$, such that the following inverse estimates hold for all polynomials $\varphi$ of degree $k$:
\[ c_{I1,k} \, \|\varphi\|_K \le \|\psi_K^{\frac{1}{2}} \varphi\|_K, \qquad \|\nabla(\psi_K \varphi)\|_K \le c_{I2,k} \, h_K^{-1} \, \|\varphi\|_K. \]
Recall that we denote by $\mathcal{E}_{\mathcal{T}}$ the set of all edges, if $d = 2$, and of all faces, if $d = 3$, of all elements in $\mathcal{T}$. With each edge respectively face $E \in \mathcal{E}_{\mathcal{T}}$ we associate an edge respectively face bubble function by
\[ \psi_E = \alpha_E \prod_{x \in E \cap \mathcal{N}_{\mathcal{T}}} \lambda_x, \qquad \alpha_E = \begin{cases} 4 & \text{if } E \text{ is a line segment}, \\ 27 & \text{if } E \text{ is a triangle}, \\ 16 & \text{if } E \text{ is a parallelogram}. \end{cases} \]
It has the following properties:
\[ 0 \le \psi_E(x) \le 1 \ \text{for all } x \in \omega_E, \qquad \psi_E(x) = 0 \ \text{for all } x \notin \omega_E, \qquad \max_{x \in \omega_E} \psi_E(x) = 1. \]
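These normalizations can be verified directly. Below is a quick numerical check (a sketch of ours, not from the notes) for a triangle in barycentric coordinates: the element bubble $\psi_K = 27\,\lambda_0\lambda_1\lambda_2$ peaks at the centroid, stays in $[0,1]$, and vanishes on the boundary; the edge bubble $\psi_E = 4\,\lambda_0\lambda_1$ peaks at the edge midpoint:

```python
import numpy as np

def psi_K(l):
    # element bubble on a triangle, l = (lam0, lam1, lam2)
    return 27.0 * l[0] * l[1] * l[2]

def psi_E(l):
    # bubble for the edge lam2 = 0 (a line segment, alpha_E = 4)
    return 4.0 * l[0] * l[1]

centroid = np.array([1/3, 1/3, 1/3])
midpoint = np.array([0.5, 0.5, 0.0])
print(abs(psi_K(centroid) - 1.0) < 1e-12,
      abs(psi_E(midpoint) - 1.0) < 1e-12)  # → True True

# 0 <= psi_K <= 1 at random interior points, psi_K = 0 on the boundary
rng = np.random.default_rng(1)
samples = rng.dirichlet([1.0, 1.0, 1.0], size=200)
vals = np.array([psi_K(l) for l in samples])
print(vals.min() >= 0.0 and vals.max() <= 1.0)  # → True
print(psi_K(np.array([0.5, 0.5, 0.0])))         # → 0.0
```

By the arithmetic-geometric mean inequality $\lambda_0\lambda_1\lambda_2 \le (1/3)^3$, so the factor 27 is exactly the one that makes the maximum equal to 1.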
For every polynomial degree $k$ there are constants $c_{I3,k}$, $c_{I4,k}$, and $c_{I5,k}$, which only depend on the degree $k$ and the shape parameter $C_{\mathcal{T}}$ of $\mathcal{T}$, such that the following inverse estimates hold for all polynomials $\varphi$ of degree $k$:
\[ c_{I3,k} \, \|\varphi\|_E \le \|\psi_E^{\frac{1}{2}} \varphi\|_E, \qquad \|\nabla(\psi_E \varphi)\|_{\omega_E} \le c_{I4,k} \, h_E^{-\frac{1}{2}} \, \|\varphi\|_E, \qquad \|\psi_E \varphi\|_{\omega_E} \le c_{I5,k} \, h_E^{\frac{1}{2}} \, \|\varphi\|_E. \]
Here $\omega_E$ is the union of all elements that share $E$ (cf. Figure I.3.7). Note that $\omega_E$ consists of two elements, if $E$ is not contained in the boundary $\Gamma$, and of exactly one element, if $E$ is a subset of $\Gamma$.
[Figure I.3.7. Examples of domains $\omega_E$]
With each edge respectively face $E \in \mathcal{E}_{\mathcal{T}}$ we finally associate a unit vector $n_E$ orthogonal to $E$ and denote by $J_E(\cdot)$ the jump across $E$ in direction $n_E$. If $E$ is contained in the boundary $\Gamma$, the orientation of $n_E$ is fixed to be the one of the exterior normal. Otherwise it is not fixed.
Remark I.3.16. $J_E(\varphi)$ depends on the orientation of $n_E$ but quantities of the form $J_E(n_E \cdot \nabla \varphi)$ are independent of this orientation.
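This invariance can be checked numerically for piecewise linear functions, whose gradients are constant on each side of $E$. The geometry and the two gradients below are made-up examples (ours, not from the notes); flipping $n_E$ swaps the two one-sided values and flips the sign of the normal component, so the two sign changes cancel:

```python
import numpy as np

# gradients of a piecewise linear function on the two triangles sharing
# the diagonal edge E (direction (1,1)) of the unit square
g_lower = np.array([1.0, 0.0])   # triangle below the diagonal y = x
g_upper = np.array([0.0, 1.0])   # triangle above the diagonal

def jump_normal_derivative(n):
    # J_E(n . grad u): one-sided value "ahead" of E in direction n minus
    # the one-sided value "behind"; flipping n swaps the two sides
    ahead, behind = (g_lower, g_upper) if n[0] - n[1] > 0 else (g_upper, g_lower)
    return n @ ahead - n @ behind

n = np.array([1.0, -1.0]) / np.sqrt(2.0)
print(jump_normal_derivative(n), jump_normal_derivative(-n))  # equal values
```

Both calls return the same number (here $\sqrt{2}$), while the plain jump $J_E(\varphi)$ of a scalar quantity would change sign with the orientation.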
CHAPTER II

A posteriori error estimates

In this chapter we will describe various possibilities for a posteriori error estimation. In order to keep the presentation as simple as possible we will consider in sections II.1 and II.2 a simple model problem: the two-dimensional Poisson equation (cf. equation (II.1.1) (p. 30)) discretized by continuous linear or bilinear finite elements (cf. equation (II.1.3) (p. 30)). We will review several a posteriori error estimators and show that in a certain sense they are all equivalent and yield lower and upper bounds on the error of the finite element discretization. The estimators can roughly be classified as follows:
Residual estimates: Estimate the error of the computed numerical solution by a suitable norm of its residual with respect to the strong form of the differential equation (section II.1.9 (p. 38)).
Solution of auxiliary local problems: On small patches of elements, solve auxiliary discrete problems similar to, but simpler than the original problem and use appropriate norms of the local solutions for error estimation (section II.2.1 (p. 40)).
Hierarchical basis error estimates: Evaluate the residual of the computed finite element solution with respect to another finite element space corresponding to higher order elements or to a refined grid (section II.2.2 (p. 46)).
Averaging methods: Use some local extrapolate or average of the gradient of the computed numerical solution for error estimation (section II.2.3 (p. 51)).
Equilibrated residuals: Evaluate approximately a dual variational problem posed on a function space with a weaker topology (section II.2.4 (p. 53)).
H(div)-lifting: Sweeping through the elements sharing a given vertex, construct a vector field such that its divergence equals the residual (section II.2.5 (p. 57)).
In section II.2.6 (p. 60), we shortly address the question of asymptotic exactness, i.e., whether the ratio of the estimated and the exact error remains bounded or even approaches 1 when the mesh-size converges to 0. In section II.2.7 (p. 62) we finally show that an adaptive method based on a suitable error estimator and a suitable mesh-refinement strategy converges to the true solution of the differential equation.
II.1. A residual error estimator for the model problem

II.1.1. The model problem. As a model problem we consider the Poisson equation with mixed Dirichlet-Neumann boundary conditions
\[ \begin{aligned} -\Delta u &= f && \text{in } \Omega \\ u &= 0 && \text{on } \Gamma_D \\ \frac{\partial u}{\partial n} &= g && \text{on } \Gamma_N \end{aligned} \tag{II.1.1} \]
in a connected, bounded, polygonal domain $\Omega \subset \mathbb{R}^2$ with boundary $\Gamma$ consisting of two disjoint parts $\Gamma_D$ and $\Gamma_N$. We assume that the Dirichlet boundary $\Gamma_D$ is closed relative to $\Gamma$ and has a positive length and that $f$ and $g$ are square integrable functions on $\Omega$ and $\Gamma_N$, respectively. The Neumann boundary $\Gamma_N$ may be empty.
II.1.2. Variational formulation. The standard weak formulation of problem (II.1.1) is: Find $u \in H^1_D(\Omega)$ such that
\[ \int_\Omega \nabla u \cdot \nabla v = \int_\Omega f v + \int_{\Gamma_N} g v \quad \text{for all } v \in H^1_D(\Omega). \tag{II.1.2} \]
It is well-known that problem (II.1.2) admits a unique solution.
II.1.3. Finite element discretization. As in section I.3.7 (p. 19) we choose an affine equivalent, admissible and shape-regular partition $\mathcal{T}$ of $\Omega$. We then consider the following finite element discretization of problem (II.1.2): Find $u_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$ such that
\[ \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla v_{\mathcal{T}} = \int_\Omega f v_{\mathcal{T}} + \int_{\Gamma_N} g v_{\mathcal{T}} \quad \text{for all } v_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T}). \tag{II.1.3} \]
Again it is well-known that problem (II.1.3) admits a unique solution.
II.1.4. Equivalence of error and residual. In what follows we always denote by $u \in H^1_D(\Omega)$ and $u_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$ the exact solutions of problems (II.1.2) and (II.1.3), respectively. They satisfy the identity
\[ \int_\Omega \nabla(u - u_{\mathcal{T}}) \cdot \nabla v = \int_\Omega f v + \int_{\Gamma_N} g v - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla v \]
for all $v \in H^1_D(\Omega)$. The right-hand side of this equation implicitly defines the residual of $u_{\mathcal{T}}$ as an element of the dual space of $H^1_D(\Omega)$.
The Friedrichs and Cauchy-Schwarz inequalities imply for all $v \in H^1_D(\Omega)$
\[ \frac{1}{\sqrt{1 + c_\Omega^2}} \, \|v\|_1 \le \sup_{\substack{w \in H^1_D(\Omega) \\ \|w\|_1 = 1}} \int_\Omega \nabla v \cdot \nabla w \le \|v\|_1. \]
This corresponds to the fact that the bilinear form
\[ H^1_D(\Omega) \ni v, w \mapsto \int_\Omega \nabla v \cdot \nabla w \]
defines an isomorphism of $H^1_D(\Omega)$ onto its dual space. The constants multiplying the first and last term in this inequality are related to the norm of this isomorphism and of its inverse.
The definition of the residual and the above inequality imply the estimate
\[ \sup_{\substack{w \in H^1_D(\Omega) \\ \|w\|_1 = 1}} \Big\{ \int_\Omega f w + \int_{\Gamma_N} g w - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla w \Big\} \le \|u - u_{\mathcal{T}}\|_1 \le \sqrt{1 + c_\Omega^2} \, \sup_{\substack{w \in H^1_D(\Omega) \\ \|w\|_1 = 1}} \Big\{ \int_\Omega f w + \int_{\Gamma_N} g w - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla w \Big\}. \]
Since the sup-term in this inequality is equivalent to the norm of the residual in the dual space of $H^1_D(\Omega)$, we have proved:

The norm in $H^1_D(\Omega)$ of the error is, up to multiplicative constants, bounded from above and from below by the norm of the residual in the dual space of $H^1_D(\Omega)$.

Most a posteriori error estimators try to estimate this dual norm of the residual by quantities that can more easily be computed from $f$, $g$, and $u_{\mathcal{T}}$.
II.1.5. Galerkin orthogonality. Since $S^{1,0}_D(\mathcal{T}) \subset H^1_D(\Omega)$, the error is orthogonal to $S^{1,0}_D(\mathcal{T})$:
\[ \int_\Omega \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_{\mathcal{T}} = 0 \]
for all $w_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$. Using the definition of the residual, this can be written as
\[ \int_\Omega f w_{\mathcal{T}} + \int_{\Gamma_N} g w_{\mathcal{T}} - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla w_{\mathcal{T}} = 0 \]
for all $w_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$. This identity reflects the fact that the discretization (II.1.3) is consistent and that no additional errors are introduced by numerical integration or by inexact solution of the discrete problem. It is often referred to as Galerkin orthogonality.
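A one-dimensional analogue (ours, not the notes' two-dimensional model problem, but the same mechanism) makes the identity concrete: once the discrete system is solved exactly, the residual tested with every discrete basis function vanishes up to round-off. A minimal sketch with P1 elements for $-u'' = 1$ on $(0,1)$, $u(0) = u(1) = 0$:

```python
import numpy as np

n = 8                                      # number of interior nodes
h = 1.0 / (n + 1)
# P1 stiffness matrix (tridiagonal) and load vector for f = 1
K = (np.diag(np.full(n, 2.0))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h
b = np.full(n, h)                          # int f * hat_i = h for f = 1
u = np.linalg.solve(K, b)                  # discrete solution

# Galerkin orthogonality: residual tested with every hat function is zero
print(np.max(np.abs(b - K @ u)) < 1e-10)   # → True
```

The demonstration is tautological by design: the identity holds precisely because $u_{\mathcal{T}}$ solves the discrete problem exactly, which is the point of the paragraph above.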
II.1.6. $L^2$-representation of the residual. Integration by parts element-wise yields for all $w \in H^1_D(\Omega)$
\[ \begin{aligned} \int_\Omega f w + \int_{\Gamma_N} g w - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla w &= \int_\Omega f w + \int_{\Gamma_N} g w - \sum_{K \in \mathcal{T}} \int_K \nabla u_{\mathcal{T}} \cdot \nabla w \\ &= \int_\Omega f w + \int_{\Gamma_N} g w + \sum_{K \in \mathcal{T}} \Big\{ \int_K \Delta u_{\mathcal{T}} \, w - \int_{\partial K} n_K \cdot \nabla u_{\mathcal{T}} \, w \Big\} \\ &= \sum_{K \in \mathcal{T}} \int_K (f + \Delta u_{\mathcal{T}}) w + \sum_{E \in \mathcal{E}_{\mathcal{T},\Gamma_N}} \int_E (g - n_E \cdot \nabla u_{\mathcal{T}}) w - \sum_{E \in \mathcal{E}_{\mathcal{T},\Omega}} \int_E J_E(n_E \cdot \nabla u_{\mathcal{T}}) w. \end{aligned} \]
Here, $n_K$ denotes the unit exterior normal to the element $K$. Note that $\Delta u_{\mathcal{T}}$ vanishes on all triangles.
For abbreviation, we define element and edge residuals by
\[ R_K(u_{\mathcal{T}}) = f + \Delta u_{\mathcal{T}} \]
and
\[ R_E(u_{\mathcal{T}}) = \begin{cases} -J_E(n_E \cdot \nabla u_{\mathcal{T}}) & \text{if } E \in \mathcal{E}_{\mathcal{T},\Omega}, \\ g - n_E \cdot \nabla u_{\mathcal{T}} & \text{if } E \in \mathcal{E}_{\mathcal{T},\Gamma_N}, \\ 0 & \text{if } E \in \mathcal{E}_{\mathcal{T},\Gamma_D}. \end{cases} \]
Then we obtain the following $L^2$-representation of the residual
\[ \int_\Omega f w + \int_{\Gamma_N} g w - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla w = \sum_{K \in \mathcal{T}} \int_K R_K(u_{\mathcal{T}}) w + \sum_{E \in \mathcal{E}_{\mathcal{T}}} \int_E R_E(u_{\mathcal{T}}) w. \]
Together with the Galerkin orthogonality this implies
\[ \int_\Omega f w + \int_{\Gamma_N} g w - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla w = \sum_{K \in \mathcal{T}} \int_K R_K(u_{\mathcal{T}})(w - w_{\mathcal{T}}) + \sum_{E \in \mathcal{E}_{\mathcal{T}}} \int_E R_E(u_{\mathcal{T}})(w - w_{\mathcal{T}}) \]
for all $w \in H^1_D(\Omega)$ and all $w_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$.
II.1.7. Upper error bound. We fix an arbitrary function $w \in H^1_D(\Omega)$ and choose $w_{\mathcal{T}} = I_{\mathcal{T}} w$ with the quasi-interpolation operator of section I.3.11 (p. 25). The Cauchy-Schwarz inequality for integrals and the properties of $I_{\mathcal{T}}$ then yield
\[ \begin{aligned} \int_\Omega f w + \int_{\Gamma_N} g w - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla w &= \sum_{K \in \mathcal{T}} \int_K R_K(u_{\mathcal{T}})(w - I_{\mathcal{T}} w) + \sum_{E \in \mathcal{E}_{\mathcal{T}}} \int_E R_E(u_{\mathcal{T}})(w - I_{\mathcal{T}} w) \\ &\le \sum_{K \in \mathcal{T}} \|R_K(u_{\mathcal{T}})\|_K \, \|w - I_{\mathcal{T}} w\|_K + \sum_{E \in \mathcal{E}_{\mathcal{T}}} \|R_E(u_{\mathcal{T}})\|_E \, \|w - I_{\mathcal{T}} w\|_E \\ &\le \sum_{K \in \mathcal{T}} \|R_K(u_{\mathcal{T}})\|_K \, c_{A1} h_K \|w\|_{1,\widetilde{\omega}_K} + \sum_{E \in \mathcal{E}_{\mathcal{T}}} \|R_E(u_{\mathcal{T}})\|_E \, c_{A2} h_E^{\frac{1}{2}} \|w\|_{1,\widetilde{\omega}_E}. \end{aligned} \]
Invoking the Cauchy-Schwarz inequality for sums this gives
\[ \int_\Omega f w + \int_{\Gamma_N} g w - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla w \le \max\{c_{A1}, c_{A2}\} \Big\{ \sum_{K \in \mathcal{T}} h_K^2 \|R_K(u_{\mathcal{T}})\|_K^2 + \sum_{E \in \mathcal{E}_{\mathcal{T}}} h_E \|R_E(u_{\mathcal{T}})\|_E^2 \Big\}^{\frac{1}{2}} \Big\{ \sum_{K \in \mathcal{T}} \|w\|_{1,\widetilde{\omega}_K}^2 + \sum_{E \in \mathcal{E}_{\mathcal{T}}} \|w\|_{1,\widetilde{\omega}_E}^2 \Big\}^{\frac{1}{2}}. \]
In a last step we observe that the shape-regularity of $\mathcal{T}$ implies
\[ \Big\{ \sum_{K \in \mathcal{T}} \|w\|_{1,\widetilde{\omega}_K}^2 + \sum_{E \in \mathcal{E}_{\mathcal{T}}} \|w\|_{1,\widetilde{\omega}_E}^2 \Big\}^{\frac{1}{2}} \le c \, \|w\|_1 \]
with a constant $c$ which only depends on the shape parameter $C_{\mathcal{T}}$ of $\mathcal{T}$ and which takes into account that every element is counted several times on the left-hand side of this inequality.
Combining these estimates with the equivalence of error and residual, we obtain the following upper bound on the error
\[ \|u - u_{\mathcal{T}}\|_1 \le c^* \Big\{ \sum_{K \in \mathcal{T}} h_K^2 \|R_K(u_{\mathcal{T}})\|_K^2 + \sum_{E \in \mathcal{E}_{\mathcal{T}}} h_E \|R_E(u_{\mathcal{T}})\|_E^2 \Big\}^{\frac{1}{2}} \]
with
\[ c^* = \sqrt{1 + c_\Omega^2} \, \max\{c_{A1}, c_{A2}\} \, c. \]
The right-hand side of this estimate can be used as an a posteriori error estimator since it only involves the known data $f$ and $g$, the solution $u_{\mathcal{T}}$ of the discrete problem, and the geometrical data of the partition. The above inequality implies that the a posteriori error estimator is reliable in the sense that an inequality of the form "error estimator $\le$ tolerance" implies that the true error is also less than the tolerance up to the multiplicative constant $c^*$. We want to show that the error estimator is also efficient in the sense that an inequality of the form "error estimator $\ge$ tolerance" implies that the true error is also greater than the tolerance possibly up to another multiplicative constant.
For general functions $f$ and $g$ the exact evaluation of the integrals occurring on the right-hand side of the above estimate may be prohibitively expensive or even impossible. The integrals then must be approximated by suitable quadrature formulae. Alternatively the functions $f$ and $g$ may be approximated by simpler functions, e.g., piecewise polynomial ones, and the resulting integrals be evaluated exactly. Often, both approaches are equivalent.
II.1.8. Lower error bound. In order to prove the announced efficiency, we denote for every element $K$ by $f_K$ the mean value of $f$ on $K$,
\[ f_K = \frac{1}{|K|} \int_K f \, dx, \]
and for every edge $E$ on the Neumann boundary by $g_E$ the mean value of $g$ on $E$,
\[ g_E = \frac{1}{|E|} \int_E g \, dS. \]
We fix an arbitrary element $K$ and insert the function
\[ w_K = (f_K + \Delta u_{\mathcal{T}}) \, \psi_K \]
in the $L^2$-representation of the residual. Taking into account that $\operatorname{supp} w_K \subset K$ we obtain
\[ \int_K R_K(u_{\mathcal{T}}) w_K = \int_K \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_K. \]
We add $\int_K (f_K - f) w_K$ on both sides of this equation and obtain
\[ \int_K (f_K + \Delta u_{\mathcal{T}})^2 \psi_K = \int_K (f_K + \Delta u_{\mathcal{T}}) w_K = \int_K \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_K - \int_K (f - f_K) w_K. \]
The results of section I.3.12 (p. 26) imply for the left-hand side of this equation
\[ \int_K (f_K + \Delta u_{\mathcal{T}})^2 \psi_K \ge c_{I1}^2 \, \|f_K + \Delta u_{\mathcal{T}}\|_K^2 \]
and for the two terms on its right-hand side
\[ \int_K \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_K \le \|\nabla(u - u_{\mathcal{T}})\|_K \, \|\nabla w_K\|_K \le \|\nabla(u - u_{\mathcal{T}})\|_K \, c_{I2} \, h_K^{-1} \, \|f_K + \Delta u_{\mathcal{T}}\|_K, \]
\[ \int_K (f - f_K) w_K \le \|f - f_K\|_K \, \|w_K\|_K \le \|f - f_K\|_K \, \|f_K + \Delta u_{\mathcal{T}}\|_K. \]
This proves that
\[ h_K \, \|f_K + \Delta u_{\mathcal{T}}\|_K \le c_{I1}^{-2} c_{I2} \, \|\nabla(u - u_{\mathcal{T}})\|_K + c_{I1}^{-2} \, h_K \, \|f - f_K\|_K. \tag{II.1.4} \]
Next, we consider an arbitrary interior edge $E \in \mathcal{E}_{\mathcal{T},\Omega}$ and insert the function
\[ w_E = R_E(u_{\mathcal{T}}) \, \psi_E \]
in the $L^2$-representation of the residual. This gives
\[ \begin{aligned} \int_E J_E(n_E \cdot \nabla u_{\mathcal{T}})^2 \psi_E &= \int_E R_E(u_{\mathcal{T}}) w_E \\ &= \int_{\omega_E} \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_E - \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \int_K R_K(u_{\mathcal{T}}) w_E \\ &= \int_{\omega_E} \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_E - \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \int_K (f_K + \Delta u_{\mathcal{T}}) w_E - \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \int_K (f - f_K) w_E. \end{aligned} \]
The results of section I.3.12 (p. 26) imply for the left-hand side of this equation
\[ \int_E J_E(n_E \cdot \nabla u_{\mathcal{T}})^2 \psi_E \ge c_{I3}^2 \, \|J_E(n_E \cdot \nabla u_{\mathcal{T}})\|_E^2 \]
and for the three terms on its right-hand side
\[ \int_{\omega_E} \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_E \le \|\nabla(u - u_{\mathcal{T}})\|_{\omega_E} \, \|\nabla w_E\|_{\omega_E} \le \|\nabla(u - u_{\mathcal{T}})\|_{\omega_E} \, c_{I4} \, h_E^{-\frac{1}{2}} \, \|J_E(n_E \cdot \nabla u_{\mathcal{T}})\|_E, \]
\[ \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \int_K (f_K + \Delta u_{\mathcal{T}}) w_E \le \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \|f_K + \Delta u_{\mathcal{T}}\|_K \, \|w_E\|_K \le \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \|f_K + \Delta u_{\mathcal{T}}\|_K \, c_{I5} \, h_E^{\frac{1}{2}} \, \|J_E(n_E \cdot \nabla u_{\mathcal{T}})\|_E, \]
\[ \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \int_K (f - f_K) w_E \le \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \|f - f_K\|_K \, \|w_E\|_K \le \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \|f - f_K\|_K \, c_{I5} \, h_E^{\frac{1}{2}} \, \|J_E(n_E \cdot \nabla u_{\mathcal{T}})\|_E \]
and thus yields
\[ c_{I3}^2 \, \|J_E(n_E \cdot \nabla u_{\mathcal{T}})\|_E \le c_{I4} \, h_E^{-\frac{1}{2}} \, \|\nabla(u - u_{\mathcal{T}})\|_{\omega_E} + \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} c_{I5} \, h_E^{\frac{1}{2}} \, \|f_K + \Delta u_{\mathcal{T}}\|_K + \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} c_{I5} \, h_E^{\frac{1}{2}} \, \|f - f_K\|_K. \]
Combining this estimate with inequality (II.1.4) we obtain
\[ h_E^{\frac{1}{2}} \, \|J_E(n_E \cdot \nabla u_{\mathcal{T}})\|_E \le c_{I3}^{-2} \big( c_{I4} + c_{I5} \, c_{I1}^{-2} c_{I2} \big) \, \|\nabla(u - u_{\mathcal{T}})\|_{\omega_E} + c_{I3}^{-2} c_{I5} \big( 1 + c_{I1}^{-2} \big) \, h_E \sum_{\substack{K \in \mathcal{T} \\ E \in \mathcal{E}_K}} \|f - f_K\|_K. \tag{II.1.5} \]
Finally, we fix an edge $E$ on the Neumann boundary, denote by $K$ the adjacent element and insert the function
\[ w_E = (g_E - n_E \cdot \nabla u_{\mathcal{T}}) \, \psi_E \]
in the $L^2$-representation of the residual. This gives
\[ \int_E R_E(u_{\mathcal{T}}) w_E = \int_K \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_E - \int_K R_K(u_{\mathcal{T}}) w_E. \]
We add $\int_E (g_E - g) w_E$ on both sides of this equation and obtain
\[ \begin{aligned} \int_E (g_E - n_E \cdot \nabla u_{\mathcal{T}})^2 \psi_E &= \int_E (g_E - n_E \cdot \nabla u_{\mathcal{T}}) w_E \\ &= \int_K \nabla(u - u_{\mathcal{T}}) \cdot \nabla w_E - \int_K (f_K + \Delta u_{\mathcal{T}}) w_E - \int_K (f - f_K) w_E - \int_E (g - g_E) w_E. \end{aligned} \]
Invoking once again the results of section I.3.12 (p. 26) and using the same arguments as above this implies that
\[ h_E^{\frac{1}{2}} \, \|g_E - n_E \cdot \nabla u_{\mathcal{T}}\|_E \le c_{I3}^{-2} \big( c_{I4} + c_{I5} \, c_{I1}^{-2} c_{I2} \big) \, \|\nabla(u - u_{\mathcal{T}})\|_K + c_{I3}^{-2} c_{I5} \big( 1 + c_{I1}^{-2} \big) \, h_K \, \|f - f_K\|_K + c_{I3}^{-2} \, h_E^{\frac{1}{2}} \, \|g - g_E\|_E. \tag{II.1.6} \]
Estimates (II.1.4), (II.1.5), and (II.1.6) prove the announced efficiency of the a posteriori error estimate:
\[ \begin{aligned} \Big\{ h_K^2 \, \|f_K + \Delta u_{\mathcal{T}}\|_K^2 &+ \frac{1}{2} \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Omega}} h_E \, \|J_E(n_E \cdot \nabla u_{\mathcal{T}})\|_E^2 + \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g_E - n_E \cdot \nabla u_{\mathcal{T}}\|_E^2 \Big\}^{\frac{1}{2}} \\ &\le c_* \Big\{ \|\nabla(u - u_{\mathcal{T}})\|_{\omega_K}^2 + \sum_{\substack{K' \in \mathcal{T} \\ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset}} h_{K'}^2 \, \|f - f_{K'}\|_{K'}^2 + \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}}. \end{aligned} \]
The constant $c_*$ only depends on the shape parameter $C_{\mathcal{T}}$ of $\mathcal{T}$.
II.1.9. Residual a posteriori error estimate. The results of the preceding sections can be summarized as follows:

Denote by $u \in H^1_D(\Omega)$ and $u_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$ the unique solutions of problems (II.1.2) (p. 30) and (II.1.3) (p. 30), respectively. For every element $K \in \mathcal{T}$ define the residual a posteriori error estimator $\eta_{R,K}$ by
\[ \eta_{R,K} = \Big\{ h_K^2 \, \|f_K + \Delta u_{\mathcal{T}}\|_K^2 + \frac{1}{2} \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Omega}} h_E \, \|J_E(n_E \cdot \nabla u_{\mathcal{T}})\|_E^2 + \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g_E - n_E \cdot \nabla u_{\mathcal{T}}\|_E^2 \Big\}^{\frac{1}{2}}, \]
where $f_K$ and $g_E$ are the mean values of $f$ and $g$ on $K$ and $E$, respectively. There are two constants $c^*$ and $c_*$, which only depend on the shape parameter $C_{\mathcal{T}}$, such that the estimates
\[ \|u - u_{\mathcal{T}}\|_1 \le c^* \Big\{ \sum_{K \in \mathcal{T}} \eta_{R,K}^2 + \sum_{K \in \mathcal{T}} h_K^2 \, \|f - f_K\|_K^2 + \sum_{E \in \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}} \]
and
\[ \eta_{R,K} \le c_* \Big\{ \|\nabla(u - u_{\mathcal{T}})\|_{\omega_K}^2 + \sum_{\substack{K' \in \mathcal{T} \\ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset}} h_{K'}^2 \, \|f - f_{K'}\|_{K'}^2 + \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}} \]
hold for all $K \in \mathcal{T}$.
Remark II.1.1. The factor $\frac{1}{2}$ multiplying the second term in $\eta_{R,K}$ takes into account that each interior edge is counted twice when adding all $\eta_{R,K}^2$. Note that $\Delta u_{\mathcal{T}} = 0$ on all triangles.
Remark II.1.2. The first term in $\eta_{R,K}$ is related to the residual of $u_{\mathcal{T}}$ with respect to the strong form of the differential equation. The second and third term in $\eta_{R,K}$ are related to that boundary operator which is canonically associated with the strong and weak form of the differential equation. These boundary terms are crucial when considering low order finite element discretizations as done here. Consider e.g. problem (II.1.1) (p. 30) in the unit square $(0,1)^2$ with Dirichlet boundary conditions on the left and bottom part and exact solution $u(x) = x_1 x_2$. When using a triangulation consisting of right-angled isosceles triangles and evaluating the line integrals by the trapezoidal rule, the solution of problem (II.1.3) (p. 30) satisfies $u_{\mathcal{T}}(x) = u(x)$ for all $x \in \mathcal{N}_{\mathcal{T}}$ but $u_{\mathcal{T}} \ne u$. The second and third term in $\eta_{R,K}$ reflect the fact that $u_{\mathcal{T}} \notin H^2(\Omega)$ and that $u_{\mathcal{T}}$ does not exactly satisfy the Neumann boundary condition.
Remark II.1.3. The correction terms
\[ h_K \, \|f - f_K\|_K \quad \text{and} \quad h_E^{\frac{1}{2}} \, \|g - g_E\|_E \]
in the above a posteriori error estimate are in general higher order perturbations of the other terms. In special situations, however, they can be dominant. To see this, assume that $\mathcal{T}$ contains at least one triangle, choose a triangle $K_0 \in \mathcal{T}$ and a non-zero function $\varphi_0 \in C_0^\infty(\mathring{K}_0)$, and consider problem (II.1.1) (p. 30) with $f = \Delta \varphi_0$ and $\Gamma_D = \Gamma$. Since
\[ \int_{K_0} f = \int_{K_0} \Delta \varphi_0 = 0 \]
and $f = 0$ outside $K_0$, we have
\[ f_K = 0 \]
for all $K \in \mathcal{T}$. Since
\[ \int_\Omega f v_{\mathcal{T}} = \int_{K_0} \Delta \varphi_0 \, v_{\mathcal{T}} = -\int_{K_0} \nabla \varphi_0 \cdot \nabla v_{\mathcal{T}} = 0 \]
for all $v_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$, the exact solution of problem (II.1.3) (p. 30) is
\[ u_{\mathcal{T}} = 0. \]
Hence, we have
\[ \eta_{R,K} = 0 \]
for all $K \in \mathcal{T}$, but
\[ \|u - u_{\mathcal{T}}\|_1 \ne 0. \]
This effect is not restricted to the particular approximation of $f$ considered here. Since $\varphi_0 \in C_0^\infty(\mathring{K}_0)$ is completely arbitrary, we will always encounter similar difficulties as long as we do not evaluate $\|f\|_K$ exactly which in general is impossible. Obviously, this problem is cured when further refining the mesh.
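Before turning to other estimators, the computation of $\eta_{R,K}$ can be illustrated numerically. The following is a minimal sketch of ours (not from the notes): a two-triangle mesh of the unit square with pure Dirichlet boundary, $f = 0$, and a made-up P1 function $u_{\mathcal{T}}$; since $\Delta u_{\mathcal{T}} = 0$ on triangles and $\Gamma_N = \emptyset$, the estimator reduces to the weighted jump across the single interior edge:

```python
import numpy as np

verts = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
u_vals = np.array([0., 1., 1., 1.])     # example nodal values of u_T
tris = [(0, 1, 2), (0, 2, 3)]           # shared interior edge E = (0, 2)

def p1_gradient(t):
    # constant gradient of the P1 function on triangle t
    a, b, c = verts[list(t)]
    M = np.array([b - a, c - a])
    rhs = np.array([u_vals[t[1]] - u_vals[t[0]],
                    u_vals[t[2]] - u_vals[t[0]]])
    return np.linalg.solve(M, rhs)

g0, g1 = p1_gradient(tris[0]), p1_gradient(tris[1])
e = verts[2] - verts[0]                  # edge vector of E
h_E = np.linalg.norm(e)
n_E = np.array([e[1], -e[0]]) / h_E      # a unit normal to E
jump = n_E @ (g0 - g1)                   # J_E(n_E . grad u_T), constant on E
# eta_{R,K}^2 = (1/2) * h_E * ||J_E||_E^2 with ||J_E||_E^2 = jump^2 * |E|
eta_K = np.sqrt(0.5 * h_E * (jump**2 * h_E))
print(eta_K)                             # same value for both triangles
```

Here both triangles get the estimator value $\sqrt{2}$; replacing the nodal values by those of a globally affine function makes the jump, and hence the estimator, vanish, as it should for an exactly reproduced solution.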
II.2. A catalogue of error estimators for the model problem

II.2.1. Solution of auxiliary local discrete problems. The results of section II.1 show that we must reliably estimate the norm of the residual as an element of the dual space of $H^1_D(\Omega)$. This could be achieved by lifting the residual to a suitable subspace of $H^1_D(\Omega)$ by solving auxiliary problems similar to, but simpler than the original discrete problem (II.1.3) (p. 30). Practical considerations and the results of section II.1 suggest that the auxiliary problems should satisfy the following conditions:
In order to get information on the local behaviour of the error, they should involve only small subdomains of $\Omega$.
In order to yield accurate information on the error, they should be based on finite element spaces which are more accurate than the original one.
In order to keep the computational work at a minimum, they should involve as few degrees of freedom as possible.
To each edge and possibly to each element there should correspond at least one degree of freedom in at least one of the auxiliary problems.
The solution of all auxiliary problems should not cost more than the assembly of the stiffness matrix of problem (II.1.3) (p. 30).
There are many possible ways to satisfy these conditions. Here, we present three of them. To this end we denote by $P_1 = \operatorname{span}\{1, x_1, x_2\}$ the space of linear polynomials in two variables.
II.2.1.1. Dirichlet problems associated with vertices. First, we decide to impose Dirichlet boundary conditions on the auxiliary problems. The fourth condition then implies that the corresponding subdomains must consist of more than one element. A reasonable choice is to consider all nodes $x \in \mathcal{N}_{\mathcal{T},\Omega} \cup \mathcal{N}_{\mathcal{T},\Gamma_N}$ and the corresponding domains $\omega_x$ (see figures I.3.2 (p. 21) and I.3.3 (p. 21)). The above conditions then lead to the following definition:

Set for all $x \in \mathcal{N}_{\mathcal{T},\Omega} \cup \mathcal{N}_{\mathcal{T},\Gamma_N}$
\[ V_x = \operatorname{span}\big\{ \psi_K \varphi, \ \psi_E \varphi, \ \psi_{E'} \sigma \,:\, K \in \mathcal{T}, \ x \in \mathcal{N}_K; \ E \in \mathcal{E}_{\mathcal{T},\Omega}, \ x \in \mathcal{N}_E; \ E' \in \mathcal{E}_{\mathcal{T},\Gamma_N}, \ E' \subset \partial\omega_x; \ \varphi, \sigma \in P_1 \big\} \]
and
\[ \eta_{D,x} = \|\nabla v_x\|_{\omega_x} \]
where $v_x \in V_x$ is the unique solution of
\[ \int_{\omega_x} \nabla v_x \cdot \nabla w = \sum_{\substack{K \in \mathcal{T} \\ x \in \mathcal{N}_K}} \int_K f_K w + \sum_{\substack{E \in \mathcal{E}_{\mathcal{T},\Gamma_N} \\ E \subset \partial\omega_x}} \int_E g_E w - \int_{\omega_x} \nabla u_{\mathcal{T}} \cdot \nabla w \]
for all $w \in V_x$.

In order to get a different interpretation of the above problem, set
\[ u_x = u_{\mathcal{T}} + v_x. \]
Then
\[ \eta_{D,x} = \|\nabla(u_x - u_{\mathcal{T}})\|_{\omega_x} \]
and $u_x \in u_{\mathcal{T}} + V_x$ is the unique solution of
\[ \int_{\omega_x} \nabla u_x \cdot \nabla w = \sum_{\substack{K \in \mathcal{T} \\ x \in \mathcal{N}_K}} \int_K f_K w + \sum_{\substack{E \in \mathcal{E}_{\mathcal{T},\Gamma_N} \\ E \subset \partial\omega_x}} \int_E g_E w \]
for all $w \in V_x$. This is a discrete analogue of the following Dirichlet problem
\[ \begin{aligned} -\Delta \varphi &= f && \text{in } \omega_x \\ \varphi &= u_{\mathcal{T}} && \text{on } \partial\omega_x \setminus \Gamma_N \\ \frac{\partial \varphi}{\partial n} &= g && \text{on } \partial\omega_x \cap \Gamma_N. \end{aligned} \]
Hence, we can interpret the error estimator $\eta_{D,x}$ in two ways:
We solve a local analogue of the residual equation using a higher order finite element approximation and use a suitable norm of the solution as error estimator.
We solve a local discrete analogue of the original problem using a higher order finite element space and compare the solution of this problem to the one of problem (II.1.3) (p. 30).
Thus, in a certain sense, $\eta_{D,x}$ is based on an extrapolation technique. It can be shown that it yields upper and lower bounds on the error $u - u_{\mathcal{T}}$ and that it is comparable to the estimator $\eta_{R,K}$.

Denote by $u \in H^1_D(\Omega)$ and $u_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$ the unique solutions of problems (II.1.2) (p. 30) and (II.1.3) (p. 30). There are constants $c_{N,1}, \ldots, c_{N,4}$, which only depend on the shape parameter $C_{\mathcal{T}}$, such that the estimates
\[ \eta_{D,x} \le c_{N,1} \Big\{ \sum_{\substack{K \in \mathcal{T} \\ x \in \mathcal{N}_K}} \eta_{R,K}^2 \Big\}^{\frac{1}{2}}, \]
\[ \eta_{R,K} \le c_{N,2} \Big\{ \sum_{x \in \mathcal{N}_K \setminus \mathcal{N}_{\mathcal{T},\Gamma_D}} \eta_{D,x}^2 \Big\}^{\frac{1}{2}}, \]
\[ \eta_{D,x} \le c_{N,3} \Big\{ \|\nabla(u - u_{\mathcal{T}})\|_{\omega_x}^2 + \sum_{\substack{K \in \mathcal{T} \\ x \in \mathcal{N}_K}} h_K^2 \, \|f - f_K\|_K^2 + \sum_{\substack{E \in \mathcal{E}_{\mathcal{T},\Gamma_N} \\ E \subset \partial\omega_x}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}}, \]
\[ \|u - u_{\mathcal{T}}\|_1 \le c_{N,4} \Big\{ \sum_{x \in \mathcal{N}_{\mathcal{T},\Omega} \cup \mathcal{N}_{\mathcal{T},\Gamma_N}} \eta_{D,x}^2 + \sum_{K \in \mathcal{T}} h_K^2 \, \|f - f_K\|_K^2 + \sum_{E \in \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}} \]
hold for all $x \in \mathcal{N}_{\mathcal{T},\Omega} \cup \mathcal{N}_{\mathcal{T},\Gamma_N}$ and all $K \in \mathcal{T}$. Here, $f_K$, $g_E$, and $\eta_{R,K}$ are as in sections II.1.8 (p. 35) and II.1.9 (p. 38).
II.2.1.2. Dirichlet problems associated with elements. We now consider an estimator which is a slight variation of the preceding one. Instead of all $x \in \mathcal{N}_{\mathcal{T},\Omega} \cup \mathcal{N}_{\mathcal{T},\Gamma_N}$ and the corresponding domains $\omega_x$ we consider all $K \in \mathcal{T}$ and the corresponding sets $\omega_K$ (see figure I.3.2 (p. 21)). The considerations from the beginning of this section then lead to the following definition:

Set for all $K \in \mathcal{T}$
\[ \widetilde{V}_K = \operatorname{span}\big\{ \psi_{K'} \varphi, \ \psi_E \varphi, \ \psi_{E'} \sigma \,:\, K' \in \mathcal{T}, \ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset; \ E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Omega}; \ E' \in \mathcal{E}_{\mathcal{T},\Gamma_N}, \ E' \subset \partial\omega_K; \ \varphi, \sigma \in P_1 \big\} \]
and
\[ \widetilde{\eta}_{D,K} = \|\nabla \widetilde{v}_K\|_{\omega_K} \]
where $\widetilde{v}_K \in \widetilde{V}_K$ is the unique solution of
\[ \int_{\omega_K} \nabla \widetilde{v}_K \cdot \nabla w = \sum_{\substack{K' \in \mathcal{T} \\ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset}} \int_{K'} f_{K'} w + \sum_{\substack{E \in \mathcal{E}_{\mathcal{T},\Gamma_N} \\ E \subset \partial\omega_K}} \int_E g_E w - \int_{\omega_K} \nabla u_{\mathcal{T}} \cdot \nabla w \]
for all $w \in \widetilde{V}_K$.

As before we can interpret $u_{\mathcal{T}} + \widetilde{v}_K$ as an approximate solution of the following Dirichlet problem
\[ \begin{aligned} -\Delta \varphi &= f && \text{in } \omega_K \\ \varphi &= u_{\mathcal{T}} && \text{on } \partial\omega_K \setminus \Gamma_N \\ \frac{\partial \varphi}{\partial n} &= g && \text{on } \partial\omega_K \cap \Gamma_N. \end{aligned} \]
It can be shown that $\widetilde{\eta}_{D,K}$ also yields upper and lower bounds on the error $u - u_{\mathcal{T}}$ and that it is comparable to $\eta_{D,x}$ and $\eta_{R,K}$.

Denote by $u \in H^1_D(\Omega)$ and $u_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$ the unique solutions of problems (II.1.2) (p. 30) and (II.1.3) (p. 30). There are constants $c_{\mathcal{T},1}, \ldots, c_{\mathcal{T},4}$, which only depend on the shape parameter $C_{\mathcal{T}}$, such that the estimates
\[ \widetilde{\eta}_{D,K} \le c_{\mathcal{T},1} \Big\{ \sum_{\substack{K' \in \mathcal{T} \\ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset}} \eta_{R,K'}^2 \Big\}^{\frac{1}{2}}, \]
\[ \eta_{R,K} \le c_{\mathcal{T},2} \Big\{ \sum_{\substack{K' \in \mathcal{T} \\ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset}} \widetilde{\eta}_{D,K'}^2 \Big\}^{\frac{1}{2}}, \]
\[ \widetilde{\eta}_{D,K} \le c_{\mathcal{T},3} \Big\{ \|\nabla(u - u_{\mathcal{T}})\|_{\omega_K}^2 + \sum_{\substack{K' \in \mathcal{T} \\ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset}} h_{K'}^2 \, \|f - f_{K'}\|_{K'}^2 + \sum_{\substack{E \in \mathcal{E}_{\mathcal{T},\Gamma_N} \\ E \subset \partial\omega_K}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}}, \]
\[ \|u - u_{\mathcal{T}}\|_1 \le c_{\mathcal{T},4} \Big\{ \sum_{K \in \mathcal{T}} \widetilde{\eta}_{D,K}^2 + \sum_{K \in \mathcal{T}} h_K^2 \, \|f - f_K\|_K^2 + \sum_{E \in \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}} \]
hold for all $K \in \mathcal{T}$. Here, $f_K$, $g_E$, $\eta_{R,K}$ are as in sections II.1.8 (p. 35) and II.1.9 (p. 38).
II.2.1.3. Neumann problems. For the third estimator we decide to impose Neumann boundary conditions on the auxiliary problems. Now it is possible to choose the elements in $\mathcal{T}$ as the corresponding subdomains. This leads to the definition:

Set for all $K \in \mathcal{T}$
\[ V_K = \operatorname{span}\big\{ \psi_K \varphi, \ \psi_E \sigma \,:\, E \in \mathcal{E}_K \setminus \mathcal{E}_{\mathcal{T},\Gamma_D}, \ \varphi, \sigma \in P_1 \big\} \]
and
\[ \eta_{N,K} = \|\nabla v_K\|_K \]
where $v_K \in V_K$ is the unique solution of
\[ \int_K \nabla v_K \cdot \nabla w = \int_K (f_K + \Delta u_{\mathcal{T}}) w - \frac{1}{2} \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Omega}} \int_E J_E(n_E \cdot \nabla u_{\mathcal{T}}) w + \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Gamma_N}} \int_E (g_E - n_E \cdot \nabla u_{\mathcal{T}}) w \]
for all $w \in V_K$.

Note that the factor $\frac{1}{2}$ multiplying the residuals on interior edges takes into account that interior edges are counted twice when summing the contributions of all elements.
The above problem can be interpreted as a discrete analogue of the following Neumann problem
\[ \begin{aligned} -\Delta \varphi &= R_K(u_{\mathcal{T}}) && \text{in } K \\ \frac{\partial \varphi}{\partial n} &= \frac{1}{2} R_E(u_{\mathcal{T}}) && \text{on } \partial K \setminus \Gamma \\ \frac{\partial \varphi}{\partial n} &= R_E(u_{\mathcal{T}}) && \text{on } \partial K \cap \Gamma_N \\ \varphi &= 0 && \text{on } \partial K \cap \Gamma_D. \end{aligned} \]
Again it can be shown that $\eta_{N,K}$ also yields upper and lower bounds on the error and that it is comparable to $\eta_{R,K}$.

Denote by $u \in H^1_D(\Omega)$ and $u_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$ the unique solutions of problems (II.1.2) (p. 30) and (II.1.3) (p. 30). There are constants $c_{\mathcal{T},5}, \ldots, c_{\mathcal{T},8}$, which only depend on the shape parameter $C_{\mathcal{T}}$, such that the estimates
\[ \eta_{N,K} \le c_{\mathcal{T},5} \, \eta_{R,K}, \]
\[ \eta_{R,K} \le c_{\mathcal{T},6} \Big\{ \sum_{\substack{K' \in \mathcal{T} \\ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset}} \eta_{N,K'}^2 \Big\}^{\frac{1}{2}}, \]
\[ \eta_{N,K} \le c_{\mathcal{T},7} \Big\{ \|\nabla(u - u_{\mathcal{T}})\|_{\omega_K}^2 + \sum_{\substack{K' \in \mathcal{T} \\ \mathcal{E}_{K'} \cap \mathcal{E}_K \ne \emptyset}} h_{K'}^2 \, \|f - f_{K'}\|_{K'}^2 + \sum_{E \in \mathcal{E}_K \cap \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}}, \]
\[ \|u - u_{\mathcal{T}}\|_1 \le c_{\mathcal{T},8} \Big\{ \sum_{K \in \mathcal{T}} \eta_{N,K}^2 + \sum_{K \in \mathcal{T}} h_K^2 \, \|f - f_K\|_K^2 + \sum_{E \in \mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \, \|g - g_E\|_E^2 \Big\}^{\frac{1}{2}} \]
hold for all $K \in \mathcal{T}$. Here, $f_K$, $g_E$, $\eta_{R,K}$ are as in sections II.1.8 (p. 35) and II.1.9 (p. 38).
Remark II.2.1. When $\mathcal{T}$ exclusively consists of triangles, $\Delta u_{\mathcal{T}}$ vanishes element-wise and the normal derivatives $n_E \cdot \nabla u_{\mathcal{T}}$ are edge-wise constant. In this case the functions $\varphi$ and $\sigma$ can be dropped in the definitions of $V_x$, $\widetilde{V}_K$, and $V_K$. This considerably reduces the dimension of the spaces $V_x$, $\widetilde{V}_K$, and $V_K$ and thus of the discrete auxiliary problems. Figures I.3.2 (p. 21) and I.3.3 (p. 21) show typical examples of domains $\omega_x$ and $\omega_K$. From this it is obvious that in general the above auxiliary discrete problems have at least the dimensions 12, 7, and 4, respectively. In any case the computation of $\eta_{D,x}$, $\widetilde{\eta}_{D,K}$, and $\eta_{N,K}$ is more expensive than the one of $\eta_{R,K}$. This is sometimes paid off by an improved accuracy of the error estimate.
II.2.2. Hierarchical error estimates. The key idea of the hierarchical approach is to solve problem (II.1.2) (p. 30) approximately using a more accurate finite element space and to compare this solution with the solution of problem (II.1.3) (p. 30). In order to reduce the computational cost of the new problem, the new finite element space is decomposed into the original one and a nearly orthogonal higher order complement. Then only the contribution corresponding to the complement is computed. To further reduce the computational cost, the original bilinear form is replaced by an equivalent one which leads to a diagonal stiffness matrix.
To describe this idea in detail, we consider a finite element space $Y_{\mathcal{T}}$ which satisfies $S^{1,0}_D(\mathcal{T}) \subset Y_{\mathcal{T}} \subset H^1_D(\Omega)$ and which either consists of higher order elements or corresponds to a refinement of $\mathcal{T}$. We then denote by $w_{\mathcal{T}} \in Y_{\mathcal{T}}$ the unique solution of
\[ \int_\Omega \nabla w_{\mathcal{T}} \cdot \nabla v_{\mathcal{T}} = \int_\Omega f v_{\mathcal{T}} + \int_{\Gamma_N} g v_{\mathcal{T}} \quad \text{for all } v_{\mathcal{T}} \in Y_{\mathcal{T}}. \tag{II.2.1} \]
To compare the solutions $w_{\mathcal{T}}$ of problem (II.2.1) and $u_{\mathcal{T}}$ of problem (II.1.3) (p. 30) we subtract $\int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla v_{\mathcal{T}}$ on both sides of equation (II.2.1) and take the Galerkin orthogonality into account. We thus obtain
\[ \int_\Omega \nabla(w_{\mathcal{T}} - u_{\mathcal{T}}) \cdot \nabla v_{\mathcal{T}} = \int_\Omega f v_{\mathcal{T}} + \int_{\Gamma_N} g v_{\mathcal{T}} - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla v_{\mathcal{T}} = \int_\Omega \nabla(u - u_{\mathcal{T}}) \cdot \nabla v_{\mathcal{T}} \]
for all $v_{\mathcal{T}} \in Y_{\mathcal{T}}$, where $u \in H^1_D(\Omega)$ is the unique solution of problem (II.1.2) (p. 30). Since $S^{1,0}_D(\mathcal{T}) \subset Y_{\mathcal{T}}$, we may insert $v_{\mathcal{T}} = w_{\mathcal{T}} - u_{\mathcal{T}}$ as a test-function in this equation. The Cauchy-Schwarz inequality for integrals then implies
\[ \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\| \le \|\nabla(u - u_{\mathcal{T}})\|. \]
To prove the converse estimate, we assume that the space $Y_{\mathcal{T}}$ satisfies a saturation assumption, i.e., there is a constant $\beta$ with $0 \le \beta < 1$ such that
\[ \|\nabla(u - w_{\mathcal{T}})\| \le \beta \, \|\nabla(u - u_{\mathcal{T}})\|. \tag{II.2.2} \]
From the saturation assumption (II.2.2) and the triangle inequality we immediately conclude that
\[ \|\nabla(u - u_{\mathcal{T}})\| \le \|\nabla(u - w_{\mathcal{T}})\| + \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\| \le \beta \, \|\nabla(u - u_{\mathcal{T}})\| + \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\| \]
and therefore
\[ \|\nabla(u - u_{\mathcal{T}})\| \le \frac{1}{1 - \beta} \, \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\|. \]
Thus, we have proven the two-sided error bound
\[ \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\| \le \|\nabla(u - u_{\mathcal{T}})\| \le \frac{1}{1 - \beta} \, \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\|. \]
Hence, we may use $\|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\|$ as an a posteriori error estimator.
This device, however, is not efficient since the computation of $w_{\mathcal{T}}$ is at least as costly as the one of $u_{\mathcal{T}}$. In order to obtain a more efficient error estimation, we use a hierarchical splitting
\[ Y_{\mathcal{T}} = S^{1,0}_D(\mathcal{T}) \oplus Z_{\mathcal{T}} \]
and assume that the spaces $S^{1,0}_D(\mathcal{T})$ and $Z_{\mathcal{T}}$ are nearly orthogonal and satisfy a strengthened Cauchy-Schwarz inequality, i.e., there is a constant $\gamma$ with $0 \le \gamma < 1$ such that
\[ \Big| \int_\Omega \nabla v_{\mathcal{T}} \cdot \nabla z_{\mathcal{T}} \Big| \le \gamma \, \|\nabla v_{\mathcal{T}}\| \, \|\nabla z_{\mathcal{T}}\| \tag{II.2.3} \]
holds for all $v_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$, $z_{\mathcal{T}} \in Z_{\mathcal{T}}$.
Now, we write $w_{\mathcal{T}} - u_{\mathcal{T}}$ in the form $v_{\mathcal{T}} + z_{\mathcal{T}}$ with $v_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$ and $z_{\mathcal{T}} \in Z_{\mathcal{T}}$. From the strengthened Cauchy-Schwarz inequality we then deduce that
\[ (1 - \gamma) \big\{ \|\nabla v_{\mathcal{T}}\|^2 + \|\nabla z_{\mathcal{T}}\|^2 \big\} \le \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\|^2 \le (1 + \gamma) \big\{ \|\nabla v_{\mathcal{T}}\|^2 + \|\nabla z_{\mathcal{T}}\|^2 \big\} \]
and in particular
\[ \|\nabla z_{\mathcal{T}}\| \le \frac{1}{\sqrt{1 - \gamma}} \, \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\|. \tag{II.2.4} \]
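The two-sided bound above can be checked numerically with plain Euclidean vectors standing in for $\nabla v_{\mathcal{T}}$ and $\nabla z_{\mathcal{T}}$ (a sketch of ours; for any two vectors, $\gamma = |\langle v, z\rangle| / (\|v\|\,\|z\|)$ plays the role of the strengthened Cauchy-Schwarz constant):

```python
import numpy as np

rng = np.random.default_rng(0)
v, z = rng.standard_normal(3), rng.standard_normal(3)

gamma = abs(v @ z) / (np.linalg.norm(v) * np.linalg.norm(z))
s = v @ v + z @ z            # ||v||^2 + ||z||^2
w2 = (v + z) @ (v + z)       # ||v + z||^2

# (1 - gamma) * s <= ||v + z||^2 <= (1 + gamma) * s
print((1 - gamma) * s <= w2 <= (1 + gamma) * s)  # → True
```

The proof is one line: $\|v+z\|^2 = s + 2\langle v, z\rangle$ and $|2\langle v, z\rangle| \le 2\gamma\|v\|\|z\| \le \gamma s$ by the arithmetic-geometric mean inequality, which is exactly the computation behind (II.2.4).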
Denote by $\bar{z}_{\mathcal{T}} \in Z_{\mathcal{T}}$ the unique solution of
\[ \int_\Omega \nabla \bar{z}_{\mathcal{T}} \cdot \nabla \zeta_{\mathcal{T}} = \int_\Omega f \zeta_{\mathcal{T}} + \int_{\Gamma_N} g \zeta_{\mathcal{T}} - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla \zeta_{\mathcal{T}} \quad \text{for all } \zeta_{\mathcal{T}} \in Z_{\mathcal{T}}. \tag{II.2.5} \]
From the definitions (II.1.2) (p. 30), (II.1.3) (p. 30), (II.2.1), and (II.2.5) of $u$, $u_{\mathcal{T}}$, $w_{\mathcal{T}}$, and $\bar{z}_{\mathcal{T}}$ we infer that
\[ \int_\Omega \nabla \bar{z}_{\mathcal{T}} \cdot \nabla \zeta_{\mathcal{T}} = \int_\Omega \nabla(u - u_{\mathcal{T}}) \cdot \nabla \zeta_{\mathcal{T}} = \int_\Omega \nabla(w_{\mathcal{T}} - u_{\mathcal{T}}) \cdot \nabla \zeta_{\mathcal{T}} \tag{II.2.6} \]
for all $\zeta_{\mathcal{T}} \in Z_{\mathcal{T}}$ and
\[ \int_\Omega \nabla(w_{\mathcal{T}} - u_{\mathcal{T}}) \cdot \nabla v_{\mathcal{T}} = 0 \tag{II.2.7} \]
for all $v_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$. We insert $\zeta_{\mathcal{T}} = \bar{z}_{\mathcal{T}}$ in equation (II.2.6). The Cauchy-Schwarz inequality for integrals then yields
\[ \|\nabla \bar{z}_{\mathcal{T}}\| \le \|\nabla(u - u_{\mathcal{T}})\|. \]
On the other hand, we conclude from inequality (II.2.4) and equations (II.2.6) and (II.2.7) with $\zeta_{\mathcal{T}} = z_{\mathcal{T}}$ that
\[ \begin{aligned} \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\|^2 &= \int_\Omega \nabla(w_{\mathcal{T}} - u_{\mathcal{T}}) \cdot \nabla(w_{\mathcal{T}} - u_{\mathcal{T}}) = \int_\Omega \nabla(w_{\mathcal{T}} - u_{\mathcal{T}}) \cdot \nabla(v_{\mathcal{T}} + z_{\mathcal{T}}) \\ &= \int_\Omega \nabla(w_{\mathcal{T}} - u_{\mathcal{T}}) \cdot \nabla z_{\mathcal{T}} = \int_\Omega \nabla \bar{z}_{\mathcal{T}} \cdot \nabla z_{\mathcal{T}} \le \|\nabla \bar{z}_{\mathcal{T}}\| \, \|\nabla z_{\mathcal{T}}\| \\ &\le \frac{1}{\sqrt{1 - \gamma}} \, \|\nabla \bar{z}_{\mathcal{T}}\| \, \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\| \end{aligned} \]
and hence
\[ \|\nabla(u - u_{\mathcal{T}})\| \le \frac{1}{1 - \beta} \, \|\nabla(w_{\mathcal{T}} - u_{\mathcal{T}})\| \le \frac{1}{(1 - \beta)\sqrt{1 - \gamma}} \, \|\nabla \bar{z}_{\mathcal{T}}\|. \]
Thus, we have established the two-sided error bound
\[ \|\nabla \bar{z}_{\mathcal{T}}\| \le \|\nabla(u - u_{\mathcal{T}})\| \le \frac{1}{(1 - \beta)\sqrt{1 - \gamma}} \, \|\nabla \bar{z}_{\mathcal{T}}\|. \]
Therefore, $\|\nabla \bar{z}_{\mathcal{T}}\|$ can be used as an error estimator.
At first sight, its computation seems to be cheaper than the one of $w_{\mathcal{T}}$ since the dimension of $Z_{\mathcal{T}}$ is smaller than that of $Y_{\mathcal{T}}$. The computation of $\bar{z}_{\mathcal{T}}$, however, still requires the solution of a global system and is therefore as expensive as the calculation of $u_{\mathcal{T}}$ and $w_{\mathcal{T}}$. But, in most applications the functions in $Z_{\mathcal{T}}$ vanish at the vertices in $\mathcal{N}_{\mathcal{T}}$ since $Z_{\mathcal{T}}$ is the hierarchical complement of $S^{1,0}_D(\mathcal{T})$ in $Y_{\mathcal{T}}$. This in particular implies that the stiffness matrix corresponding to $Z_{\mathcal{T}}$ is spectrally equivalent to a suitably scaled lumped mass matrix. Therefore, $\bar{z}_{\mathcal{T}}$ can be replaced by a quantity $\tilde{z}_{\mathcal{T}}$ which can be computed by solving a diagonal linear system of equations.
More precisely, we assume that there is a bilinear form $b$ on $Z_{\mathcal{T}} \times Z_{\mathcal{T}}$ which has a diagonal stiffness matrix and which defines an equivalent norm to $\|\nabla \cdot\|$ on $Z_{\mathcal{T}}$, i.e.,
\[ \lambda \, \|\nabla \zeta_{\mathcal{T}}\|^2 \le b(\zeta_{\mathcal{T}}, \zeta_{\mathcal{T}}) \le \Lambda \, \|\nabla \zeta_{\mathcal{T}}\|^2 \tag{II.2.8} \]
holds for all $\zeta_{\mathcal{T}} \in Z_{\mathcal{T}}$ with constants $0 < \lambda \le \Lambda$.
The conditions on $b$ imply that there is a unique function $\tilde{z}_{\mathcal{T}} \in Z_{\mathcal{T}}$ which satisfies
\[ b(\tilde{z}_{\mathcal{T}}, \zeta_{\mathcal{T}}) = \int_\Omega f \zeta_{\mathcal{T}} + \int_{\Gamma_N} g \zeta_{\mathcal{T}} - \int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla \zeta_{\mathcal{T}} \quad \text{for all } \zeta_{\mathcal{T}} \in Z_{\mathcal{T}}. \tag{II.2.9} \]
The Galerkin orthogonality and equation (II.2.5) imply
\[ b(\tilde{z}_{\mathcal{T}}, \zeta_{\mathcal{T}}) = \int_\Omega \nabla(u - u_{\mathcal{T}}) \cdot \nabla \zeta_{\mathcal{T}} = \int_\Omega \nabla \bar{z}_{\mathcal{T}} \cdot \nabla \zeta_{\mathcal{T}} \]
for all $\zeta_{\mathcal{T}} \in Z_{\mathcal{T}}$. Inserting $\zeta_{\mathcal{T}} = \bar{z}_{\mathcal{T}}$ and $\zeta_{\mathcal{T}} = \tilde{z}_{\mathcal{T}}$ in this identity and using estimate (II.2.8) we infer that
\[ b(\tilde{z}_{\mathcal{T}}, \tilde{z}_{\mathcal{T}}) = \int_\Omega \nabla(u - u_{\mathcal{T}}) \cdot \nabla \tilde{z}_{\mathcal{T}} \le \|\nabla(u - u_{\mathcal{T}})\| \, \|\nabla \tilde{z}_{\mathcal{T}}\| \le \|\nabla(u - u_{\mathcal{T}})\| \, \frac{1}{\sqrt{\lambda}} \, b(\tilde{z}_{\mathcal{T}}, \tilde{z}_{\mathcal{T}})^{\frac{1}{2}} \]
and
\[ \|\nabla \bar{z}_{\mathcal{T}}\|^2 = b(\tilde{z}_{\mathcal{T}}, \bar{z}_{\mathcal{T}}) \le b(\tilde{z}_{\mathcal{T}}, \tilde{z}_{\mathcal{T}})^{\frac{1}{2}} \, b(\bar{z}_{\mathcal{T}}, \bar{z}_{\mathcal{T}})^{\frac{1}{2}} \le b(\tilde{z}_{\mathcal{T}}, \tilde{z}_{\mathcal{T}})^{\frac{1}{2}} \, \sqrt{\Lambda} \, \|\nabla \bar{z}_{\mathcal{T}}\|. \]
This proves the two-sided error bound
\[ \sqrt{\lambda} \, b(\tilde{z}_{\mathcal{T}}, \tilde{z}_{\mathcal{T}})^{\frac{1}{2}} \le \|\nabla(u - u_{\mathcal{T}})\| \le \frac{\sqrt{\Lambda}}{(1 - \beta)\sqrt{1 - \gamma}} \, b(\tilde{z}_{\mathcal{T}}, \tilde{z}_{\mathcal{T}})^{\frac{1}{2}}. \]
We may summarize the results of this section as follows:

Denote by $u \in H^1_D(\Omega)$ and $u_{\mathcal{T}} \in S^{1,0}_D(\mathcal{T})$ the unique solutions of problems (II.1.2) (p. 30) and (II.1.3) (p. 30), respectively. Assume that the space $Y_{\mathcal{T}} = S^{1,0}_D(\mathcal{T}) \oplus Z_{\mathcal{T}}$ satisfies the saturation assumption (II.2.2) and the strengthened Cauchy-Schwarz inequality (II.2.3) and admits a bilinear form $b$ on $Z_{\mathcal{T}} \times Z_{\mathcal{T}}$ which has a diagonal stiffness matrix and which satisfies estimate (II.2.8). Denote by $\tilde{z}_{\mathcal{T}} \in Z_{\mathcal{T}}$ the unique solution of problem (II.2.9) and define the hierarchical a posteriori error estimator $\eta_H$ by
\[ \eta_H = b(\tilde{z}_{\mathcal{T}}, \tilde{z}_{\mathcal{T}})^{\frac{1}{2}}. \]
Then the a posteriori error estimates
\[ \|\nabla(u - u_{\mathcal{T}})\| \le \frac{\sqrt{\Lambda}}{(1 - \beta)\sqrt{1 - \gamma}} \, \eta_H \quad \text{and} \quad \eta_H \le \frac{1}{\sqrt{\lambda}} \, \|\nabla(u - u_{\mathcal{T}})\| \]
are valid.
Remark II.2.2. When considering families of partitions obtained by successive refinement, the constants $\beta$ and $\gamma$ in the saturation assumption and the strengthened Cauchy-Schwarz inequality should be uniformly less than 1. Similarly, the quotient $\Lambda / \lambda$ should be uniformly bounded.
Remark II.2.3. The bilinear form $b$ can often be constructed as follows. The hierarchical complement $Z_\mathcal{T}$ can be chosen such that its elements vanish at the element vertices $\mathcal{N}_\mathcal{T}$. Standard scaling arguments then imply that on $Z_\mathcal{T}$ the $H^1$-semi-norm $\|\nabla\cdot\|$ is equivalent to a scaled $L^2$-norm. Similarly, one can then prove that the mass matrix corresponding to this norm is spectrally equivalent to a lumped mass matrix. The lumping process in turn corresponds to a suitable numerical quadrature. The bilinear form $b$ then is given by the inner product corresponding to the weighted $L^2$-norm evaluated with the quadrature rule.
Remark II.2.4. The strengthened Cauchy-Schwarz inequality holds, e.g., if $Y_\mathcal{T}$ consists of continuous piecewise quadratic or biquadratic functions. Often it can be established by transforming to the reference element and solving a small eigenvalue problem there.
Remark II.2.5. The saturation assumption (II.2.2) is used to establish the reliability of the error estimator $\eta_H$. One can prove that the reliability of $\eta_H$ in turn implies the saturation assumption (II.2.2). If the space $Y_\mathcal{T}$ contains the functions $w_K$ and $w_E$ of section II.1.8 (p. 35), one may repeat the proofs of estimates (II.1.4) (p. 35), (II.1.5) (p. 37), and (II.1.6) (p. 38) and obtains that, up to perturbation terms of the form $h_K\|f-f_K\|_K$ and $h_E^{\frac12}\|g-g_E\|_E$, the quantity $\|\nabla z^*_\mathcal{T}\|_{\omega_K}$ is bounded from below by $\eta_{R,K}$ for every element $K$. Together with the results of section II.1.9 (p. 38) and inequality (II.2.9) this proves, up to the perturbation terms, the reliability of $\eta_H$ without resorting to the saturation assumption. In fact, this result may be used to show that the saturation assumption holds if the right-hand sides $f$ and $g$ of problem (II.1.1) (p. 30) are piecewise constant on $\mathcal{T}$ and $\mathcal{E}_{\mathcal{T},\Gamma_N}$, respectively.
II.2.3. Averaging techniques. To avoid unnecessary technical difficulties and to simplify the presentation, we consider in this section problem (II.1.1) (p. 30) with pure Dirichlet boundary conditions, i.e. $\Gamma_N = \emptyset$, and assume that the partition $\mathcal{T}$ exclusively consists of triangles.
The error estimator of this section is based on the following ideas. Denote by $u$ and $u_\mathcal{T}$ the unique solutions of problems (II.1.2) (p. 30) and (II.1.3) (p. 30). Suppose that we dispose of an easily computable approximation $Gu_\mathcal{T}$ of $\nabla u_\mathcal{T}$ such that
\[ (\text{II.2.10})\qquad \|\nabla u - Gu_\mathcal{T}\| \le \vartheta\, \|\nabla(u-u_\mathcal{T})\| \]
holds with a constant $0\le\vartheta<1$. The triangle inequality then yields
\[ \frac{1}{1+\vartheta}\,\|Gu_\mathcal{T}-\nabla u_\mathcal{T}\| \le \|\nabla(u-u_\mathcal{T})\| \le \frac{1}{1-\vartheta}\,\|Gu_\mathcal{T}-\nabla u_\mathcal{T}\| \]
and we may therefore choose $\|Gu_\mathcal{T}-\nabla u_\mathcal{T}\|$ as an error estimator. Since $\nabla u_\mathcal{T}$ is a piecewise constant vector field we may hope that its $L^2$-projection onto the continuous, piecewise linear vector fields satisfies inequality (II.2.10). The computation of this projection, however, is as expensive as the solution of problem (II.1.3) (p. 30). We therefore replace the $L^2$-scalar product by an approximation which leads to a more tractable auxiliary problem.
In order to make these ideas more precise, we denote by $W_\mathcal{T}$ the space of all piecewise linear vector fields and set $V_\mathcal{T} = W_\mathcal{T}\cap C(\overline\Omega,\mathbb{R}^2)$. Note that $\nabla u_\mathcal{T}\in W_\mathcal{T}$. We define a mesh-dependent scalar product $(\cdot,\cdot)_\mathcal{T}$ on $W_\mathcal{T}$ by
\[ (v,w)_\mathcal{T} = \sum_{K\in\mathcal{T}} \frac{|K|}{3} \sum_{x\in\mathcal{N}_K} v|_K(x)\cdot w|_K(x). \]
Here, $|K|$ denotes the area of $K$ and
\[ \varphi|_K(x) = \lim_{\substack{y\to x\\ y\in K}} \varphi(y) \]
for all $\varphi\in W_\mathcal{T}$, $K\in\mathcal{T}$, $x\in\mathcal{N}_K$.
Since the quadrature formula
\[ \int_K \varphi \approx \frac{|K|}{3} \sum_{x\in\mathcal{N}_K} \varphi(x) \]
is exact for all linear functions, we have
\[ (\text{II.2.11})\qquad (v,w)_\mathcal{T} = \int_\Omega v\cdot w \]
if both arguments are elements of $W_\mathcal{T}$ and at least one of them is piecewise constant. Moreover, one easily checks that
\[ \|v\|^2 \le (v,v)_\mathcal{T} \le 4\,\|v\|^2 \]
for all $v\in W_\mathcal{T}$ and
\[ (\text{II.2.12})\qquad (v,w)_\mathcal{T} = \frac13 \sum_{x\in\mathcal{N}_\mathcal{T}} |\omega_x|\, v(x)\cdot w(x) \]
for all $v,w\in V_\mathcal{T}$.
Denote by $Gu_\mathcal{T}\in V_\mathcal{T}$ the $(\cdot,\cdot)_\mathcal{T}$-projection of $\nabla u_\mathcal{T}$ onto $V_\mathcal{T}$, i.e.,
\[ (Gu_\mathcal{T}, v_\mathcal{T})_\mathcal{T} = (\nabla u_\mathcal{T}, v_\mathcal{T})_\mathcal{T} \]
for all $v_\mathcal{T}\in V_\mathcal{T}$. Equations (II.2.11) and (II.2.12) imply that
\[ Gu_\mathcal{T}(x) = \sum_{\substack{K\in\mathcal{T}\\ x\in\mathcal{N}_K}} \frac{|K|}{|\omega_x|}\, \nabla u_\mathcal{T}|_K \]
for all $x\in\mathcal{N}_\mathcal{T}$. Thus, $Gu_\mathcal{T}$ may be computed by a local averaging of $\nabla u_\mathcal{T}$.

We finally set
\[ \eta_{Z,K} = \|Gu_\mathcal{T}-\nabla u_\mathcal{T}\|_K \qquad\text{and}\qquad \eta_Z = \Big( \sum_{K\in\mathcal{T}} \eta_{Z,K}^2 \Big)^{\frac12}. \]
One can prove that $\eta_Z$ yields upper and lower bounds for the error and that it is comparable to the residual error estimator $\eta_{R,K}$ of section II.1.9 (p. 38).
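The averaging estimator is a purely local loop over the mesh. A minimal sketch on a toy triangulation (pure Python; the mesh data structures and helper names are ours, not from the notes): the value of $Gu_\mathcal{T}$ at a vertex $x$ is the $|K|/|\omega_x|$-weighted average of the neighbouring constant gradients, and $\eta_Z$ is accumulated with an edge-midpoint quadrature, which is exact for the quadratic integrand $|Gu_\mathcal{T}-\nabla u_\mathcal{T}|^2$.

```python
import math

def tri_area(p, q, r):
    return 0.5 * abs((q[0]-p[0])*(r[1]-p[1]) - (r[0]-p[0])*(q[1]-p[1]))

def element_gradient(pts, vals):
    """Constant gradient of the linear interpolant on a triangle."""
    (x1, y1), (x2, y2), (x3, y3) = pts
    v1, v2, v3 = vals
    det = (x2-x1)*(y3-y1) - (x3-x1)*(y2-y1)
    gx = ((v2-v1)*(y3-y1) - (v3-v1)*(y2-y1)) / det
    gy = ((v3-v1)*(x2-x1) - (v2-v1)*(x3-x1)) / det
    return (gx, gy)

def averaging_estimator(nodes, tris, vals):
    grads, areas = {}, {}
    acc = {i: [0.0, 0.0, 0.0] for i in range(len(nodes))}  # sums over omega_x
    for t, tri in enumerate(tris):
        pts = [nodes[i] for i in tri]
        areas[t] = tri_area(*pts)
        grads[t] = element_gradient(pts, [vals[i] for i in tri])
        for i in tri:                      # area-weighted averaging
            acc[i][0] += grads[t][0] * areas[t]
            acc[i][1] += grads[t][1] * areas[t]
            acc[i][2] += areas[t]
    G = {i: (s[0]/s[2], s[1]/s[2]) for i, s in acc.items()}
    eta2 = 0.0
    for t, tri in enumerate(tris):
        for a, b in ((0, 1), (1, 2), (2, 0)):   # edge-midpoint quadrature
            gm = tuple(0.5*(G[tri[a]][k] + G[tri[b]][k]) for k in (0, 1))
            eta2 += areas[t]/3.0 * ((gm[0]-grads[t][0])**2
                                    + (gm[1]-grads[t][1])**2)
    return math.sqrt(eta2)

# u(x, y) = x on two triangles of the unit square: the discrete gradient
# is already continuous, so the estimator vanishes.
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
tris = [(0, 1, 2), (0, 2, 3)]
eta = averaging_estimator(nodes, tris, [0.0, 1.0, 1.0, 0.0])
```

For a globally linear $u_\mathcal{T}$ the averaged and the discrete gradient coincide, so $\eta_Z = 0$, as the final lines illustrate.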
II.2.4. Equilibrated residuals. The error estimator of this section is due to Ladevèze and Leguillon and is based on a dual variational principle.

We define the energy norm $|\!|\!|\cdot|\!|\!|$ corresponding to the variational problem (II.1.2) (p. 30) by
\[ |\!|\!|v|\!|\!|^2 = \int_\Omega \nabla v\cdot\nabla v \]
and a quadratic functional $J$ on $H^1_D(\Omega)$ by
\[ J(v) = \frac12\int_\Omega \nabla v\cdot\nabla v - \int_\Omega fv - \int_{\Gamma_N} gv + \int_\Omega \nabla u_\mathcal{T}\cdot\nabla v, \]
where $u_\mathcal{T}\in S^{1,0}_D(\mathcal{T})$ is the unique solution of problem (II.1.3) (p. 30). The Galerkin orthogonality implies that
\[ J(v) = \frac12\int_\Omega \nabla v\cdot\nabla v - \int_\Omega \nabla(u-u_\mathcal{T})\cdot\nabla v \]
for all $v\in H^1_D(\Omega)$. Hence, $J$ attains its unique minimum at $u-u_\mathcal{T}$. Inserting $v = u-u_\mathcal{T}$ in the definition of $J$ therefore yields
\[ -\frac12 |\!|\!|u-u_\mathcal{T}|\!|\!|^2 = J(u-u_\mathcal{T}) \le J(v) \]
for all $v\in H^1_D(\Omega)$. Thus, the energy norm of the error can be computed by solving the variational problem
\[ \text{minimize } J(v) \text{ in } H^1_D(\Omega). \]
Unfortunately, this is an infinite dimensional minimization problem. In order to obtain a more tractable problem we want to replace $H^1_D(\Omega)$ by the broken space
\[ H_\mathcal{T} = \{ v\in L^2(\Omega) : v|_K\in H^1(K) \text{ for all } K\in\mathcal{T},\; v = 0 \text{ on } \Gamma_D \}. \]
Obviously, we have
\[ H^1_D(\Omega) = \{ v\in H_\mathcal{T} : J_E(v) = 0 \text{ for all } E\in\mathcal{E}_{\mathcal{T},\Omega} \}, \]
where $J_E(v)$ denotes the jump of $v$ across $E$ in direction $n_E$.

One can prove that a continuous linear functional $\ell$ on $H_\mathcal{T}$ vanishes on $H^1_D(\Omega)$ if and only if there is a vector field $\mu\in L^2(\Omega)^d$ with $\operatorname{div}\mu\in L^2(\Omega)$ and $\mu\cdot n = 0$ on $\Gamma_N$ such that
\[ \ell(v) = \sum_{K\in\mathcal{T}} \int_{\partial K} \mu\cdot n_K\, v \]
for all $v\in H_\mathcal{T}$. Hence, the polar of $H^1_D(\Omega)$ in $H_\mathcal{T}$ can be identified with the space
\[ M = \{ \mu\in L^2(\Omega)^d : \operatorname{div}\mu\in L^2(\Omega),\; \mu\cdot n = 0 \text{ on } \Gamma_N \}. \]
We want to extend the residual $R(u_\mathcal{T})$ of section II.1.4 (p. 30) to a continuous linear functional on the larger space $H_\mathcal{T}$. To this end we associate with each $E\in\mathcal{E}_\mathcal{T}$ a smooth flux $\lambda_E$. The choice of $\lambda_E$ is arbitrary subject to the constraint that $\lambda_E = g|_E$ for all $E\in\mathcal{E}_{\mathcal{T},\Gamma_N}$. The particular choice of the fluxes $\lambda_E$ for the interelement boundaries will later on determine the error estimation method; for $E\subset\Gamma_D$ the value of $\lambda_E$ is completely irrelevant. Once we have chosen the $\lambda_E$ we can associate with each element $K\in\mathcal{T}$ a function $\sigma_K$ defined on $\partial K$ such that
\[ (\text{II.2.13})\qquad \sum_{K\in\mathcal{T}} \int_{\partial K} \sigma_K v = \sum_{E\in\mathcal{E}_\mathcal{T}} \int_E \lambda_E\, J_E(v) \]
for all $v\in H_\mathcal{T}$. Here, we use the convention that $J_E(v) = v$ if $E\subset\Gamma$.

We then define the extension $\widetilde{R}(u_\mathcal{T})$ of $R(u_\mathcal{T})$ to $H_\mathcal{T}$ by
\[ \langle \widetilde{R}(u_\mathcal{T}), v\rangle = \sum_{K\in\mathcal{T}} \Big\{ \int_K fv - \int_K \nabla u_\mathcal{T}\cdot\nabla v + \int_{\partial K} \sigma_K v \Big\} - \sum_{E\in\mathcal{E}_{\mathcal{T},\Omega}} \int_E \lambda_E\, J_E(v) \]
for all $v\in H_\mathcal{T}$.

Due to equation (II.2.13) and the definition of $\lambda_E$ on the Neumann boundary $\Gamma_N$, we have
\[ \langle \widetilde{R}(u_\mathcal{T}), v\rangle = \int_\Omega fv + \int_{\Gamma_N} gv - \int_\Omega \nabla u_\mathcal{T}\cdot\nabla v \]
for all $v\in H^1_D(\Omega)$. Moreover, there is a $\mu^*\in M$ whose associated functional $v\mapsto \sum_{K\in\mathcal{T}}\int_{\partial K}\mu^*\cdot n_K\, v$ satisfies
\[ \sum_{K\in\mathcal{T}}\int_{\partial K}\mu^*\cdot n_K\, v = \sum_{E\in\mathcal{E}_{\mathcal{T},\Omega}} \int_E \lambda_E\, J_E(v) \]
for all $v\in H_\mathcal{T}$, since
\[ \sum_{E\in\mathcal{E}_{\mathcal{T},\Omega}} \int_E \lambda_E\, J_E(v) = 0 \]
holds for all $v\in H^1_D(\Omega)$.
Now, we define a Lagrangian functional $\mathcal{L}$ on $H_\mathcal{T}\times M$ by
\[ \mathcal{L}(v,\mu) = \frac12 \sum_{K\in\mathcal{T}} \int_K \nabla v\cdot\nabla v - \langle \widetilde{R}(u_\mathcal{T}), v\rangle - \ell_\mu(v), \qquad \ell_\mu(v) = \sum_{K\in\mathcal{T}} \int_{\partial K} \mu\cdot n_K\, v. \]
$M$ is the space of Lagrange multipliers for the constraint
\[ J_E(v) = 0 \text{ for all } E\in\mathcal{E}_{\mathcal{T},\Omega}. \]
This implies that
\[ -\frac12 |\!|\!|u-u_\mathcal{T}|\!|\!|^2 = \inf_{v\in H^1_D(\Omega)} J(v) = \inf_{w\in H_\mathcal{T}} \sup_{\mu\in M} \mathcal{L}(w,\mu) = \sup_{\mu\in M} \inf_{w\in H_\mathcal{T}} \mathcal{L}(w,\mu). \]
Hence, we get for all $\mu\in M$
\[ -\frac12 |\!|\!|u-u_\mathcal{T}|\!|\!|^2 \ge \inf_{w\in H_\mathcal{T}} \mathcal{L}(w,\mu) = \inf_{w\in H_\mathcal{T}} \Big\{ \sum_{K\in\mathcal{T}} \Big( \frac12\int_K \nabla w\cdot\nabla w - \int_K fw + \int_K \nabla u_\mathcal{T}\cdot\nabla w - \int_{\partial K} \sigma_K w \Big) + \ell_{\mu^*}(w) - \ell_\mu(w) \Big\}. \]
The particular choice $\mu = \mu^*$ therefore yields
\[ |\!|\!|u-u_\mathcal{T}|\!|\!|^2 \le -2 \inf_{w\in H_\mathcal{T}} \sum_{K\in\mathcal{T}} \Big\{ \frac12\int_K \nabla w\cdot\nabla w - \int_K fw + \int_K \nabla u_\mathcal{T}\cdot\nabla w - \int_{\partial K} \sigma_K w \Big\}. \]
In order to write this in a more compact form, we denote by $H_K$ the restriction of $H^1_D(\Omega)$ to a single element $K\in\mathcal{T}$ and set
\[ J_K(w) = \frac12\int_K \nabla w\cdot\nabla w - \int_K fw + \int_K \nabla u_\mathcal{T}\cdot\nabla w - \int_{\partial K} \sigma_K w \]
for all $w\in H_K$ and all $K\in\mathcal{T}$. We then have
\[ H_\mathcal{T} \cong \prod_{K\in\mathcal{T}} H_K \]
and therefore
\[ (\text{II.2.14})\qquad |\!|\!|u-u_\mathcal{T}|\!|\!|^2 \le -2 \sum_{K\in\mathcal{T}} \inf_{w_K\in H_K} J_K(w_K). \]
Estimate (II.2.14) reduces the computation of the energy norm of the error to a family of minimization problems on the elements in $\mathcal{T}$. However, for each $K\in\mathcal{T}$, the corresponding variational problem still is infinite dimensional. In order to overcome this difficulty we first rewrite $J_K$. Using integration by parts we see that
\[ J_K(w) = \frac12\int_K \nabla w\cdot\nabla w - \int_K (f+\Delta u_\mathcal{T})w - \int_{\partial K} (\sigma_K - n_K\cdot\nabla u_\mathcal{T})w = \frac12\int_K \nabla w\cdot\nabla w - \int_K R_K(u_\mathcal{T})w - \int_{\partial K} R_{\partial K}(u_\mathcal{T})w, \]
where
\[ R_K(u_\mathcal{T}) = f + \Delta u_\mathcal{T} \qquad\text{and}\qquad R_{\partial K}(u_\mathcal{T}) = \sigma_K - n_K\cdot\nabla u_\mathcal{T}. \]
For every element $K\in\mathcal{T}$ we define
\[ (\text{II.2.15})\qquad Y_K = \{ \sigma\in L^2(K)^d : \operatorname{div}\sigma\in L^2(K),\; -\operatorname{div}\sigma = R_K(u_\mathcal{T}),\; \sigma\cdot n_K = R_{\partial K}(u_\mathcal{T}) \}. \]
The complementary energy principle then tells us that
\[ \inf_{w\in H_K} J_K(w) = \sup_{\sigma\in Y_K} \Big( -\frac12\int_K \sigma\cdot\sigma \Big). \]
Together with inequality (II.2.14) this implies that
\[ |\!|\!|u-u_\mathcal{T}|\!|\!|^2 \le -2\sum_{K\in\mathcal{T}} \sup_{\sigma_K\in Y_K} \Big( -\frac12\int_K \sigma_K\cdot\sigma_K \Big) = \sum_{K\in\mathcal{T}} \inf_{\sigma_K\in Y_K} \int_K \sigma_K\cdot\sigma_K. \]
This is the announced dual variational principle. It proves:

Denote by $u\in H^1_D(\Omega)$ and $u_\mathcal{T}\in S^{1,0}_D(\mathcal{T})$ the unique solutions of problems (II.1.2) (p. 30) and (II.1.3) (p. 30), respectively. For every element $K\in\mathcal{T}$ define the space $Y_K$ by equation (II.2.15). Then the a posteriori error estimate
\[ \|\nabla(u-u_\mathcal{T})\|^2 \le \sum_{K\in\mathcal{T}} \int_K \sigma_K\cdot\sigma_K \]
holds for all $\sigma_K\in Y_K$ and all $K\in\mathcal{T}$.
Remark II.2.6. The concrete realization of the equilibrated residual method depends on the choice of the $\lambda_E$ and on the definition of the $\sigma_K$. Ladevèze and Leguillon choose $\lambda_E$ to be the average of $n_E\cdot\nabla u_\mathcal{T}$ from the neighbouring elements plus a suitable piecewise linear function on $E$. The functions $\sigma_K$ are often chosen as the solution of a maximization problem on a finite dimensional subspace of $Y_K$ corresponding to higher order finite elements. With a proper choice of $\lambda_E$ and $\sigma_K$ one may thus recover the error estimator $\eta_{N,K}$ of section II.2.1.3 (p. 44).
II.2.5. H(div)-lifting. The basic idea is to construct a piecewise linear vector field $\sigma_\mathcal{T}$ such that
\[ (\text{II.2.16})\qquad
\begin{aligned}
\operatorname{div}\sigma_\mathcal{T} &= -f && \text{on every } K\in\mathcal{T},\\
J_E(n_E\cdot\sigma_\mathcal{T}) &= -J_E(n_E\cdot\nabla u_\mathcal{T}) && \text{on every } E\in\mathcal{E}_\Omega,\\
n\cdot\sigma_\mathcal{T} &= g - n\cdot\nabla u_\mathcal{T} && \text{on every } E\in\mathcal{E}_{\Gamma_N}.
\end{aligned} \]
Then the vector field $\sigma = \sigma_\mathcal{T} + \nabla u_\mathcal{T}$ is contained in $H(\operatorname{div};\Omega)$ and satisfies
\[ (\text{II.2.17})\qquad -\operatorname{div}\sigma = f \text{ in } \Omega, \qquad n\cdot\sigma = g \text{ on } \Gamma_N, \]
since $\Delta u_\mathcal{T}$ vanishes element-wise.

To simplify the presentation we assume for the rest of this section that
$\mathcal{T}$ exclusively consists of triangles,
$f$ is piecewise constant,
$g$ is piecewise constant.
Parallelograms could be treated by changing the definition (II.2.18) of the vector fields $\psi_{K,E}$. General functions $f$ and $g$ introduce additional data errors.
For every triangle $K$ and every edge $E$ thereof we denote by $a_{K,E}$ the vertex of $K$ which is not contained in $E$ and set
\[ (\text{II.2.18})\qquad \psi_{K,E}(x) = \frac{\mu_1(E)}{2\mu_2(K)}\,(x - a_{K,E}), \]
where $\mu_1(E)$ is the length of $E$ and $\mu_2(K)$ the area of $K$. The vector fields $\psi_{K,E}$ are the shape functions of the lowest order Raviart-Thomas space and have the following properties
\[ (\text{II.2.19})\qquad
\begin{aligned}
\operatorname{div}\psi_{K,E} &= \frac{\mu_1(E)}{\mu_2(K)} && \text{on } K,\\
n_K\cdot\psi_{K,E} &= 0 && \text{on } \partial K\setminus E,\\
n_K\cdot\psi_{K,E} &= 1 && \text{on } E,\\
\|\psi_{K,E}\|_K &\le c\,h_K,
\end{aligned} \]
where $n_K$ denotes the unit exterior normal of $K$ and where the constant $c$ only depends on the shape parameter of $\mathcal{T}$.
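The first three properties in (II.2.19) can be checked numerically; a quick sketch for $d=2$ (pure Python; the helper and the test triangle are ours):

```python
import math

def psi(K, opp, x):
    """Lowest order Raviart-Thomas shape function psi_{K,E} on a triangle.

    K   : list of the three vertices of the triangle
    opp : index of the vertex a_{K,E} opposite the edge E
    psi_{K,E}(x) = mu_1(E) / (2 mu_2(K)) * (x - a_{K,E}).
    """
    a = K[opp]
    p, q = [K[i] for i in range(3) if i != opp]
    edge_len = math.dist(p, q)                                  # mu_1(E)
    area = 0.5 * abs((q[0]-p[0])*(a[1]-p[1]) - (a[0]-p[0])*(q[1]-p[1]))
    c = edge_len / (2.0 * area)
    return (c * (x[0] - a[0]), c * (x[1] - a[1]))

# reference triangle; E is the hypotenuse, opposite vertex 0
K = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
v = psi(K, 0, (0.5, 0.5))   # on E, unit outward normal (1,1)/sqrt(2)
w = psi(K, 0, (0.5, 0.0))   # on the bottom edge, outward normal (0,-1)
```

On $E$ the normal component of `v` is $1$; on an edge containing $a_{K,E}$ the field is tangential, so its normal component vanishes, in agreement with (II.2.19). The divergence $2c = \mu_1(E)/\mu_2(K)$ is immediate from the definition.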
Figure II.2.1. Enumeration of elements in $\omega_z$
Now, we consider an arbitrary interior vertex $z\in\mathcal{N}_\Omega$. We enumerate the triangles in $\omega_z$ from 1 to $n$ and the edges emanating from $z$ from 0 to $n$ such that (cf. figure II.2.1)
$E_0 = E_n$,
$E_{i-1}$ and $E_i$ are edges of $K_i$ for every $i$.
We define
\[ \alpha_0 = 0 \]
and recursively for $i = 1,\ldots,n$
\[ \alpha_i = -\frac{\mu_2(K_i)}{3\mu_1(E_i)}\,f + \frac{\mu_1(E_{i-1})}{2\mu_1(E_i)}\,J_{E_{i-1}}(n_{E_{i-1}}\cdot\nabla u_\mathcal{T}) + \frac{\mu_1(E_{i-1})}{\mu_1(E_i)}\,\alpha_{i-1}. \]
By induction we obtain
\[ \mu_1(E_n)\,\alpha_n = -\sum_{i=1}^n \frac{\mu_2(K_i)}{3}\,f + \sum_{j=0}^{n-1} \frac{\mu_1(E_j)}{2}\,J_{E_j}(n_{E_j}\cdot\nabla u_\mathcal{T}). \]
Since the nodal basis function $\lambda_z$ of the vertex $z$ satisfies
\[ \int_{K_i} \lambda_z = \frac{\mu_2(K_i)}{3} \]
for every $i\in\{1,\ldots,n\}$ and
\[ \int_{E_j} \lambda_z = \frac{\mu_1(E_j)}{2} \]
for every $j\in\{0,\ldots,n-1\}$, we conclude, using the assumption that $f$ and $g$ are piecewise constant and the Galerkin orthogonality tested with $\lambda_z\in S^{1,0}_D(\mathcal{T})$, that
\[ -\sum_{i=1}^n \frac{\mu_2(K_i)}{3}\,f + \sum_{j=0}^{n-1} \frac{\mu_1(E_j)}{2}\,J_{E_j}(n_{E_j}\cdot\nabla u_\mathcal{T}) = -\int_{\omega_z} f\lambda_z + \sum_{j=0}^{n-1} \int_{E_j} J_{E_j}(n_{E_j}\cdot\nabla u_\mathcal{T})\,\lambda_z = 0. \]
Hence we have $\alpha_n = 0$. Therefore we can define a vector field $\sigma_z$ by setting for every $i\in\{1,\ldots,n\}$
\[ (\text{II.2.20})\qquad \sigma_z|_{K_i} = \alpha_i\,\psi_{K_i,E_i} - \Big( \frac12 J_{E_{i-1}}(n_{E_{i-1}}\cdot\nabla u_\mathcal{T}) + \alpha_{i-1} \Big)\,\psi_{K_i,E_{i-1}}. \]
Equations (II.2.19) and the definition of the $\alpha_i$ imply that
\[ (\text{II.2.21})\qquad \operatorname{div}\sigma_z = -\frac13 f \text{ on } K_i, \qquad J_{E_i}(\sigma_z\cdot n_{E_i}) = -\frac12 J_{E_i}(n_{E_i}\cdot\nabla u_\mathcal{T}) \text{ on } E_i \]
holds for every $i\in\{1,\ldots,n\}$.
For a vertex on the boundary $\Gamma$, the construction of $\sigma_z$ must be modified as follows:
For every edge on the Neumann boundary $\Gamma_N$ we must replace $J_E(n_E\cdot\nabla u_\mathcal{T})$ by $g - n\cdot\nabla u_\mathcal{T}$.
If $z$ is a vertex on the Dirichlet boundary, there is at least one edge emanating from $z$ which is contained in $\Gamma_D$. We must choose the enumeration of the edges such that $E_n$ is one of these edges.
With these modifications, equations (II.2.20) and (II.2.21) carry over although in general $\alpha_n \ne 0$ for vertices on the boundary $\Gamma$.
In a final step, we extend the vector fields $\sigma_z$ by zero outside $\omega_z$ and set
\[ (\text{II.2.22})\qquad \sigma_\mathcal{T} = \sum_{z\in\mathcal{N}} \sigma_z. \]
Since every triangle has three vertices and every edge has two vertices, we conclude from equations (II.2.21) that $\sigma_\mathcal{T}$ has the desired properties (II.2.16).
The last inequality in (II.2.19), the definition of the $\alpha_i$, and the observation that $\Delta u_\mathcal{T}$ vanishes element-wise imply that
\[ \|\sigma_z\|_{\omega_z} \le c \Big\{ \sum_{K\subset\omega_z} h_K^2 \|f+\Delta u_\mathcal{T}\|_K^2 + \sum_{E\subset\omega_z\setminus\Gamma} h_E \|J_E(n_E\cdot\nabla u_\mathcal{T})\|_E^2 + \sum_{E\subset\omega_z\cap\Gamma_N} h_E \|g - n_E\cdot\nabla u_\mathcal{T}\|_E^2 \Big\}^{\frac12} \]
holds for every vertex $z\in\mathcal{N}$ with a constant which only depends on the shape parameter of $\mathcal{T}$.

Combining these results, we arrive at the following a posteriori error estimates:
\[ \|\nabla(u-u_\mathcal{T})\| \le \|\sigma_\mathcal{T}\|, \qquad \|\sigma_\mathcal{T}\| \le c\,\|\nabla(u-u_\mathcal{T})\|. \]
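The sweep around a vertex is a one-pass recursion; a sketch with hypothetical data (areas $\mu_2(K_i)$, lengths $\mu_1(E_j)$, jumps $J_{E_j}(n_{E_j}\cdot\nabla u_\mathcal{T})$, constant $f$), checking the telescoped identity for $\mu_1(E_n)\alpha_n$ derived above. For data coming from an actual Galerkin solution the right-hand side, and hence $\alpha_n$, would vanish at an interior vertex:

```python
def vertex_recursion(areas, edges, jumps, f):
    """Compute alpha_0, ..., alpha_n around an interior vertex.

    areas : mu_2(K_1), ..., mu_2(K_n)
    edges : mu_1(E_0), ..., mu_1(E_n), with E_0 = E_n
    jumps : J_{E_j}(n_{E_j} . grad u_T) for j = 0, ..., n-1
    f     : piecewise constant load (constant on omega_z)
    """
    n = len(areas)
    alpha = [0.0]                               # alpha_0 = 0
    for i in range(1, n + 1):
        alpha.append(-areas[i-1] * f / (3.0 * edges[i])
                     + edges[i-1] * jumps[i-1] / (2.0 * edges[i])
                     + edges[i-1] * alpha[i-1] / edges[i])
    return alpha

areas, edges, jumps, f = [0.5, 0.5, 0.5], [1.0, 1.0, 1.0, 1.0], [0.1, 0.2, 0.3], 2.0
alpha = vertex_recursion(areas, edges, jumps, f)
# telescoped identity: mu_1(E_n) alpha_n
#   = -sum_i mu_2(K_i)/3 * f + sum_j mu_1(E_j)/2 * J_j
rhs = -sum(areas) / 3.0 * f + sum(e * j / 2.0 for e, j in zip(edges, jumps))
```

The example data are arbitrary, so $\alpha_n \ne 0$ here; the point is only that the recursion reproduces the closed-form sum.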
II.2.6. Asymptotic exactness. The quality of an a posteriori error estimator is often measured by its efficiency index, i.e., the ratio of the estimated error and of the true error. An error estimator is called efficient if its efficiency index together with its inverse remains bounded for all mesh-sizes. It is called asymptotically exact if its efficiency index tends to one when the mesh-size converges to zero.
In the generic case we have
\[ \Big( \sum_{K\in\mathcal{T}} h_K^2 \|f-f_K\|_K^2 \Big)^{\frac12} = o(h) \qquad\text{and}\qquad \Big( \sum_{E\in\mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \|g-g_E\|_E^2 \Big)^{\frac12} = o(h), \]
where
\[ h = \max_{K\in\mathcal{T}} h_K \]
denotes the maximal mesh-size. On the other hand, the solutions of problems (II.1.2) (p. 30) and (II.1.3) (p. 30) satisfy
\[ \|u-u_\mathcal{T}\|_1 \ge c\,h \]
always but in trivial cases. Hence, the results of sections II.1.9 (p. 38), II.2.1.1 (p. 41), II.2.1.2 (p. 43), II.2.1.3 (p. 44), II.2.2 (p. 46), and II.2.3 (p. 51) imply that the corresponding error estimators are efficient. Their efficiency indices can in principle be estimated explicitly since the constants in the above sections only depend on the constants in the quasi-interpolation error estimate of section I.3.11 (p. 25) and the inverse inequalities of section I.3.12 (p. 26) for which sharp bounds can be derived.

Using super-convergence results one can also prove that on special meshes the error estimators of sections II.1.9 (p. 38), II.2.1.1 (p. 41), II.2.1.2 (p. 43), II.2.1.3 (p. 44), II.2.2 (p. 46), and II.2.3 (p. 51) are asymptotically exact.
The following example shows that asymptotic exactness may not hold on general meshes even if they are strongly structured.

Figure II.2.2. Triangulation of example II.2.7 corresponding to n = 4
Example II.2.7. Consider problem (II.1.1) (p. 30) on the unit square $\Omega = (0,1)^2$ with
\[ \Gamma_N = \big( (0,1)\times\{0\} \big) \cup \big( (0,1)\times\{1\} \big), \qquad g = 0, \qquad f = 1. \]
The exact solution is
\[ u(x,y) = \frac12\, x(1-x). \]
The triangulation $\mathcal{T}$ is obtained as follows (cf. figure II.2.2): $\Omega$ is divided into $n^2$ squares with sides of length $h = \frac1n$, $n\in\mathbb{N}^*$; each square is cut into four triangles by drawing the two diagonals. This triangulation is often called a criss-cross grid. Since the solution $u$ of problem (II.1.1) (p. 30) is quadratic and the Neumann boundary conditions are homogeneous, one easily checks that the solution $u_\mathcal{T}$ of problem (II.1.3) (p. 30) is given by
\[ u_\mathcal{T}(x) = \begin{cases} u(x) & \text{if } x \text{ is a vertex of a square,}\\ u(x) - \dfrac{h^2}{24} & \text{if } x \text{ is a midpoint of a square.} \end{cases} \]
Using this expression for $u_\mathcal{T}$ one can explicitly calculate the error and the error estimator. After some computations one obtains for any square $Q$ which is disjoint from $\Gamma_N$ (with $e = u-u_\mathcal{T}$)
\[ \Big( \sum_{\substack{K\in\mathcal{T}\\ K\subset Q}} \eta_{N,K}^2 \Big)^{\frac12} \Big/ \|\nabla e\|_Q = \sqrt{\frac{17}{6}} \approx 1.68. \]
Hence, the error estimator cannot be asymptotically exact.
II.2.7. Convergence. Assume that we dispose of an error estimator $\eta_K$ which yields global upper and local lower bounds for the error of the solution of problem (II.1.2) (p. 30) and its finite element discretization (II.1.3) (p. 30) and that we apply the general adaptive algorithm I.1.1 (p. 7) with one of the refinement strategies of algorithms III.1.1 (p. 97) and III.1.2 (p. 98). Then one can prove that the error decreases linearly. More precisely: if $u$ denotes the solution of problem (II.1.2) and if $u_i$ denotes the solution of the discrete problem (II.1.3) corresponding to the $i$-th partition $\mathcal{T}_i$, then there is a constant $0<\vartheta<1$, which only depends on the constants $c_*$ and $c^*$ in the error bounds, such that
\[ \|\nabla(u-u_i)\| \le \vartheta^i\, \|\nabla(u-u_0)\|. \]
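The geometric decay bound lets one read off in advance how many adaptive cycles suffice for a prescribed tolerance; a small sketch (pure Python, names ours):

```python
import math

def iterations_to_tolerance(e0, theta, tol):
    """Smallest i with theta**i * e0 <= tol, from the linear decay bound.

    e0    : initial error norm ||grad(u - u_0)||
    theta : contraction factor 0 < theta < 1 of the adaptive loop
    tol   : target error tolerance
    """
    if e0 <= tol:
        return 0
    return math.ceil(math.log(tol / e0) / math.log(theta))

n_iter = iterations_to_tolerance(1.0, 0.5, 1e-3)
```

For instance, with $\vartheta = \frac12$ an error reduction by three orders of magnitude is guaranteed after ten refinement steps.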
II.3. Elliptic problems

II.3.1. Scalar linear elliptic equations. In this section we consider scalar linear elliptic partial differential equations in their general form
\[ \begin{aligned} -\operatorname{div}(A\nabla u) + a\cdot\nabla u + bu &= f && \text{in } \Omega\\ u &= 0 && \text{on } \Gamma_D\\ n\cdot A\nabla u &= g && \text{on } \Gamma_N \end{aligned} \]
where the diffusion $A(x)$ is for every $x\in\Omega$ a symmetric positive definite matrix. We assume that the data satisfy the following conditions:

The diffusion $A$ is continuously differentiable and uniformly elliptic and uniformly isotropic, i.e.,
\[ \varepsilon = \inf_{x\in\Omega} \min_{z\in\mathbb{R}^d\setminus\{0\}} \frac{z^t A(x) z}{z^t z} > 0 \]
and
\[ \kappa = \frac1\varepsilon\, \sup_{x\in\Omega} \max_{z\in\mathbb{R}^d\setminus\{0\}} \frac{z^t A(x) z}{z^t z} \]
is of moderate size.

The convection $a$ is a continuously differentiable vector field and scaled such that
\[ \sup_{x\in\Omega} |a(x)| \le 1. \]

The reaction $b$ is a continuous non-negative scalar function. There is a constant $\beta\ge0$ such that
\[ b - \frac12 \operatorname{div} a \ge \beta \]
for all $x\in\Omega$. Moreover there is a constant $c_b\ge0$ of moderate size such that
\[ \sup_{x\in\Omega} b(x) \le c_b\,\beta. \]

The Dirichlet boundary $\Gamma_D$ has positive $(d-1)$-dimensional measure and includes the inflow boundary $\{x\in\Gamma : a(x)\cdot n(x) < 0\}$.
With these assumptions we can distinguish different regimes:
dominant diffusion: $\sup_{x\in\Omega}|a(x)| \le c_c\,\varepsilon$ and $\beta \le c_b'\,\varepsilon$ with constants of moderate size;
dominant reaction: $\sup_{x\in\Omega}|a(x)| \le c_c\,\sqrt{\varepsilon\beta}$ and $\varepsilon\le\beta$ with a constant $c_c$ of moderate size;
dominant convection: $\varepsilon \ll 1$.
II.3.1.1. Variational formulation. The variational formulation of the above differential equation is given by:

Find $u\in H^1_D(\Omega)$ such that
\[ \int_\Omega \nabla u\cdot A\nabla v + a\cdot\nabla u\, v + buv = \int_\Omega fv + \int_{\Gamma_N} gv \]
holds for all $v\in H^1_D(\Omega)$.

The above assumptions on the differential equation imply that this variational problem admits a unique solution and that the corresponding natural energy norm is given by
\[ |\!|\!|v|\!|\!| = \big( \varepsilon\|\nabla v\|^2 + \beta\|v\|^2 \big)^{\frac12}. \]
The corresponding dual norm is denoted by $|\!|\!|\cdot|\!|\!|_*$ and is given by
\[ |\!|\!|w|\!|\!|_* = \sup_{v\in H^1_D(\Omega)\setminus\{0\}} \frac{1}{|\!|\!|v|\!|\!|} \int_\Omega wv. \]
II.3.1.2. Finite element discretization. The finite element discretization of the above differential equation is given by:

Find $u_\mathcal{T}\in S^{k,0}_D(\mathcal{T})$ such that
\[ \begin{aligned} &\int_\Omega \nabla u_\mathcal{T}\cdot A\nabla v_\mathcal{T} + a\cdot\nabla u_\mathcal{T}\, v_\mathcal{T} + b u_\mathcal{T} v_\mathcal{T}\\ &\qquad + \sum_{K\in\mathcal{T}} \delta_K \int_K \big\{ -\operatorname{div}(A\nabla u_\mathcal{T}) + a\cdot\nabla u_\mathcal{T} + b u_\mathcal{T} \big\}\, a\cdot\nabla v_\mathcal{T}\\ &\quad = \int_\Omega f v_\mathcal{T} + \int_{\Gamma_N} g v_\mathcal{T} + \sum_{K\in\mathcal{T}} \delta_K \int_K f\, a\cdot\nabla v_\mathcal{T} \end{aligned} \]
holds for all $v_\mathcal{T}\in S^{k,0}_D(\mathcal{T})$.

The $\delta_K$ are non-negative stabilization parameters. The case $\delta_K = 0$ for all elements $K$ corresponds to the standard finite element scheme. This choice is appropriate for the diffusion dominated and reaction dominated regimes. In the case of a dominant convection, however, the $\delta_K$ should be chosen strictly positive in order to stabilize the discretization. In this case the discretization is often referred to as streamline upwind Petrov-Galerkin discretization or in short SUPG discretization. With an appropriate choice of the stabilization parameters $\delta_K$ one can prove that the above discrete problem admits a unique solution for all regimes described above.
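The notes do not fix the $\delta_K$; one widespread recipe (not from the notes; it is the choice that is optimal for linear elements in one dimension) couples $\delta_K$ to the element Péclet number. A sketch:

```python
import math

def supg_parameter(h, eps, a_norm):
    """A common SUPG stabilization parameter (1D-optimal recipe, an assumption).

    Pe = |a| h / (2 eps),  delta = h / (2 |a|) * (coth(Pe) - 1/Pe).
    Limits: delta -> h^2 / (12 eps) for Pe -> 0 (vanishing stabilization),
            delta -> h / (2 |a|)    for Pe -> infinity (dominant convection).
    """
    if a_norm == 0.0:
        return 0.0                      # pure diffusion: standard Galerkin
    pe = a_norm * h / (2.0 * eps)
    return h / (2.0 * a_norm) * (1.0 / math.tanh(pe) - 1.0 / pe)

delta_conv = supg_parameter(0.1, 1e-8, 1.0)   # convection dominated element
delta_diff = supg_parameter(0.1, 1.0, 1.0)    # diffusion dominated element
```

In the convection dominated case the parameter approaches $h_K/(2|a|)$, while in the diffusion dominated case it is $O(h_K^2)$ and the scheme effectively reduces to the standard Galerkin method.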
II.3.1.3. Residual error estimates. Denote by $u$ and $u_\mathcal{T}$ the solutions of the variational problem and of its discretization. As in section II.1.6 (p. 32) we define element and edge or face residuals by
\[ R_K(u_\mathcal{T}) = f_K + \operatorname{div}(A\nabla u_\mathcal{T}) - a\cdot\nabla u_\mathcal{T} - b u_\mathcal{T}, \]
\[ R_E(u_\mathcal{T}) = \begin{cases} -J_E(n_E\cdot A\nabla u_\mathcal{T}) & \text{if } E\in\mathcal{E}_{\mathcal{T},\Omega},\\ g_E - n_E\cdot A\nabla u_\mathcal{T} & \text{if } E\in\mathcal{E}_{\mathcal{T},\Gamma_N},\\ 0 & \text{if } E\in\mathcal{E}_{\mathcal{T},\Gamma_D}. \end{cases} \]
Here, as in section I.3.8 (p. 21), $f_K$ and $g_E$ denote the average of $f$ on $K$ and the average of $g$ on $E$, respectively. The residual error estimator is then given by
\[ \eta_{R,K} = \Big( \alpha_K^2 \|R_K(u_\mathcal{T})\|_K^2 + \sum_{E\in\mathcal{E}_K} \varepsilon^{-\frac12}\alpha_E \|R_E(u_\mathcal{T})\|_E^2 \Big)^{\frac12} \]
with
\[ \alpha_S = \min\big\{ h_S\,\varepsilon^{-\frac12},\, \beta^{-\frac12} \big\} \qquad\text{for } S\in\mathcal{T}\cup\mathcal{E}_\mathcal{T}. \]
One can prove that $\eta_{R,K}$ yields global upper and lower bounds for the error measured in the norm $|\!|\!|u-u_\mathcal{T}|\!|\!| + |\!|\!|a\cdot\nabla(u-u_\mathcal{T})|\!|\!|_*$. In the case of dominant diffusion or dominant reaction, the dual norm $|\!|\!|\cdot|\!|\!|_*$ can be dropped. In this case, also local lower error bounds can be established.
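The weights $\alpha_S$ switch between the diffusive scaling $h_S\,\varepsilon^{-1/2}$ and the reactive cut-off $\beta^{-1/2}$; a one-line sketch (our convention of treating $\beta = 0$ as the purely diffusive case is an assumption):

```python
def weight(h, eps, beta):
    """alpha_S = min(h_S * eps**(-1/2), beta**(-1/2)).

    For beta = 0 the reactive bound is +infinity, so the diffusive
    scaling h_S / sqrt(eps) is used.
    """
    diffusive = h * eps ** -0.5
    reactive = float("inf") if beta == 0.0 else beta ** -0.5
    return min(diffusive, reactive)

a1 = weight(0.1, 1.0, 0.0)    # no reaction: diffusive scaling h/sqrt(eps)
a2 = weight(0.1, 1e-4, 1.0)   # small eps: reactive cut-off 1/sqrt(beta)
```

On coarse meshes (or for small $\varepsilon$) the reactive cut-off is active, which is exactly what makes the estimator robust in the singularly perturbed regimes.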
II.3.1.4. Other error estimators. The error estimators of sections II.2.1.1 (p. 41), II.2.1.2 (p. 43) and II.2.1.3 (p. 44) which are based on the solution of auxiliary local discrete problems can easily be extended to the present situation. One only has to replace the differential operator $u\mapsto -\Delta u$ by the actual differential operator $u\mapsto -\operatorname{div}(A\nabla u) + a\cdot\nabla u + bu$ and to use the above definition of the element and edge or face residuals.

In the cases of dominant diffusion or of dominant reaction, the same remark applies to the hierarchical estimator of section II.2.2 (p. 46) and the equilibrated residual method of section II.2.4 (p. 53). Both techniques have difficulties in the case of dominant convection, due to the lacking symmetry of the bilinear form associated with the variational problem.

In the case of dominant diffusion, the averaging technique of section II.2.3 (p. 51) can easily be extended to the present situation. One only has to replace the gradient $\nabla u_\mathcal{T}$ by the oblique derivative $A\nabla u_\mathcal{T}$. In the cases of dominant reaction or of dominant convection, however, the averaging technique is not appropriate since it is based on the diffusive part of the differential operator which is no longer dominant.
II.3.2. Mixed formulation of the Poisson equation. In this section we once again consider the model problem. But now we impose pure homogeneous Dirichlet boundary conditions and, most important, write the problem as a first order system by introducing $\sigma = \nabla u$ as an additional unknown:
\[ \begin{aligned} -\operatorname{div}\sigma &= f && \text{in } \Omega\\ \sigma &= \nabla u && \text{in } \Omega\\ u &= 0 && \text{on } \Gamma. \end{aligned} \]
Our interest in this problem is twofold:
Its finite element discretization introduced below allows the direct approximation of $\nabla u$ without resorting to a differentiation of the finite element approximation $u_\mathcal{T}$ considered so far.
Its analysis prepares the a posteriori error analysis of the equations of linearized elasticity considered in the next two sections where mixed methods are mandatory to avoid locking phenomena.

For the variational formulation, we introduce the space
\[ H(\operatorname{div};\Omega) = \big\{ \sigma\in L^2(\Omega)^d : \operatorname{div}\sigma\in L^2(\Omega) \big\} \]
and its norm
\[ \|\sigma\|_H = \big( \|\sigma\|^2 + \|\operatorname{div}\sigma\|^2 \big)^{\frac12}. \]
Next, we multiply the first equation of the above differential equation by a function $v\in L^2(\Omega)$ and the second equation by a vector field $\tau\in H(\operatorname{div};\Omega)$, integrate both expressions over $\Omega$ and use integration by parts for the integral involving $\nabla u$. We thus arrive at the problem:

Find $\sigma\in H(\operatorname{div};\Omega)$ and $u\in L^2(\Omega)$ such that
\[ \begin{aligned} \int_\Omega \sigma\cdot\tau + \int_\Omega u\operatorname{div}\tau &= 0\\ \int_\Omega \operatorname{div}\sigma\, v &= -\int_\Omega fv \end{aligned} \]
holds for all $\tau\in H(\operatorname{div};\Omega)$ and $v\in L^2(\Omega)$.

The differential equation and its variational formulation are equivalent in the usual weak sense: every classical solution of the differential equation is a solution of the variational problem and every solution of the variational problem which is sufficiently regular is a classical solution of the differential equation.
To keep the exposition as simple as possible, we only consider the simplest discretization which is given by the lowest order Raviart-Thomas spaces. For every element $K\in\mathcal{T}$ we set
\[ RT_0(K) = R_0(K)^d + R_0(K) \begin{pmatrix} x_1\\ \vdots\\ x_d \end{pmatrix} \]
and
\[ RT_0(\mathcal{T}) = \Big\{ \sigma_\mathcal{T} : \sigma_\mathcal{T}|_K\in RT_0(K) \text{ for all } K\in\mathcal{T},\; \int_E J_E(n_E\cdot\sigma_\mathcal{T}) = 0 \text{ for all } E\in\mathcal{E}_\Omega \Big\}. \]
The degrees of freedom associated with $RT_0(\mathcal{T})$ are the values of the normal components of the $\sigma_\mathcal{T}$ evaluated at the midpoints of edges, if $d = 2$, or the barycentres of faces, if $d = 3$. Since the normal components $n_E\cdot\sigma_\mathcal{T}$ are constant on the edges respective faces, the condition
\[ \int_E J_E(n_E\cdot\sigma_\mathcal{T}) = 0 \text{ for all } E\in\mathcal{E}_\Omega \]
ensures that the space $RT_0(\mathcal{T})$ is contained in $H(\operatorname{div};\Omega)$. The mixed finite element approximation of the model problem is then given by:

Find $\sigma_\mathcal{T}\in RT_0(\mathcal{T})$ and $u_\mathcal{T}\in S^{0,-1}(\mathcal{T})$ such that
\[ \begin{aligned} \int_\Omega \sigma_\mathcal{T}\cdot\tau_\mathcal{T} + \int_\Omega u_\mathcal{T}\operatorname{div}\tau_\mathcal{T} &= 0\\ \int_\Omega \operatorname{div}\sigma_\mathcal{T}\, v_\mathcal{T} &= -\int_\Omega f v_\mathcal{T} \end{aligned} \]
holds for all $\tau_\mathcal{T}\in RT_0(\mathcal{T})$ and $v_\mathcal{T}\in S^{0,-1}(\mathcal{T})$.
The a posteriori error analysis of the mixed formulation of the model problem relies on the Helmholtz decomposition of vector fields. For its description we define the so-called curl operator $\operatorname{curl}$ by
\[ \operatorname{curl}\varphi = \nabla\times\varphi = \begin{pmatrix} \dfrac{\partial\varphi_3}{\partial x_2} - \dfrac{\partial\varphi_2}{\partial x_3}\\[1ex] \dfrac{\partial\varphi_1}{\partial x_3} - \dfrac{\partial\varphi_3}{\partial x_1}\\[1ex] \dfrac{\partial\varphi_2}{\partial x_1} - \dfrac{\partial\varphi_1}{\partial x_2} \end{pmatrix} \qquad\text{if } \varphi:\Omega\to\mathbb{R}^3, \]
\[ \operatorname{curl}\varphi = \frac{\partial\varphi_2}{\partial x_1} - \frac{\partial\varphi_1}{\partial x_2} \qquad\text{if } \varphi:\Omega\to\mathbb{R}^2, \]
\[ \operatorname{curl} v = \begin{pmatrix} \dfrac{\partial v}{\partial x_2}\\[1ex] -\dfrac{\partial v}{\partial x_1} \end{pmatrix} \qquad\text{if } v:\Omega\to\mathbb{R} \text{ and } d = 2. \]
The Helmholtz decomposition then states that every vector field can be split into a gradient and a rotational component. More precisely, there are two continuous linear operators $R$ and $G$ such that every vector field $\sigma$ can be split in the form
\[ \sigma = \nabla(G\sigma) + \operatorname{curl}(R\sigma). \]
Using the Helmholtz decomposition one can prove the following a posteriori error estimates:
\[ \big( \|\sigma-\sigma_\mathcal{T}\|_H^2 + \|u-u_\mathcal{T}\|^2 \big)^{\frac12} \le c^* \Big( \sum_{K\in\mathcal{T}} \eta_{R,K}^2 \Big)^{\frac12}, \]
\[ \eta_{R,K} \le c_* \big( \|\sigma-\sigma_\mathcal{T}\|_{H(\operatorname{div};\omega_K)}^2 + \|u-u_\mathcal{T}\|_{\omega_K}^2 \big)^{\frac12} \]
with
\[ \begin{aligned} \eta_{R,K} = \Big( & h_K^2\|\operatorname{curl}\sigma_\mathcal{T}\|_K^2 + h_K^2\|\sigma_\mathcal{T}\|_K^2 + \|f+\operatorname{div}\sigma_\mathcal{T}\|_K^2\\ &+ \frac12 \sum_{E\in\mathcal{E}_{K,\Omega}} h_E \big\|J_E\big(\sigma_\mathcal{T} - (\sigma_\mathcal{T}\cdot n_E)n_E\big)\big\|_E^2 + \sum_{E\in\mathcal{E}_{K,\Gamma}} h_E \big\|\sigma_\mathcal{T} - (\sigma_\mathcal{T}\cdot n_E)n_E\big\|_E^2 \Big)^{\frac12}. \end{aligned} \]

Remark II.3.1. The terms
$\|\operatorname{curl}\sigma_\mathcal{T}\|_K$ with $K\in\mathcal{T}$,
$\|J_E(\sigma_\mathcal{T} - (\sigma_\mathcal{T}\cdot n_E)n_E)\|_E$ with $E\in\mathcal{E}_\Omega$,
$\|\sigma_\mathcal{T} - (\sigma_\mathcal{T}\cdot n_E)n_E\|_E$ with $E\in\mathcal{E}_\Gamma$
in $\eta_{R,K}$ are the residuals of $\sigma_\mathcal{T}$ corresponding to the equation $\operatorname{curl}\sigma = 0$. Due to the condition $\sigma = \nabla u$, this equation is a redundant one for the analytical problem. For the discrete problem, however, it is an extra condition which is not incorporated.
II.3.3. Displacement form of the equations of linearized elasticity. The equations of linearized elasticity are given by the boundary value problem
\[ \begin{aligned} \varepsilon &= Du && \text{in } \Omega\\ \varepsilon &= C^{-1}\sigma && \text{in } \Omega\\ -\operatorname{div}\sigma &= f && \text{in } \Omega\\ \operatorname{as}(\sigma) &= 0 && \text{in } \Omega\\ u &= 0 && \text{on } \Gamma_D\\ n\cdot\sigma &= 0 && \text{on } \Gamma_N \end{aligned} \]
where the various quantities are
$u:\Omega\to\mathbb{R}^d$ the displacement,
$Du = \frac12(\nabla u + \nabla u^t) = \big( \frac12\big( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \big) \big)_{1\le i,j\le d}$ the deformation tensor or symmetric gradient,
$\varepsilon:\Omega\to\mathbb{R}^{d\times d}$ the strain tensor,
$\sigma:\Omega\to\mathbb{R}^{d\times d}$ the stress tensor,
$C$ the elasticity tensor,
$f:\Omega\to\mathbb{R}^d$ the given body load, and
$\operatorname{as}(\tau) = \tau - \tau^t$ the skew symmetric part of a given tensor.
The most important example of an elasticity tensor is given by
\[ C\varepsilon = \lambda\operatorname{tr}(\varepsilon)I + 2\mu\,\varepsilon \]
where $I\in\mathbb{R}^{d\times d}$ is the unit tensor, $\operatorname{tr}(\varepsilon)$ denotes the trace of $\varepsilon$, and $\lambda,\mu>0$ are the Lamé parameters. To simplify the presentation we assume throughout this section that $C$ takes the above form. We are mainly interested in estimates which are uniform with respect to the Lamé parameter $\lambda$.
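For this particular $C$ the action on a strain tensor is immediate; a minimal sketch (pure Python, the dimension inferred from the tensor):

```python
def C_apply(eps, lam, mu):
    """Apply C(eps) = lam * tr(eps) * I + 2 * mu * eps to a d x d tensor.

    eps : strain tensor as a nested list
    lam, mu : Lame parameters
    """
    d = len(eps)
    tr = sum(eps[i][i] for i in range(d))       # trace of the strain
    return [[lam * tr * (1.0 if i == j else 0.0) + 2.0 * mu * eps[i][j]
             for j in range(d)] for i in range(d)]

# uniaxial strain e_11 = 1 with lambda = 1, mu = 2
sig = C_apply([[1.0, 0.0], [0.0, 0.0]], 1.0, 2.0)
```

Note how even a purely uniaxial strain produces a transverse stress component $\lambda\operatorname{tr}(\varepsilon)$, which is the coupling that degenerates in the incompressible limit $\lambda\to\infty$.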
II.3.3.1. Displacement formulation. The simplest discretization of the equations of linearized elasticity is based on its displacement formulation:
\[ \begin{aligned} -\operatorname{div}(C\,Du) &= f && \text{in } \Omega\\ u &= 0 && \text{on } \Gamma_D\\ n\cdot C\,Du &= 0 && \text{on } \Gamma_N. \end{aligned} \]
The corresponding variational problem is given by:

Find $u\in H^1_D(\Omega)^d$ such that
\[ \int_\Omega Du : C\,Dv = \int_\Omega f\cdot v \]
holds for all $v\in H^1_D(\Omega)^d$.

Here, $:$ denotes the inner product of two tensors, i.e.,
\[ \sigma : \tau = \sum_{1\le i,j\le d} \sigma_{ij}\tau_{ij}. \]
The variational problem is the Euler-Lagrange equation corresponding to the problem of minimizing the total energy
\[ J(u) = \frac12\int_\Omega Du : C\,Du - \int_\Omega f\cdot u. \]
II.3.3.2. Finite element discretization. The finite element discretization of the displacement formulation of the equations of linearized elasticity is given by:

Find $u_\mathcal{T}\in S^{k,0}_D(\mathcal{T})^d$ such that
\[ \int_\Omega Du_\mathcal{T} : C\,Dv_\mathcal{T} = \int_\Omega f\cdot v_\mathcal{T} \]
holds for all $v_\mathcal{T}\in S^{k,0}_D(\mathcal{T})^d$.

It is well known that this problem admits a unique solution.
II.3.3.3. Residual error estimates. The methods of sections II.1 (p. 30) and II.3.1 (p. 62) directly carry over to this situation. They yield the residual a posteriori error estimator
\[ \eta_{R,K} = \Big( h_K^2 \|f_\mathcal{T} + \operatorname{div}(C\,Du_\mathcal{T})\|_K^2 + \frac12 \sum_{E\in\mathcal{E}_K\cap\mathcal{E}_{\mathcal{T},\Omega}} h_E \|J_E(n_E\cdot C\,Du_\mathcal{T})\|_E^2 + \sum_{E\in\mathcal{E}_K\cap\mathcal{E}_{\mathcal{T},\Gamma_N}} h_E \|n_E\cdot C\,Du_\mathcal{T}\|_E^2 \Big)^{\frac12} \]
which gives global upper and local lower bounds on the $H^1$-norm of the error in the displacements.
II.3.3.4. Other error estimators. Similarly the methods of sections II.2.1.1 (p. 41), II.2.1.2 (p. 43), II.2.1.3 (p. 44), II.2.2 (p. 46), II.2.3 (p. 51) and II.2.4 (p. 53) can be extended to the displacement formulation of the equations of linearized elasticity. Now, of course, the auxiliary problems are elasticity problems in displacement form and terms of the form $\nabla u_\mathcal{T}$ must be replaced by the stress tensor $\sigma(u_\mathcal{T}) = C\,Du_\mathcal{T}$.
II.3.4. Mixed formulation of the equations of linearized elasticity. Though appealing by its simplicity the displacement formulation of the equations of linearized elasticity and the corresponding finite element discretizations suffer from serious drawbacks:
The displacement formulation and its discretization break down for nearly incompressible materials, which is reflected by the so-called locking phenomenon.
The quality of the a posteriori error estimates deteriorates in the incompressible limit. More precisely, the constants in the upper and lower error bounds depend on the Lamé parameter $\lambda$ and tend to infinity for large values of $\lambda$.
Often, the displacement field is not of primary interest but the stress tensor is of physical interest. This quantity, however, is not directly discretized in displacement methods and must a posteriori be extracted from the displacement field, which often leads to unsatisfactory results.

These drawbacks can be overcome by suitable mixed formulations of the equations of linearized elasticity and appropriate mixed finite element discretizations. Correspondingly these are in the focus of the following subsections. We are primarily interested in discretizations and a posteriori error estimates which are robust in the sense that their quality does not deteriorate in the incompressible limit. This is reflected by the need for estimates which are uniform with respect to the Lamé parameter $\lambda$.
II.3.4.1. The Hellinger-Reissner principle. To simplify the notation we introduce the spaces
\[ H = H(\operatorname{div};\Omega)^d = \{ \sigma\in L^2(\Omega)^{d\times d} : \operatorname{div}\sigma\in L^2(\Omega)^d \}, \]
\[ V = L^2(\Omega)^d, \]
\[ W = \{ \gamma\in L^2(\Omega)^{d\times d} : \gamma + \gamma^t = 0 \} \]
and equip them with their natural norms
\[ \|\sigma\|_H = \big( \|\sigma\|^2 + \|\operatorname{div}\sigma\|^2 \big)^{\frac12}, \qquad \|u\|_V = \|u\|, \qquad \|\gamma\|_W = \|\gamma\|. \]
Here, the divergence of a tensor is taken row by row, i.e.,
\[ (\operatorname{div}\sigma)_i = \sum_{1\le j\le d} \frac{\partial\sigma_{ij}}{\partial x_j}. \]

The Hellinger-Reissner principle is a mixed variational formulation of the equations of linearized elasticity, in which the strain $\varepsilon$ is eliminated. It is given by:

Find $\sigma\in H$, $u\in V$, $\gamma\in W$ such that
\[ \begin{aligned} \int_\Omega C^{-1}\sigma : \tau + \int_\Omega \operatorname{div}\tau\cdot u + \int_\Omega \tau : \gamma &= 0\\ \int_\Omega \operatorname{div}\sigma\cdot v &= -\int_\Omega f\cdot v\\ \int_\Omega \sigma : \eta &= 0 \end{aligned} \]
holds for all $\tau\in H$, $v\in V$, $\eta\in W$.

It can be proven that the bilinear form corresponding to the left hand-sides of the above problem is uniformly continuous and coercive with respect to the Lamé parameter $\lambda$. Thanks to this stability result, all forthcoming constants are independent of $\lambda$. Hence, the corresponding estimates are robust for nearly incompressible materials.
II.3.4.2. PEERS and BDMS elements. We consider two types of mixed finite element discretizations of the Hellinger-Reissner principle:
the PEERS element and
the BDMS elements.
Both families have proven to be particularly well suited to avoid locking phenomena. They are based on the curl operators of section II.3.2 (p. 65) and the Raviart-Thomas space
\[ RT_0(K) = R_0(K)^d + R_0(K) \begin{pmatrix} x_1\\ \vdots\\ x_d \end{pmatrix}. \]
For any integer $k\in\mathbb{N}$ and any element $K\in\mathcal{T}$ we then set
\[ B_k(K) = \big\{ \tau:\Omega\to\mathbb{R}^{d\times d} : (\tau_{i1},\ldots,\tau_{id}) = \operatorname{curl}(\psi_K w_i),\; w_i\in R_k(K)^{2d-3},\; 1\le i\le d \big\}, \]
\[ BDM_k(K) = R_k(K)^d. \]
Both types of discretizations are obtained by replacing the spaces $H$, $V$ and $W$ in the variational problem corresponding to the Hellinger-Reissner principle by discrete counterparts $H_\mathcal{T}$, $V_\mathcal{T}$ and $W_\mathcal{T}$, respectively. They differ by the choice of these spaces $H_\mathcal{T}$, $V_\mathcal{T}$ and $W_\mathcal{T}$ which are given by
\[ \begin{aligned} H_\mathcal{T} &= \{ \sigma_\mathcal{T}\in H : \sigma_\mathcal{T}|_K\in RT_0(K)^d\oplus B_0(K),\; K\in\mathcal{T},\; \sigma_\mathcal{T}\cdot n = 0 \text{ on } \Gamma_N \},\\ V_\mathcal{T} &= \{ v_\mathcal{T}\in V : v_\mathcal{T}|_K\in R_0(K)^d,\; K\in\mathcal{T} \},\\ W_\mathcal{T} &= \{ \gamma_\mathcal{T}\in W\cap C(\overline\Omega)^{d\times d} : \gamma_\mathcal{T}|_K\in R_1(K)^{d\times d},\; K\in\mathcal{T} \} \end{aligned} \]
for the PEERS element and
\[ \begin{aligned} H_\mathcal{T} &= \{ \sigma_\mathcal{T}\in H : \sigma_\mathcal{T}|_K\in BDM_k(K)^d\oplus B_{k-1}(K),\; K\in\mathcal{T},\; \sigma_\mathcal{T}\cdot n = 0 \text{ on } \Gamma_N \},\\ V_\mathcal{T} &= \{ v_\mathcal{T}\in V : v_\mathcal{T}|_K\in R_{k-1}(K)^d,\; K\in\mathcal{T} \},\\ W_\mathcal{T} &= \{ \gamma_\mathcal{T}\in W\cap C(\overline\Omega)^{d\times d} : \gamma_\mathcal{T}|_K\in R_k(K)^{d\times d},\; K\in\mathcal{T} \} \end{aligned} \]
for the BDMS elements.

For both discretizations it can be proven that they admit a unique solution.
II.3.4.3. Residual a posteriori error estimates. In what follows we always denote by $(\sigma,u,\gamma)\in H\times V\times W$ the unique solution of the variational problem corresponding to the Hellinger-Reissner principle and by $(\sigma_\mathcal{T},u_\mathcal{T},\gamma_\mathcal{T})\in H_\mathcal{T}\times V_\mathcal{T}\times W_\mathcal{T}$ its finite element approximation using the PEERS or BDMS elements.

For every edge or face $E$ and every tensor field $\tau:\Omega\to\mathbb{R}^{d\times d}$ we denote by
\[ \gamma_E(\tau) = \tau - (n^t\tau n)\, n\otimes n \]
the tangential component of $\tau$.

With these notations we define for every element $K\in\mathcal{T}$ the residual a posteriori error estimator $\eta_{R,K}$ by
\[ \begin{aligned} \eta_{R,K} = \Big( & h_K^2 \|C^{-1}\sigma_\mathcal{T} + \gamma_\mathcal{T} - \nabla u_\mathcal{T}\|_K^2 + \frac{1}{\mu^2}\|f + \operatorname{div}\sigma_\mathcal{T}\|_K^2 + \frac{1}{\mu^2}\|\operatorname{as}(\sigma_\mathcal{T})\|_K^2\\ &+ h_K^2 \|\operatorname{curl}(C^{-1}\sigma_\mathcal{T} + \gamma_\mathcal{T})\|_K^2 + \sum_{E\in\mathcal{E}_K\cap\mathcal{E}_{\mathcal{T},\Omega}} h_E \big\|J_E\big(\gamma_E(C^{-1}\sigma_\mathcal{T} + \gamma_\mathcal{T})\big)\big\|_E^2\\ &+ \sum_{E\in\mathcal{E}_K\cap\mathcal{E}_{\mathcal{T},\Gamma_D}} h_E \big\|\gamma_E(C^{-1}\sigma_\mathcal{T} + \gamma_\mathcal{T})\big\|_E^2 \Big)^{\frac12}. \end{aligned} \]
It can be proven that this estimator yields upper and lower bounds on the error
\[ \Big( \frac{1}{\mu^2}\|\sigma-\sigma_\mathcal{T}\|_H^2 + \|u-u_\mathcal{T}\|^2 + \|\gamma-\gamma_\mathcal{T}\|^2 \Big)^{\frac12} \]
up to multiplicative constants which are independent of the Lamé parameters $\lambda$ and $\mu$.
II.3.4.4. Local Neumann problems. We want to treat local auxiliary problems, which are based on a single element K ∈ T. Furthermore we want to impose pure Neumann boundary conditions. Since the displacement of a linear elasticity problem with pure Neumann boundary conditions is unique only up to rigid body motions, we must factor out the rigid body motions R_K of the element K. These are given by

  R_K = { v = (a, b) + c(−x₂, x₁) : a, b, c ∈ ℝ }  if d = 2,
  R_K = { v = a + b ∧ x : a, b ∈ ℝ³ }  if d = 3.

We set

  H_K = BDM_m(K)^d ⊕ B_{m−1}(K),
  V_K = R_{m−1}(K)^d / R_K,
  W_K = { γ_K ∈ R_m(K)^{d×d} : γ_K + γ_K^t = 0 }

and

  X_K = H_K × V_K × W_K,

where m ≥ k + 2d. With this definition of spaces it can be proven that the following auxiliary local discrete problem admits a unique solution:
Find (σ_K, u_K, γ_K) ∈ X_K such that

  ∫_K C⁻¹σ_K : τ_K + ∫_K u_K · div τ_K + ∫_K γ_K : τ_K
      = −∫_K C⁻¹σ_T : τ_K − ∫_K u_T · div τ_K − ∫_K γ_T : τ_K
  ∫_K v_K · div σ_K = −∫_K f · v_K − ∫_K v_K · div σ_T
  ∫_K σ_K : η_K = −∫_K σ_T : η_K

holds for all (τ_K, v_K, η_K) ∈ X_K.
With the solution of this problem we define the error estimator η_{N,K} by

  η_{N,K} = { (1/μ²) ‖σ_K‖²_{H(div;K)} + ‖u_K‖²_K + ‖γ_K‖²_K }^{1/2}.

It yields upper and lower bounds on the error up to multiplicative constants which are independent of the Lamé parameters λ and μ.
Remark II.3.2. The above auxiliary problem is a discrete linear elasticity problem with pure Neumann boundary conditions on the single element K. In order to implement the error estimator η_{N,K} one has to construct a basis for the space V_K. This can be done by taking the standard basis of R_{m−1}(K)^d and dropping those degrees of freedom that belong to the rigid body motions. Afterwards one has to compute the stiffness matrix for each element K and solve the associated local auxiliary problem.
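As a small illustration of this remark, the following sketch builds the d = 2 rigid body modes and a basis of their orthogonal complement in the nodal degree-of-freedom space. It is a hypothetical helper, not code from the notes, and for brevity it uses piecewise linear nodal dofs on a single triangle (the space V_K actually uses polynomials of degree m − 1):

```python
import numpy as np

# Vertices of a sample triangle K (hypothetical element).
verts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

# Rigid body modes in 2D: v = (a, b) + c*(-x2, x1).
def mode(a, b, c):
    # Evaluate the mode at the vertices -> vector of nodal dofs (length 6).
    return np.array([(a - c * x2, b + c * x1) for x1, x2 in verts]).ravel()

R = np.column_stack([mode(1, 0, 0), mode(0, 1, 0), mode(0, 0, 1)])
assert np.linalg.matrix_rank(R) == 3          # dim R_K = 3 for d = 2

# Dofs for the quotient space: orthogonal complement of the rigid body
# modes within the 6-dimensional nodal dof space of the element.
Q, _ = np.linalg.qr(R, mode='complete')
complement = Q[:, 3:]                          # shape (6, 3)
```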
II.3.4.5. Local Dirichlet problems. Now we want to construct an error estimator which is similar to the estimator η_{D,K} of section II.2.1.2 (p. 43) and which is based on the solution of discrete linear elasticity problems with Dirichlet boundary conditions on the patches ω_K. To this end we associate with every element K the spaces

  H̃_K = { σ_T ∈ H(div; ω_K)^d : σ_T|_{K′} ∈ BDM_m(K′)^d ⊕ B_{m−1}(K′)
          for all K′ ∈ T with E_{K′} ∩ E_K ≠ ∅ },
  Ṽ_K = { v_T ∈ L²(ω_K)^d : v_T|_{K′} ∈ R_{m−1}(K′)^d
          for all K′ ∈ T with E_{K′} ∩ E_K ≠ ∅ },
  W̃_K = { γ_T ∈ L²(ω_K)^{d×d} ∩ C(ω_K)^{d×d} : γ_T + γ_T^t = 0,
          γ_T|_{K′} ∈ R_m(K′)^{d×d} for all K′ ∈ T with E_{K′} ∩ E_K ≠ ∅ }

and

  X̃_K = H̃_K × Ṽ_K × W̃_K

and consider the following auxiliary local problem:
Find (σ_K, u_K, γ_K) ∈ X̃_K such that

  ∫_{ω_K} C⁻¹σ_K : τ_K + ∫_{ω_K} u_K · div τ_K + ∫_{ω_K} γ_K : τ_K
      = −∫_{ω_K} C⁻¹σ_T : τ_K − ∫_{ω_K} u_T · div τ_K − ∫_{ω_K} γ_T : τ_K
  ∫_{ω_K} v_K · div σ_K = −∫_{ω_K} f · v_K − ∫_{ω_K} v_K · div σ_T
  ∫_{ω_K} σ_K : η_K = −∫_{ω_K} σ_T : η_K

holds for all (τ_K, v_K, η_K) ∈ X̃_K.
Again it can be proven that this problem admits a unique solution. With it we define the error estimator η_{D,K} by

  η_{D,K} = { (1/μ²) ‖σ_K‖²_{H(div;ω_K)} + ‖u_K‖²_{ω_K} + ‖γ_K‖²_{ω_K} }^{1/2}.

It yields upper and lower bounds on the error up to multiplicative constants which are independent of the Lamé parameters λ and μ.
II.3.5. Non-linear problems. For non-linear elliptic problems, residual a posteriori error estimators are constructed in the same way as for linear problems. The estimators consist of two ingredients:
• element residuals which consist of the element-wise residual of the actual discrete solution with respect to the strong form of the differential equation,
• edge or face residuals which consist of the inter-element jump of that trace operator which naturally links the strong and weak form of the differential equation, where all differential operators are evaluated at the current discrete solution.
Example II.3.3. If the differential equation takes the form

  −div a(x, u, ∇u) + b(x, u, ∇u) = 0  in Ω
  u = 0  on Γ_D
  n · a(x, u, ∇u) = g  on Γ_N

with a suitable differentiable vector-field a : Ω × ℝ × ℝ^d → ℝ^d and a suitable continuous function b : Ω × ℝ × ℝ^d → ℝ, the element and edge or face residuals are given by

  R_K(u_T) = −div a(x, u_T, ∇u_T) + b(x, u_T, ∇u_T)

and

  R_E(u_T) =
    −J_E(n_E · a(x, u_T, ∇u_T))   if E ∈ E_{T,Ω},
    g_E − n_E · a(x, u_T, ∇u_T)   if E ∈ E_{T,Γ_N},
    0                             if E ∈ E_{T,Γ_D},

respectively, where g_E denotes the mean value of g on E.
Some peculiarities arise from the non-linearity and must be taken into account:
• The error estimation only makes sense if the discrete problem is based on a well-posed variational formulation of the differential equation. Here, well-posedness means that the non-linear mapping associated with the variational problem must at least be continuous. In order to fulfil this requirement one often has to leave the Hilbert setting and to replace H¹(Ω) by general Sobolev spaces W^{1,p}(Ω) with a Lebesgue exponent p ≠ 2, typically p > d. The choice of the Lebesgue exponent p is not at the disposal of the user; it is dictated by the nature of the non-linearity such as, e.g., its growth. When leaving the Hilbert setting, the L²-norms used in the error estimators must be replaced by corresponding L^p-norms and the weighting factors must be adapted too. Thus, in a general L^p-setting, a typical residual error estimator takes the form

  η_{R,K} = { h_K^p ‖R_K(u_T)‖^p_{p;K} + (1/2) Σ_{E∈E_K} h_E ‖R_E(u_T)‖^p_{p;E} }^{1/p}.
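As a small numerical illustration (with hypothetical residual norms; in a real code these would come from quadrature), the L^p estimator above can be evaluated as follows:

```python
def eta_R_Lp(h_K, RK_norm, edge_data, p):
    """eta_{R,K} = ( h_K^p ||R_K||_{p;K}^p
                     + 1/2 * sum_E h_E ||R_E||_{p;E}^p )^(1/p).
    edge_data is a list of (h_E, ||R_E||_{p;E}) for the edges of K."""
    val = h_K**p * RK_norm**p
    val += 0.5 * sum(h_E * RE**p for h_E, RE in edge_data)
    return val**(1.0 / p)

# Hypothetical numbers: p = 3 (> d = 2), a triangle with three edges.
est = eta_R_Lp(h_K=0.1, RK_norm=2.0, edge_data=[(0.1, 1.0)] * 3, p=3.0)
```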
• Non-linear problems in general have multiple solutions. Therefore any error estimator can at best control the error of the actual discrete solution with respect to a near-by solution of the variational problem. Moreover, an error control often is possible only if the actual grid is fine enough. Unfortunately, the notions "near-by" and "fine enough" can in general not be quantified.
• Non-linear problems often exhibit bifurcation or turning points. Of course, one would like to keep track of these phenomena with the help of the error estimator. Unfortunately, this is often possible only on a heuristic basis. Rigorous arguments in general require additional a priori information on the structure of the solution manifold which often is not available.
For non-linear problems, error estimators based on the solution of auxiliary discrete problems can be devised as in section II.2.1 (p. 40) for the model problem. Their solution is considerably simplified by the following observations:
• The non-linearity only enters on the right-hand side of the auxiliary problems via the element and edge or face residuals described above.
• The left-hand sides of the auxiliary problems correspond to linear differential operators which are obtained by linearizing the non-linear problem at the current discrete solution.
• Variable coefficients can be frozen at the current discrete solution and suitable points of the local patch such as, e.g., the barycentres of the elements and edges or faces.
The error estimators of sections II.2.2 (p. 46), II.2.3 (p. 51) and II.2.4 (p. 53) can in general be applied to non-linear problems only on a heuristic basis; rigorous results are at present only available for some of these estimators applied to particular model problems.
II.4. Parabolic problems

II.4.1. Scalar linear parabolic equations. In this section we extend the results of section II.3 (p. 62) to general linear parabolic equations of second order:

  ∂_t u − div(A∇u) + a · ∇u + αu = f  in Ω × (0, T]
  u = 0  on Γ_D × (0, T]
  n · A∇u = g  on Γ_N × (0, T]
  u = u₀  in Ω.

Here, Ω ⊂ ℝ^d, d ≥ 2, is a bounded polygonal cross-section with a Lipschitz boundary Γ consisting of two disjoint parts Γ_D and Γ_N. The final time T is arbitrary, but kept fixed in what follows.
We assume that the data satisfy the following conditions:
• The diffusion A is a continuously differentiable matrix-valued function and symmetric, uniformly positive definite and uniformly isotropic, i.e.,

  ε = inf_{0<t≤T, x∈Ω} min_{z∈ℝ^d\{0}} (zᵀA(x,t)z)/(zᵀz) > 0

and

  κ = ε⁻¹ sup_{0<t≤T, x∈Ω} max_{z∈ℝ^d\{0}} (zᵀA(x,t)z)/(zᵀz)

is of moderate size.
• The convection a is a continuously differentiable vector-field and scaled such that

  sup_{0<t≤T, x∈Ω} |a(x,t)| ≤ 1.

• The reaction α is a continuous non-negative scalar function. There is a constant β ≥ 0 such that

  α − (1/2) div a ≥ β

for almost all x ∈ Ω and 0 < t ≤ T. Moreover there is a constant c_b ≥ 0 of moderate size such that

  sup_{0<t≤T, x∈Ω} |α(x,t)| ≤ c_b β.

• The Dirichlet boundary Γ_D has positive (d−1)-dimensional measure and includes the inflow boundary

  ∪_{0<t≤T} { x ∈ Γ : a(x,t) · n(x) < 0 }.
With these assumptions we can distinguish different regimes:
• dominant diffusion: sup_{0<t≤T, x∈Ω} |a(x,t)| ≤ c_c ε and β ≤ c_β ε with constants c_c, c_β of moderate size;
• dominant reaction: sup_{0<t≤T, x∈Ω} |a(x,t)| ≤ c_c ε and β ≫ ε with a constant c_c of moderate size;
• dominant convection: sup_{0<t≤T, x∈Ω} |a(x,t)| ≫ ε.
II.4.2. Variational formulation. The variational formulation of the above parabolic differential equation is given by:

Find u : (0, T) → H¹_D(Ω) such that

  ∫₀ᵀ ‖∇u(·,t)‖² dt < ∞,
  ∫₀ᵀ ( sup_{v∈H¹_D(Ω)\{0}, ‖∇v‖=1} ∫_Ω ∂_t u(x,t) v(x) dx )² dt < ∞,

  u(·, 0) = u₀

and for almost every t ∈ (0, T) and all v ∈ H¹_D(Ω)

  ∫_Ω ∂_t u v + ∫_Ω ∇u · A∇v + ∫_Ω a · ∇u v + ∫_Ω αuv = ∫_Ω fv + ∫_{Γ_N} gv.

The assumptions of section II.4.1 imply that this problem admits a unique solution.
The error estimation of the following sections is based on the energy norm associated with this variational problem

  |||v||| = { ε‖∇v‖² + β‖v‖² }^{1/2}

and the corresponding dual norm

  |||φ|||_* = sup_{v∈H¹_D(Ω)\{0}} (1/|||v|||) ∫_Ω φv.
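For a one-dimensional analogue, the energy norm can be evaluated exactly for continuous piecewise linear functions; a minimal sketch (illustrative only, not part of the notes):

```python
import numpy as np

def energy_norm(x, v, eps, beta):
    """|||v||| = ( eps*||v'||^2 + beta*||v||^2 )^(1/2) for a continuous
    piecewise linear v on the 1D grid x; elementwise integration is exact."""
    h = np.diff(x)
    dv = np.diff(v) / h
    grad2 = np.sum(h * dv**2)                                  # ||v'||^2
    l2 = np.sum(h * (v[:-1]**2 + v[:-1] * v[1:] + v[1:]**2) / 3.0)
    return np.sqrt(eps * grad2 + beta * l2)

x = np.linspace(0.0, 1.0, 101)
print(energy_norm(x, x, eps=1.0, beta=0.0))    # ~1.0: ||v'|| for v(x) = x
```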
II.4.3. An overview of discretization methods for parabolic equations. Within the finite element framework, there are three main approaches to discretize parabolic equations:
• Method of lines: One chooses a fixed spatial mesh and applies a standard finite element scheme to the spatial part of the differential equation. This gives rise to a large system of ordinary differential equations. The size of this system is given by the number of degrees of freedom of the finite element space; the unknowns are the (now time-dependent) coefficients of the finite element functions. The system of ordinary differential equations is then solved by a standard ODE-solver such as, e.g., the Crank-Nicolson scheme or some Runge-Kutta method.
• Rothe's method: In this approach the order of temporal and spatial discretization is interchanged. The parabolic partial differential equation is interpreted as an ordinary differential equation with respect to time with functions having their temporal values in suitable infinite dimensional function spaces such as, e.g., H¹(Ω). One applies a standard ODE-solver to this system of ordinary differential equations. At each time-step this gives rise to a stationary elliptic partial differential equation. These elliptic equations are then discretized by a standard finite element scheme.
• Space-time finite elements: In this approach space and time are discretized simultaneously.
All three approaches often lead to the same discrete problem. However, they considerably differ in their analysis and, most important in our context, their potential for adaptivity. With respect to the latter, the space-time elements are clearly superior.
II.4.4. Space-time finite elements. In what follows we consider partitions

  I = { [t_{n−1}, t_n] : 1 ≤ n ≤ N_I }

of the time-interval [0, T] into subintervals satisfying

  0 = t₀ < … < t_{N_I} = T.

For every n with 1 ≤ n ≤ N_I we denote by

  I_n = [t_{n−1}, t_n]

the n-th subinterval and by

  τ_n = t_n − t_{n−1}

its length.
Figure II.4.1. Space-time partition
With every intermediate time t_n, 0 ≤ n ≤ N_I, we associate an admissible, affine equivalent, shape regular partition T_n of Ω (cf. figure II.4.1) and a corresponding finite element space X_n. In addition to the conditions of sections I.3.7 (p. 19) and I.3.8 (p. 21) the partitions I and T_n and the spaces X_n must satisfy the following assumptions:
• Non-degeneracy: Every time-interval has a positive length, i.e., τ_n > 0 for all 1 ≤ n ≤ N_I and all I.
• Transition condition: For every n with 1 ≤ n ≤ N_I there is an affine equivalent, admissible, and shape-regular partition T̃_n such that it is a refinement of both T_n and T_{n−1} and such that

  sup_{1≤n≤N_I} sup_{K̃∈T̃_n} sup_{K∈T_n, K̃⊂K} h_K / h_{K̃} < ∞

uniformly with respect to all partitions I which are obtained by adaptive or uniform refinement of any initial partition of [0, T].
• Degree condition: Each X_n consists of continuous functions which are piecewise polynomials, the degrees being at least one and being bounded uniformly with respect to all partitions T_n and I.
The non-degeneracy is an obvious requirement to exclude pathological situations.
The transition condition is due to the simultaneous presence of finite element functions defined on different grids. Usually the partition T_n is obtained from T_{n−1} by a combination of refinement and of coarsening. In this case the transition condition only restricts the coarsening: it must not be too abrupt nor too strong.
The lower bound on the polynomial degrees is needed for the construction of suitable quasi-interpolation operators. The upper bound ensures that the constants in inverse estimates similar to those of section I.3.12 (p. 26) are uniformly bounded.
For every n with 0 ≤ n ≤ N_I we finally denote by π_n the L²-projection of L²(Ω) onto X_n.
II.4.5. Finite element discretization. For the finite element discretization we choose a partition I of [0, T], corresponding partitions T_n of Ω and associated finite element spaces X_n as above and a parameter θ ∈ [1/2, 1]. With the abbreviations

  A_n = A(·, t_n),  a_n = a(·, t_n),  α_n = α(·, t_n),  f_n = f(·, t_n),  g_n = g(·, t_n),

the finite element discretization is then given by:
Find u^n_{T_n} ∈ X_n, 0 ≤ n ≤ N_I, such that

  u⁰_{T_0} = π₀u₀

and, for n = 1, …, N_I,

  ∫_Ω (1/τ_n)(u^n_{T_n} − u^{n−1}_{T_{n−1}}) v_{T_n}
  + ∫_Ω ∇(θu^n_{T_n} + (1−θ)u^{n−1}_{T_{n−1}}) · A_n ∇v_{T_n}
  + ∫_Ω a_n · ∇(θu^n_{T_n} + (1−θ)u^{n−1}_{T_{n−1}}) v_{T_n}
  + ∫_Ω α_n (θu^n_{T_n} + (1−θ)u^{n−1}_{T_{n−1}}) v_{T_n}
  = ∫_Ω (θf_n + (1−θ)f_{n−1}) v_{T_n} + ∫_{Γ_N} (θg_n + (1−θ)g_{n−1}) v_{T_n}

for all v_{T_n} ∈ X_n.

This is the popular A-stable θ-scheme which in particular yields the Crank-Nicolson scheme if θ = 1/2 and the implicit Euler scheme if θ = 1.
The assumptions of section II.4.1 imply that the discrete problem admits a unique solution (u^n_{T_n})_{0≤n≤N_I}. With this sequence we associate the function u_I which is piecewise affine on the time-intervals [t_{n−1}, t_n], 1 ≤ n ≤ N_I, and which equals u^n_{T_n} at time t_n, 0 ≤ n ≤ N_I, i.e.,

  u_I(·, t) = (1/τ_n) { (t_n − t) u^{n−1}_{T_{n−1}} + (t − t_{n−1}) u^n_{T_n} }  on [t_{n−1}, t_n].

Note that

  ∂_t u_I = (1/τ_n) (u^n_{T_n} − u^{n−1}_{T_{n−1}})  on [t_{n−1}, t_n].
Similarly we denote by f_I and g_I the functions which are piecewise constant on the time-intervals and which, on each interval (t_{n−1}, t_n], are equal to the L²-projection of θf_n + (1−θ)f_{n−1} and θg_n + (1−θ)g_{n−1}, respectively, onto the finite element space X_n, i.e.,

  f_I(·, t) = π_n { θf(·, t_n) + (1−θ)f(·, t_{n−1}) },
  g_I(·, t) = π_n { θg(·, t_n) + (1−θ)g(·, t_{n−1}) }  on (t_{n−1}, t_n].
II.4.6. A preliminary residual error estimator. Similarly to elliptic problems we define element residuals by

  R_K = f_I − (1/τ_n)(u^n_{T_n} − u^{n−1}_{T_{n−1}})
        + div(A_n ∇(θu^n_{T_n} + (1−θ)u^{n−1}_{T_{n−1}}))
        − a_n · ∇(θu^n_{T_n} + (1−θ)u^{n−1}_{T_{n−1}})
        − α_n (θu^n_{T_n} + (1−θ)u^{n−1}_{T_{n−1}}),
and edge or face residuals by

  R_E =
    −J_E(n_E · A_n ∇(θu^n_{T_n} + (1−θ)u^{n−1}_{T_{n−1}}))  if E ∈ E_{T̃_n,Ω},
    g_I − n_E · A_n ∇(θu^n_{T_n} + (1−θ)u^{n−1}_{T_{n−1}})   if E ∈ E_{T̃_n,Γ_N},
    0                                                        if E ∈ E_{T̃_n,Γ_D},

and weighting factors by

  α_S = min{ h_S ε^{−1/2}, β^{−1/2} }

for all elements, edges or faces S ∈ T̃_n ∪ E_{T̃_n}. Here we use the convention that β^{−1/2} = ∞ if β = 0.
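The weights α_S and the convention for β = 0 translate directly into code; a tiny sketch:

```python
import math

def alpha_S(h_S, eps, beta):
    """alpha_S = min{ h_S * eps^(-1/2), beta^(-1/2) },
    with beta^(-1/2) read as infinity when beta = 0."""
    cap = math.inf if beta == 0 else beta**-0.5
    return min(h_S * eps**-0.5, cap)

print(alpha_S(0.1, 1e-2, 0.0))   # h_S/sqrt(eps): ~1.0 (diffusion-limited)
print(alpha_S(0.1, 1e-4, 4.0))   # 1/sqrt(beta):  0.5 (reaction-limited)
```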
With these notations a preliminary residual space-time error estimator for the parabolic equation is given by

  η_I = { ‖u₀ − π₀u₀‖² + Σ_{n=1}^{N_I} τ_n [ (η^n_{T̃_n})² + |||u^n_{T_n} − u^{n−1}_{T_{n−1}}|||²
          + |||a_n · ∇(u^n_{T_n} − u^{n−1}_{T_{n−1}})|||²_* ] }^{1/2}

with

  η^n_{T̃_n} = { Σ_{K∈T̃_n} α²_K ‖R_K‖²_K + Σ_{E∈E_{T̃_n}} ε^{−1/2} α_E ‖R_E‖²_E }^{1/2}.
One can prove that η_I yields upper and lower bounds for the error measured in the norm

  { sup_{0≤t≤T} ‖u − u_I‖² + ∫₀ᵀ |||u − u_I|||²
    + ∫₀ᵀ |||∂_t(u − u_I) + a · ∇(u − u_I)|||²_* }^{1/2}.

We call η_I a preliminary error estimator since it is not suited for practical computations due to the presence of the dual norm |||·|||_* which is not computable. To obtain the final computable error estimator we must replace this dual norm by a computable quantity. For achieving this goal we must distinguish two cases:

  small convection: sup_{0≤t≤T} ‖a(·,t)‖_∞ ≤ ε^{1/2} max{ε, β}^{1/2};
  large convection: sup_{0≤t≤T} ‖a(·,t)‖_∞ > ε^{1/2} max{ε, β}^{1/2}.
II.4.7. A residual error estimator for the case of small convection. In this case, we use an inverse estimate to bound the critical term

  |||a_n · ∇(u^n_{T_n} − u^{n−1}_{T_{n−1}})|||_*

by

  |||u^n_{T_n} − u^{n−1}_{T_{n−1}}|||

times a constant of moderate size. We thus obtain the residual error estimator

  η_I = { ‖u₀ − π₀u₀‖² + Σ_{n=1}^{N_I} τ_n [ (η^n_{T̃_n})² + |||u^n_{T_n} − u^{n−1}_{T_{n−1}}|||² ] }^{1/2}

with

  η^n_{T̃_n} = { Σ_{K∈T̃_n} α²_K ‖R_K‖²_K + Σ_{E∈E_{T̃_n}} ε^{−1/2} α_E ‖R_E‖²_E }^{1/2}.

It is easy to compute and yields upper and lower bounds for the error measured in the norm of section II.4.6.
II.4.8. A residual error estimator for the case of large convection. In this case we cannot bound the dual norm by an inverse estimate. If we would do so, we would lose a factor ε^{−1/2} in the error estimates. To avoid this undesirable phenomenon we must invest some additional work. The basic idea is as follows:

  Due to the definition of the dual norm, its contribution equals the energy norm of the weak solution of a suitable stationary reaction-diffusion equation. This solution is approximated by a suitable finite element function. The error of this approximation is estimated by an error estimator for stationary reaction-diffusion equations.

To make these ideas precise, we denote for every integer n between 1 and N_I by

  X̃_n = S^{1,0}_D(T̃_n)

the space of continuous piecewise linear functions corresponding to T̃_n and vanishing on Γ_D and by ũ^n_{T_n} ∈ X̃_n the unique solution of the discrete reaction-diffusion problem

  ∫_Ω ε ∇ũ^n_{T_n} · ∇v_{T_n} + ∫_Ω β ũ^n_{T_n} v_{T_n} = ∫_Ω a_n · ∇(u^n_{T_n} − u^{n−1}_{T_{n−1}}) v_{T_n}

for all v_{T_n} ∈ X̃_n.
Further we define an error estimator η̃^n_{T_n} by

  η̃^n_{T_n} = { Σ_{K∈T̃_n} α²_K ‖a_n · ∇(u^n_{T_n} − u^{n−1}_{T_{n−1}}) + εΔũ^n_{T_n} − βũ^n_{T_n}‖²_K
      + Σ_{E∈E_{T̃_n,Ω}∪E_{T̃_n,Γ_N}} ε^{−1/2} α_E ‖J_E(n_E · ε∇ũ^n_{T_n})‖²_E }^{1/2}.
With these notations the error estimator for the parabolic equation is then given by

  η_I = { ‖u₀ − π₀u₀‖² + Σ_{n=1}^{N_I} τ_n [ (η^n_{T̃_n})² + |||u^n_{T_n} − u^{n−1}_{T_{n−1}}|||²
          + (η̃^n_{T_n})² + |||ũ^n_{T_n}|||² ] }^{1/2}

with

  η^n_{T̃_n} = { Σ_{K∈T̃_n} α²_K ‖R_K‖²_K + Σ_{E∈E_{T̃_n}} ε^{−1/2} α_E ‖R_E‖²_E }^{1/2}.

Compared to the case of small convection we must solve on each time-level an additional discrete problem to compute ũ^n_{T_n}. The computational work associated with these additional problems corresponds to doubling the number of time-steps for the discrete parabolic problem.
II.4.9. Space-time adaptivity. When considering the error estimators η_I of the preceding two sections, the term

  τ_n^{1/2} η^n_{T̃_n}

can be interpreted as a spatial error indicator, whereas the other terms can be interpreted as temporal error indicators. These different contributions can be used to control the adaptive process in space and time.
To make things precise and to simplify the notation we set

  η^n_h = η^n_{T̃_n}

and

  η^n_τ = |||u^n_{T_n} − u^{n−1}_{T_{n−1}}|||

in the case of small convection and

  η^n_τ = { |||u^n_{T_n} − u^{n−1}_{T_{n−1}}|||² + (η̃^n_{T_n})² + |||ũ^n_{T_n}|||² }^{1/2}

in the case of large convection. Thus, η^n_h is our measure for the spatial error and η^n_τ does the corresponding job for the temporal error.
II.4.9.1. Time adaptivity. Assume that we have solved the discrete problem up to time-level n − 1 and that we have computed the error estimators η^{n−1}_h and η^{n−1}_τ. Then we set

  t_n = min{T, t_{n−1} + τ_{n−1}}   if η^{n−1}_τ ≥ (1/2) η^{n−1}_h,
  t_n = min{T, t_{n−1} + 2τ_{n−1}}  if η^{n−1}_τ < (1/2) η^{n−1}_h.

In the first case we retain the previous time-step; in the second case we try a larger time-step.
Next, we solve the discrete problem on time-level n with the current value of t_n and compute the error estimators η^n_h and η^n_τ.
If η^n_τ ≤ η^n_h, we accept the current time-step and continue with the space adaptivity, which is described in the next sub-section.
If η^n_τ ≥ 2η^n_h, we reject the current time-step. We replace t_n by (1/2)(t_{n−1} + t_n) and repeat the solution of the discrete problem on time-level n and the computation of the error estimators.
The described strategy obviously aims at balancing the two contributions η^n_h and η^n_τ of the error estimator.
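The strategy above can be transcribed schematically (function names and indicator values are hypothetical; the intermediate case η^n_h < η^n_τ < 2η^n_h is accepted here):

```python
def propose_timestep(T, t_prev, tau_prev, eta_tau_prev, eta_h_prev):
    """Propose t_n from the level-(n-1) indicators: retain the step if
    eta_tau >= eta_h/2, otherwise try doubling it."""
    if eta_tau_prev >= 0.5 * eta_h_prev:
        return min(T, t_prev + tau_prev)
    return min(T, t_prev + 2.0 * tau_prev)

def accept_or_bisect(t_prev, t_n, eta_tau, eta_h):
    """Reject and halve the step if eta_tau >= 2*eta_h, accept otherwise."""
    if eta_tau >= 2.0 * eta_h:
        return 0.5 * (t_prev + t_n), False
    return t_n, True

t1 = propose_timestep(T=1.0, t_prev=0.5, tau_prev=0.1,
                      eta_tau_prev=0.01, eta_h_prev=0.1)   # doubled step
t2, accepted = accept_or_bisect(0.5, t1, eta_tau=0.5, eta_h=0.1)
```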
II.4.9.2. Space adaptivity. For time-dependent problems the spatial adaptivity must also allow for a local mesh coarsening. Hence, the marking strategies of section III.1.1 (p. 97) must be modified accordingly.
Assume that we have solved the discrete problem on time-level n with an actual time-step τ_n and an actual partition T_n of the spatial domain and that we have computed the estimators η^n_h and η^n_τ. Moreover, suppose that we have accepted the current time-step and want to optimize the partition T_n.
We may assume that T_n currently is the finest partition in a hierarchy T⁰_n, …, T^ℓ_n of nested, successively refined partitions, i.e. T_n = T^ℓ_n and T^j_n is a (local) refinement of T^{j−1}_n, 1 ≤ j ≤ ℓ.
Now, we go back m generations in the grid-hierarchy to the partition T^{ℓ−m}_n. Due to the nestedness of the partitions, each element K ∈ T^{ℓ−m}_n is the union of several elements K′ ∈ T_n. Each K′ gives a contribution to η^n_h. We add these contributions and thus obtain for every K ∈ T^{ℓ−m}_n an error estimator η_K. With these estimators we then perform M steps of one of the marking strategies of section III.1.1 (p. 97). This yields a new partition T^{ℓ−m+M}_n which usually is different from T_n. We replace T_n by this partition, solve the corresponding discrete problem on time-level n and compute the new error estimators η^n_h and η^n_τ.
If the newly calculated error estimators satisfy η^n_h ≤ η^n_τ, we accept the current partition T_n and proceed with the next time-level.
If η^n_h ≥ 2η^n_τ, we successively refine the partition T_n as described in section III.1 (p. 97) with the η^n_h as error estimators until we arrive at a partition which satisfies η^n_h ≤ η^n_τ. When this goal is achieved we accept the spatial discretization and proceed with the next time-level.
Typical values for the parameters m and M are 1 ≤ m ≤ 3 and m ≤ M ≤ m + 2.
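Collecting the fine-grid contributions on the coarser ancestor elements is a simple accumulation; a sketch with a hypothetical element hierarchy:

```python
import math

# Each fine element K' of T_n maps to its ancestor K in the coarser
# partition; the coarse indicator collects the fine contributions as
# eta_K = ( sum_{K' subset K} eta_{K'}^2 )^(1/2).
def aggregate(ancestor_of, eta_fine):
    eta_coarse = {}
    for Kp, eta in eta_fine.items():
        K = ancestor_of[Kp]
        eta_coarse[K] = eta_coarse.get(K, 0.0) + eta**2
    return {K: math.sqrt(v) for K, v in eta_coarse.items()}

ancestor_of = {'K1a': 'K1', 'K1b': 'K1', 'K2a': 'K2'}
print(aggregate(ancestor_of, {'K1a': 3.0, 'K1b': 4.0, 'K2a': 1.0}))
# -> {'K1': 5.0, 'K2': 1.0}
```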
II.4.10. The method of characteristics. The method of characteristics can be interpreted as a modification of space-time finite elements designed for problems with a large convection term. The main idea is to split the discretization of the material derivative, consisting of the time derivative and the convective derivative, from the remaining terms.
To simplify the description of the method of characteristics we assume that we use linear finite elements, have pure Dirichlet boundary conditions, i.e. Γ_D = Γ, and that the convection satisfies the slightly more restrictive condition

  div a = 0  in Ω × (0, T],
  a = 0  on Γ × (0, T].

Since the function a is Lipschitz continuous with respect to the spatial variable and vanishes on the boundary Γ, for every (x*, t*) ∈ Ω × (0, T], standard global existence results for the flows of ordinary differential equations imply that the characteristic equation

  (d/dt) x(t; x*, t*) = a(x(t; x*, t*), t),  t ∈ (0, t*),
  x(t*; x*, t*) = x*

has a unique solution x(·; x*, t*) which exists for all t ∈ [0, t*] and stays within Ω. Hence, we may set U(x*, t) = u(x(t; x*, t*), t). The total derivative d_t U satisfies

  d_t U = ∂_t u + a · ∇u.

Therefore, the parabolic equation can equivalently be written as

  d_t U − div(A∇u) + αu = f  in Ω × (0, T).

The discretization by the method of characteristics relies on a separate treatment of these two equations.
Figure II.4.2. Computation of x^{n−1}_z in the method of characteristics
For every intermediate time t_n, 1 ≤ n ≤ N_I, and every node z ∈ N_{n,Ω} we compute an approximation x^{n−1}_z to x(t_{n−1}; z, t_n) (cf. figure II.4.2) by applying an arbitrary but fixed ODE-solver, such as e.g. the explicit Euler scheme, to the characteristic equation with (x*, t*) = (z, t_n). We assume that the time-step τ_n and the ODE-solver are chosen such that x^{n−1}_z lies within Ω for every n ∈ {1, …, N_I} and every z ∈ N_{n,Ω}. The assumptions on the convection a in particular imply that this condition is satisfied for a single explicit Euler step if τ_n < 1/‖a(·, t_n)‖_{W^{1,∞}(Ω)}. Denote by π_n : L²(Ω) → X_n a suitable quasi-interpolation operator, e.g. the L²-projection. Then the method of characteristics takes the form:
For n = 1, . . . , N
I
successively compute u
n1
T
n
X
n
such
that
u
n1
T
n
(z) =
_
u
n1
T
n1
(x
n1
z
) if z A
n,
,
0 if z A
n,
,
and nd u
n
T
n
X
n
such that
_

n
_
u
n
T
n
u
n1
T
n
_
v
T
n
+
_

u
n
T
n
A
n
v
T
n
+
_

n
u
n
T
n
v
T
n
=
_

f
n
v
T
n
holds for all v
T
n
X
n
.
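A minimal sketch of the foot-point computation with a single explicit Euler step (the velocity field is a hypothetical divergence-free example on the unit square, vanishing on the boundary; it is not from the notes):

```python
import numpy as np

def a(x, t):
    """a = (psi_y, -psi_x) for the stream function
    psi = x1^2 (1-x1)^2 x2^2 (1-x2)^2: divergence-free, zero on the boundary."""
    x1, x2 = x
    return np.array([ x1**2 * (1 - x1)**2 * 2 * x2 * (1 - x2) * (1 - 2 * x2),
                     -2 * x1 * (1 - x1) * (1 - 2 * x1) * x2**2 * (1 - x2)**2])

def foot_point(z, t_n, tau_n):
    # One explicit Euler step backwards along the characteristic through z.
    return z - tau_n * a(z, t_n)

z = np.array([0.25, 0.25])
x_foot = foot_point(z, t_n=1.0, tau_n=0.1)
assert np.all((0 < x_foot) & (x_foot < 1))   # foot point stays inside Omega
```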
II.4. PARABOLIC PROBLEMS 89
II.4.11. Finite volume methods. Finite volume methods are a
dierent popular approach for solving parabolic problems in particular
those with large convection. For this type of discretizations, the theory
of a posteriori error estimation and adaptivity is much less developed
than for nite element methods. But, there is an important particular
case where nite volume methods can easily prot from nite element
techniques. This is the case of so-called dual nite volume meshes.
II.4.11.1. Systems in divergence form. Finite volume methods are
tailored for systems in divergence form where we are looking for a
vector eld U dened on a subset of R
d
having values in R
m
which
satises the dierential equation
M(U)
t
+ div F(U) = g(U, x, t) in (0, )
U(, 0) = U
0
in .
Here, g, the source, is a vector eld on R
m
(0, ) with values
in R
m
, M, the mass, is a vector eld on R
m
with values in R
m
, F the
ux is a matrix valued function on R
m
with values in R
md
and U
0
,
the initial value, is a vector eld on with values in R
m
. The dier-
ential equation of course has to be completed with suitable boundary
conditions. These, however, will be ignored in what follows.
Notice that the divergence has to be taken row-wise:

  div F(U) = ( Σ_{j=1}^d ∂F(U)_{i,j}/∂x_j )_{1≤i≤m}.
The flux F can be split into two contributions:

  F = F_adv + F_visc.

F_adv is called the advective flux and does not contain any derivatives. F_visc is called the viscous flux and contains spatial derivatives. The advective flux models transport or convection phenomena while the viscous flux is responsible for diffusion phenomena.
Example II.4.1. A linear parabolic equation of 2nd order

  ∂_t u − div(A∇u) + a · ∇u + αu = f

is a system in divergence form with

  m = 1,  U = u,  M(U) = u,
  F_adv(U) = au,  F_visc(U) = −A∇u,  g(U) = f − αu + (div a)u.
Example II.4.2. Burgers' equation

  ∂_t u + u ∂u/∂x = 0

is a system in divergence form with

  m = d = 1,  U = u,  M(U) = u,
  F_adv(U) = (1/2)u²,  F_visc(U) = 0,  g(U) = 0.
Other important examples of systems in divergence form are the Euler equations and the Navier-Stokes equations for inviscid and viscous fluids, respectively. Here we have d = 2 or d = 3 and m = d + 2. The vector U consists of the density, the velocity and the internal energy of the fluid.
II.4.11.2. Basic idea of the finite volume method. Choose a time step τ > 0 and a partition T of Ω consisting of arbitrary non-overlapping polyhedra. Here, the elements may have more complicated shapes than in the finite element method (cf. figures II.4.3 (p. 92) and II.4.4 (p. 92)). Moreover, hanging nodes are allowed.
Now we choose an integer n ≥ 1 and an element K ∈ T and keep both fixed in what follows. First we integrate the differential equation on K × [(n−1)τ, nτ]:

  ∫_{(n−1)τ}^{nτ} ∫_K ∂_t M(U) dx dt + ∫_{(n−1)τ}^{nτ} ∫_K div F(U) dx dt
      = ∫_{(n−1)τ}^{nτ} ∫_K g(U, x, t) dx dt.
Next we use integration by parts for the terms on the left-hand side:

  ∫_{(n−1)τ}^{nτ} ∫_K ∂_t M(U) dx dt = ∫_K M(U(x, nτ)) dx − ∫_K M(U(x, (n−1)τ)) dx,
  ∫_{(n−1)τ}^{nτ} ∫_K div F(U) dx dt = ∫_{(n−1)τ}^{nτ} ∫_{∂K} F(U) · n_K dS dt.
For the following steps we assume that U is piecewise constant with respect to space and time. We denote by U^n_K and U^{n−1}_K the value of U on K at times nτ and (n−1)τ, respectively. Then we have

  ∫_K M(U(x, nτ)) dx ≈ |K| M(U^n_K),
  ∫_K M(U(x, (n−1)τ)) dx ≈ |K| M(U^{n−1}_K),
  ∫_{(n−1)τ}^{nτ} ∫_{∂K} F(U) · n_K dS dt ≈ τ ∫_{∂K} F(U^{n−1}_K) · n_K dS,
  ∫_{(n−1)τ}^{nτ} ∫_K g(U, x, t) dx dt ≈ τ |K| g(U^{n−1}_K, x_K, (n−1)τ).

Here, |K| denotes the area of K, if d = 2, or the volume of K, if d = 3, respectively.
II.4. PARABOLIC PROBLEMS 91
In a last step we approximate the boundary integral for the ux by a
numerical ux

_
K
F(U
n1
K
) n
K
d

T
KK

E
[K K

[F
T
(U
n1
K
, U
n1
K

).
All together we obtain the following finite volume method:

For every element K ∈ T compute

  U⁰_K = (1/|K|) ∫_K U₀(x) dx.

For n = 1, 2, … successively compute, for every element K ∈ T,

  M(U^n_K) = M(U^{n−1}_K) − τ Σ_{K′∈T, K∩K′∈E} (|K ∩ K′| / |K|) F_T(U^{n−1}_K, U^{n−1}_{K′})
             + τ g(U^{n−1}_K, x_K, (n−1)τ).

Here, |K ∩ K′| denotes the length, respectively area, of the common boundary of K and K′.
This method may easily be modified as follows:
• The time step may be variable.
• The partition of Ω may change from one time step to the other.
• The approximation U^n_K need not be piecewise constant.
In order to obtain an operating discretization, we still have to make precise the following points:
• construction of T,
• choice of F_T.
Moreover we have to take into account boundary conditions. This item, however, will not be addressed in what follows.
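For the scalar example of 1D linear advection (m = 1, M(U) = U, F_adv(U) = cU, g = 0) with the upwind numerical flux F_T(U_L, U_R) = cU_L for c > 0, the update above becomes a few lines (periodic cells for simplicity; an illustrative sketch, not code from the notes):

```python
import numpy as np

def fv_step(U, c, h, tau):
    flux = c * U                          # upwind flux through right faces
    return U - (tau / h) * (flux - np.roll(flux, 1))

h, tau, c = 1.0 / 100, 0.005, 1.0          # CFL number c*tau/h = 0.5
x = (np.arange(100) + 0.5) * h             # cell midpoints
U = np.where((x > 0.25) & (x < 0.5), 1.0, 0.0)
mass0 = U.sum() * h
for _ in range(10):
    U = fv_step(U, c, h, tau)
assert abs(U.sum() * h - mass0) < 1e-10    # the scheme is conservative
```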
II.4.11.3. Construction of dual finite volume meshes. For constructing the finite volume mesh T, we start from a standard finite element partition T̃ which satisfies the conditions of section I.3.7 (p. 19). Then we subdivide each element K̃ ∈ T̃ into smaller elements by either
• drawing the perpendicular bisectors at the midpoints of the edges of K̃ (cf. Figure II.4.3) or by
• connecting the barycentre of K̃ with its midpoints of edges (cf. Figure II.4.4).
Then the elements in T consist of the unions of all small elements that share a common vertex in the partition T̃.
Figure II.4.3. Dual mesh (red) via perpendicular bisectors of primal mesh (blue)
Figure II.4.4. Dual mesh (red) via barycentres of primal mesh (blue)
Thus the elements in T can be associated with the vertices in N_{T̃}. Moreover, we may associate with each edge or face in E_T exactly two vertices in N_{T̃} such that the line connecting these vertices intersects the given edge or face, respectively.
The first construction has the advantage that this intersection is orthogonal. But this construction also has some disadvantages which are not present with the second construction:
• The perpendicular bisectors of a triangle may intersect in a point outside the triangle. The intersection point is within the triangle only if its largest angle is at most a right one.
• The perpendicular bisectors of a quadrilateral may not intersect at all. They intersect in a common point inside the quadrilateral only if it is a rectangle.
• The first construction has no three-dimensional analogue.
II.4.11.4. Construction of numerical fluxes. For the construction of numerical fluxes we assume that $T$ is a dual mesh corresponding to a primal finite element partition $\widetilde T$. With every edge or face $E$ of $T$ we denote by $K_1$ and $K_2$ the adjacent volumes, by $U_1$ and $U_2$ the values $U^{n-1}_{K_1}$ and $U^{n-1}_{K_2}$, respectively, and by $x_1$, $x_2$ the vertices of $\widetilde T$ such that the segment $\overline{x_1x_2}$ intersects $E$.
As in the analytical case, we split the numerical flux $F_T(U_1, U_2)$ into a viscous numerical flux $F_{T,\mathrm{visc}}(U_1, U_2)$ and an advective numerical flux $F_{T,\mathrm{adv}}(U_1, U_2)$ which are constructed separately.
We first construct the numerical viscous fluxes. To this end we introduce a local coordinate system $\xi_1, \ldots, \xi_d$ such that $\xi_1$ is parallel to $\overline{x_1x_2}$ and such that the remaining coordinates are tangential to $E$ (cf. Figure II.4.5). Next we express all derivatives in $F_{\mathrm{visc}}$ in terms of partial derivatives corresponding to the new coordinates and suppress all derivatives which do not pertain to $\xi_1$. Finally we approximate derivatives corresponding to $\xi_1$ by differences of the form
$$\frac{U_1 - U_2}{|x_1 - x_2|}.$$

[Figure II.4.5: Local coordinate system for the approximation of viscous fluxes]
We now construct the numerical advective fluxes. To this end we denote by
$$C(V) = D\bigl(F_{\mathrm{adv}}(V)\cdot n_{K_1}\bigr) \in \mathbb R^{m\times m}$$
the derivative of $F_{\mathrm{adv}}(V)\cdot n_{K_1}$ with respect to $V$ and suppose that this matrix can be diagonalized, i.e., there are an invertible matrix $Q(V) \in \mathbb R^{m\times m}$ and a diagonal matrix $\Lambda(V) \in \mathbb R^{m\times m}$ such that
$$Q(V)^{-1}\, C(V)\, Q(V) = \Lambda(V).$$
This assumption is, e.g., satisfied for the Euler and Navier-Stokes equations. With any real number $z$ we then associate its positive and negative part
$$z^+ = \max\{z, 0\}, \qquad z^- = \min\{z, 0\}$$
and set
$$\Lambda(V)^\pm = \operatorname{diag}\bigl(\Lambda(V)^\pm_{11}, \ldots, \Lambda(V)^\pm_{mm}\bigr), \qquad C(V)^\pm = Q(V)\,\Lambda(V)^\pm\, Q(V)^{-1}.$$
With these notations the Steger-Warming scheme for the approximation of advective fluxes is given by
$$F_{T,\mathrm{adv}}(U_1, U_2) = C(U_1)^+ U_1 + C(U_2)^- U_2.$$
A better approximation is the van Leer scheme
$$\begin{aligned} F_{T,\mathrm{adv}}(U_1, U_2) = {}&\Bigl( C(U_1) + C\bigl(\tfrac12(U_1+U_2)\bigr)^+ - C\bigl(\tfrac12(U_1+U_2)\bigr)^- \Bigr) U_1 \\ {}+{}&\Bigl( C(U_2) - C\bigl(\tfrac12(U_1+U_2)\bigr)^+ + C\bigl(\tfrac12(U_1+U_2)\bigr)^- \Bigr) U_2. \end{aligned}$$
Both approaches require the computation of $DF_{\mathrm{adv}}(V)\cdot n_{K_1}$ together with its eigenvalues and eigenvectors for suitable values of $V$. In general the van Leer scheme is more costly than the Steger-Warming scheme since it requires three evaluations of $C(V)$ instead of two. For the Euler and Navier-Stokes equations, however, this extra cost can be avoided by profiting from the particular structure $F_{\mathrm{adv}}(V)\cdot n_{K_1} = C(V)\,V$ of these equations.
Example II.4.3. When applied to Burgers' equation of Example II.4.2 (p. 89) the Steger-Warming scheme takes the form
$$F_{T,\mathrm{adv}}(u_1, u_2) = \begin{cases} u_1^2 & \text{if } u_1 \ge 0,\ u_2 \ge 0,\\ u_1^2 + u_2^2 & \text{if } u_1 \ge 0,\ u_2 \le 0,\\ u_2^2 & \text{if } u_1 \le 0,\ u_2 \le 0,\\ 0 & \text{if } u_1 \le 0,\ u_2 \ge 0, \end{cases}$$
while the van Leer scheme reads
$$F_{T,\mathrm{adv}}(u_1, u_2) = \begin{cases} u_1^2 & \text{if } u_1 \ge -u_2,\\ u_2^2 & \text{if } u_1 \le -u_2. \end{cases}$$
II.4.11.5. Relation to finite element methods. The fact that the elements of a dual mesh can be associated with the vertices of a finite element partition gives a link between finite volume and finite element methods:

Consider a function $\varphi$ that is piecewise constant on the dual mesh $T$, i.e. $\varphi \in S^{0,-1}(T)$. With $\varphi$ we associate a continuous piecewise linear function $\Phi \in S^{1,0}(\widetilde T)$ corresponding to the finite element partition $\widetilde T$ such that $\Phi(x_K) = \varphi_K$ for the vertex $x_K \in \mathcal N_{\widetilde T}$ corresponding to $K \in T$.
This link considerably simplifies the analysis of finite volume methods and suggests a very simple and natural approach to a posteriori error estimation and mesh adaptivity for finite volume methods:
- Given the solution of the finite volume scheme, compute the corresponding finite element function.
- Apply a standard a posteriori error estimator to this function.
- Given the error estimator, apply a standard mesh refinement strategy to the finite element mesh $\widetilde T$ and thus construct a new, locally refined partition $\widetilde T'$.
- Use $\widetilde T'$ to construct a new dual mesh $T'$. This is the refinement of $T$.
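The first step of this loop, passing from the piecewise constant finite volume solution on the dual mesh to a continuous piecewise linear finite element function, is a plain lookup once each dual box is keyed by its primal vertex. A hedged sketch (the array layout is an assumption for illustration, not the ALF data structure):

```java
// Each dual-mesh box k is associated with exactly one vertex
// vertexOfBox[k] of the primal mesh; boxValue[k] is the finite volume
// solution on that box.  The returned array holds the nodal values of
// the continuous piecewise linear function fed to the error estimator.
public class DualToPrimal {
    static double[] toVertexValues(double[] boxValue, int[] vertexOfBox, int nVertices) {
        double[] nodal = new double[nVertices];
        for (int k = 0; k < boxValue.length; k++)
            nodal[vertexOfBox[k]] = boxValue[k];
        return nodal;
    }

    public static void main(String[] args) {
        double[] u = { 1.0, 2.0, 3.0 };
        int[] vert = { 2, 0, 1 }; // box k belongs to primal vertex vert[k]
        System.out.println(java.util.Arrays.toString(toVertexValues(u, vert, 3)));
        // prints [2.0, 3.0, 1.0]
    }
}
```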
II.4.11.6. TVD and ENO schemes. The convergence analysis of finite volume methods is based on compactness arguments, in particular the concept of compensated compactness. This requires bounding the total variation of the numerical approximation and avoiding unphysical oscillations, which leads to the concepts of total variation diminishing (TVD) and essentially non-oscillating (ENO) schemes. Corresponding material may be found under the names of Engquist, LeVeque, Osher, Roe, Tadmor, and others.
II.4.12. Discontinuous Galerkin methods. These methods can be interpreted as a mixture of finite element and finite volume methods. The basic idea of discontinuous Galerkin methods can be described as follows:
- Approximate $U$ by discontinuous functions which are polynomials with respect to space and time on small space-time cylinders of the form $K \times [(n-1)\tau, n\tau]$ with $K \in T$.
- For every such cylinder multiply the differential equation by a corresponding test-polynomial and integrate the result over the cylinder.
- Use integration by parts for the flux term.
- Accumulate the contributions of all elements in $T$.
- Compensate for the illegal integration by parts by adding appropriate jump-terms across the element boundaries.
- Stabilize the scheme in a Petrov-Galerkin way by adding suitable element residuals.
In their simplest form these ideas lead to the following discrete problem:

Compute $U^0_T$, the $L^2$-projection of $U_0$ onto $S^{k,-1}(T)$. For $n \ge 1$ successively find $U^n_T \in S^{k,-1}(T)$ such that
$$\begin{aligned} &\sum_{K\in T} \frac1\tau \int_K M(U^n_T)\cdot V_T - \sum_{K\in T} \int_K F(U^n_T) : \nabla V_T \\ &\qquad + \sum_{E\in\mathcal E} \frac{\alpha_E}{h_E} \int_E \bigl[n_E\cdot F(U^n_T)\bigr]\cdot\bigl[V_T\bigr]_E + \sum_{K\in T} \delta_K h_K^2 \int_K \operatorname{div} F(U^n_T)\cdot \operatorname{div} F(V_T) \\ &\quad = \sum_{K\in T} \frac1\tau \int_K M(U^{n-1}_T)\cdot V_T + \sum_{K\in T} \int_K g(\cdot, n\tau)\cdot V_T + \sum_{K\in T} \delta_K h_K^2 \int_K g(\cdot, n\tau)\cdot \operatorname{div} F(V_T) \end{aligned}$$
holds for all $V_T \in S^{k,-1}(T)$.
This discretization can easily be generalized as follows:
- The jump and stabilization terms can be chosen more judiciously.
- The time-step may not be constant.
- The spatial mesh may depend on time.
- The functions $U_T$ and $V_T$ may be piecewise polynomials of higher order with respect to time. Then the term
$$\sum_{K\in T} \int_{(n-1)\tau}^{n\tau} \int_K \frac{\partial M(U_T)}{\partial t}\cdot V_T$$
must be added on the left-hand side and terms of the form $\frac{\partial M(U_T)}{\partial t}\cdot V_T$ must be added to the element residuals.
CHAPTER III

Implementation

III.1. Mesh-refinement techniques
For step (4) of the adaptive algorithm I.1.1 (p. 7) we must provide a device that constructs the next mesh $T_{k+1}$ from the current mesh $T_k$, disposing of an error estimate $\eta_K$ for every element $K \in T_k$. This requires two key ingredients:
- a marking strategy that decides which elements should be refined, and
- refinement rules which determine the actual subdivision of a single element.
Since we want to ensure the admissibility of the next partition, we have to avoid hanging nodes (cf. Figure III.1.1). Therefore, the refinement process will proceed in two stages:
- In the first stage we determine a subset $\widetilde T_k$ of $T_k$ consisting of all those elements that must be refined due to a too large value of $\eta_K$. The refinement of these elements usually is called regular.
- In the second stage additional elements are refined in order to eliminate the hanging nodes which may be created during the first stage. The refinement of these additional elements is sometimes referred to as irregular.

[Figure III.1.1: Hanging node]
III.1.1. Marking strategies. There are two popular marking strategies for determining the set $\widetilde T_k$: the maximum strategy and the equilibration strategy.
Algorithm III.1.1. (Maximum strategy)
(0) Given: a partition $T$, error estimates $\eta_K$ for the elements $K \in T$, and a threshold $\theta \in (0, 1)$.
Sought: a subset $\widetilde T$ of marked elements that should be refined.
(1) Compute $\eta_{T,\max} = \max_{K\in T} \eta_K$.
(2) If $\eta_K \ge \theta\,\eta_{T,\max}$, mark $K$ for refinement and put it into the set $\widetilde T$.
Algorithm III.1.2. (Equilibration strategy)
(0) Given: a partition $T$, error estimates $\eta_K$ for the elements $K \in T$, and a threshold $\theta \in (0, 1)$.
Sought: a subset $\widetilde T$ of marked elements that should be refined.
(1) Compute $\Theta_T = \sum_{K\in T} \eta_K^2$. Set $\Sigma_T = 0$ and $\widetilde T = \emptyset$.
(2) If $\Sigma_T \ge (1-\theta)\,\Theta_T$, return $\widetilde T$; stop. Otherwise go to step (3).
(3) Compute $\widetilde\eta_{T,\max} = \max_{K\in T\setminus\widetilde T} \eta_K$.
(4) For all elements $K \in T\setminus\widetilde T$ check whether $\eta_K = \widetilde\eta_{T,\max}$. If this is the case, put $K$ into $\widetilde T$ and add $\eta_K^2$ to $\Sigma_T$. Otherwise skip $K$. When all elements have been treated, return to step (2).
At the end of this algorithm the set $\widetilde T$ satisfies
$$\sum_{K\in\widetilde T} \eta_K^2 \ge (1-\theta) \sum_{K\in T} \eta_K^2.$$
Both marking strategies yield comparable results. The maximum strategy obviously is cheaper than the equilibration strategy. In both strategies, a large value of $\theta$ leads to small sets $\widetilde T$, i.e., very few elements are marked, while a small value of $\theta$ leads to large sets $\widetilde T$, i.e., nearly all elements are marked. A popular and well-established choice is $\theta \approx 0.5$.
In many applications one encounters the difficulty that very few elements have an extremely large estimated error, whereas the remaining ones split into the vast majority with an extremely small estimated error and a third group of medium size consisting of elements whose estimated error is much smaller than that of the first group and much larger than that of the second group. In this situation Algorithms III.1.1 and III.1.2 will only refine the elements of the first group. This deteriorates the performance of the adaptive algorithm. It can substantially be enhanced by a simple modification:

Given a small percentage $\varepsilon$, we first mark the $\varepsilon\%$ elements with largest estimated error for refinement and then apply Algorithms III.1.1 and III.1.2 only to the remaining elements.
The following two Java methods realize Algorithms III.1.1 and III.1.2 with this modification.
// mark elements of grid lv according to error estimate,
// modified maximum strategy
public void markModMaxEstimate( double[] eta, double threshold,
double excess ) {
int nel = l[lv].last - l[lv].first;
double maxError = max( eta, l[lv].first, l[lv].last );
// maximal estimated error
boolean iter = true;
while ( iter ) {
int discard = 0;
double newmax = 0.0;
for ( int i = l[lv].first; i < l[lv].last; i++ ) {
if ( eta[i] >= maxError )
discard++;
else
newmax = Math.max( newmax, eta[i] );
}
if ( discard < (int)( nel*excess ) )
maxError = newmax;
else
iter = false;
}
maxError *= threshold;
for ( int i = l[lv].first; i < l[lv].last; i++ ) // for all elements
if ( eta[i] >= maxError )
e[i].t = 5;
} // end of markModMaxEstimate
// mark elements of grid lv according to error estimate,
// equilibration strategy
public void markEquilibratedEstimate( double[] eta,
double threshold, double excess ) {
int nel = l[lv].last - l[lv].first;
double maxError = max( eta, l[lv].first, l[lv].last );
// maximal estimated error
double totalError = innerProduct( eta, eta, l[lv].first, l[lv].last );
boolean iter = true;
int discard = 0;
while ( iter ) {
double newmax = 0.0;
for ( int i = l[lv].first; i < l[lv].last; i++ ) {
if( e[i].t != 5 ) { // only non-marked elements
if ( eta[i] >= maxError ) {
e[i].t = 5;
discard++;
totalError -= eta[i]*eta[i];
}
else
newmax = Math.max( newmax, eta[i] );
}
}
if ( discard < (int)(excess*nel) )
maxError = newmax;
else
iter = false;
}
iter = true;
double sum = 0.0;
while ( iter ) {
double newmax = 0.0;
for ( int i = l[lv].first; i < l[lv].last; i++ ) {
if( e[i].t != 5 ) { // only non-marked elements
if ( eta[i] >= maxError ) {
e[i].t = 5;
sum += eta[i]*eta[i];
}
else
newmax = Math.max( newmax, eta[i] );
}
}
if ( sum < threshold*totalError )
maxError = newmax;
else
iter = false;
}
} // end of markEquilibratedEstimate
III.1.2. Regular refinement. Elements that are marked for refinement often are refined by connecting the midpoints of their edges. The resulting elements are called red.
Triangles and quadrilaterals are thus subdivided into four smaller triangles and quadrilaterals that are similar to the parent element and have the same angles. Thus the shape parameter $h_K/\rho_K$ of the elements does not change.
This refinement is illustrated by the top-left triangle of Figure III.1.2 and by the top square of Figure III.1.3. The numbers outside the elements indicate the local enumeration of edges and vertices of the parent element. The numbers inside the elements close to the vertices indicate the local enumeration of the vertices of the child elements. The numbers +0, +1 etc. inside the elements give the enumeration of the children.
Note that the enumeration of new elements and new vertices is chosen in such a way that triangles and quadrilaterals may be treated simultaneously without case selections.
Parallelepipeds are also subdivided into eight smaller similar parallelepipeds by joining the midpoints of edges.
[Figure III.1.2: Refinement of triangles]
For tetrahedra the situation is more complicated. Joining the midpoints of edges introduces four smaller similar tetrahedra at the vertices of the parent tetrahedron plus a double pyramid in its interior. The latter is subdivided into four small tetrahedra by cutting it along two orthogonal planes. These tetrahedra, however, are not similar to the parent tetrahedron. But there are rules which determine the cutting planes such that a repeated refinement according to these rules leads to at most four similarity classes of elements originating from a parent element. Thus these rules guarantee that the shape parameter of the partition does not deteriorate during a repeated adaptive refinement procedure.
III.1.3. Additional refinement. Since not all elements are refined regularly, we need additional refinement rules in order to avoid hanging nodes (cf. Figure III.1.1) and to ensure the admissibility of the refined partition. These rules are illustrated in Figures III.1.2 and III.1.3.
For abbreviation we call the resulting elements green, blue, and purple. They are obtained as follows:
- a green element by bisecting exactly one edge,
- a blue element by bisecting exactly two edges,
- a purple quadrilateral by bisecting exactly three edges.
In order to avoid too acute or too obtuse triangles, the blue and green refinements of triangles obey the following two rules:
- In a blue refinement of a triangle, the longest one of the refinement edges is bisected first.
- Before performing a green refinement of a triangle it is checked whether the refinement edge is part of an edge which has been bisected during the last $n_g$ generations. If this is the case, a blue refinement is performed instead.

[Figure III.1.3: Refinement of quadrilaterals]
The second rule is illustrated in Figure III.1.4. The cross in the left part represents a hanging node which should be eliminated by a green refinement. The right part shows the blue refinement which is performed instead. Here the cross represents the new hanging node which is created by the blue refinement. Numerical experiments indicate that the optimal value of $n_g$ is 1. Larger values result in an excessive blow-up of the refinement zone.
[Figure III.1.4: Forbidden green refinement and substituting blue refinement]
III.1.4. Marked edge bisection. The marked edge bisection is an alternative to the regular red refinement described above which does not require additional refinement rules for avoiding hanging nodes. It is performed according to the following rules:
- The coarsest mesh $T_0$ is constructed such that the longest edge of any element is also the longest edge of the adjacent element unless it is a boundary edge.
- The longest edges of the elements in $T_0$ are marked.
- Given a partition $T_k$ and an element thereof which should be refined, it is bisected by joining the midpoint of its marked edge with the vertex opposite to this edge.
- When bisecting the edge of an element, its two remaining edges become the marked edges of the two resulting new triangles.
This process is illustrated in Figure III.1.5; the marked edges are labeled by a bullet.

[Figure III.1.5: Subsequent marked edge bisection; the marked edges are labeled by a bullet]
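The bisection rule is purely local: split the marked edge at its midpoint, join the midpoint to the opposite vertex, and hand the two remaining edges over as the marked edges of the children. A sketch under the assumption that a triangle is stored as an index triple whose first two entries span the marked edge (this layout is illustrative, not the ALF representation):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Arrays;

// Marked-edge bisection sketch: a triangle {a, b, c} has marked edge
// (a, b).  midpointOf caches edge midpoints so that both triangles
// sharing an edge receive the same new vertex and no hanging node
// survives a matching bisection of the neighbour.
public class MarkedBisection {
    static Map<Long, Integer> midpoints = new HashMap<>();
    static int nextVertex = 4; // vertices 0..3 assumed to exist already

    static int midpointOf(int a, int b) {
        long key = ((long) Math.min(a, b) << 32) | Math.max(a, b);
        return midpoints.computeIfAbsent(key, k -> nextVertex++);
    }

    // The two children; each inherits one remaining parent edge as its
    // new marked edge, exactly as required by the rules above.
    static int[][] bisect(int[] t) {
        int m = midpointOf(t[0], t[1]);
        return new int[][] { { t[2], t[0], m }, { t[2], t[1], m } };
    }

    public static void main(String[] args) {
        int[][] children = bisect(new int[] { 1, 2, 0 }); // marked edge (1,2)
        System.out.println(Arrays.deepToString(children)); // [[0, 1, 4], [0, 2, 4]]
    }
}
```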
III.1.5. Mesh-coarsening. The adaptive algorithm I.1.1 (p. 7) in combination with the marking strategies of Algorithms III.1.1 (p. 97) and III.1.2 (p. 98) produces a sequence of increasingly refined partitions. In many situations, however, some elements must be coarsened in the course of the adaptive process. For time-dependent problems this is obvious: a critical region, e.g. an interior layer, may move through the spatial domain in the course of time. For stationary problems this is less obvious. But for elliptic problems one can prove that a possible coarsening is mandatory to ensure the optimal complexity of the adaptive process.
The basic idea of the coarsening process is to go back in the hierarchy of partitions and to cluster elements with too small an error. The following algorithm goes m generations backwards, accumulates the error indicators, and then advances n > m generations using the marking strategies of Algorithms III.1.1 (p. 97) and III.1.2 (p. 98). For stationary problems, typical values are m = 1 and n = 2. For time-dependent problems one may choose m > 1 and n > m + 1 to enhance the temporal movement of the refinement zone.
Algorithm III.1.3. Given: a hierarchy $T_0, \ldots, T_k$ of adaptively refined partitions, error indicators $\eta_K$ for the elements $K$ of $T_k$, and parameters $1 \le m \le k$ and $n > m$.
Sought: a new partition $T_{k-m+n}$.
(1) For every element $K \in T_{k-m}$ set $\widetilde\eta_K = 0$.
(2) For every element $K \in T_k$ determine its ancestor $K' \in T_{k-m}$ and add $\eta_K^2$ to $\widetilde\eta_{K'}^2$.
(3) Successively apply Algorithm III.1.1 (p. 97) or III.1.2 (p. 98) $n$ times with $\widetilde\eta$ as error indicator. In this process, equally distribute $\widetilde\eta_K$ over the siblings of $K$ once an element $K$ is subdivided.
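Step (2) of the algorithm, accumulating the squared fine-grid indicators onto their ancestors m generations up, is a single pass over the fine elements once parent links (the member p of the ELEMENT class of Section III.2) are available. A sketch with an illustrative array layout:

```java
// Accumulate eta_K^2 of every fine element onto its ancestor m
// generations up the parent links, as in step (2) of Algorithm III.1.3.
// fine[idx] is the global number of the idx-th element of T_k,
// parent[j] the global number of the parent of element j; the result
// holds the accumulated squared indicators, indexed by global number.
public class CoarsenAccumulate {
    static double[] accumulate(double[] eta, int[] fine, int[] parent, int m, int nElements) {
        double[] eta2 = new double[nElements];
        for (int idx = 0; idx < fine.length; idx++) {
            int a = fine[idx];
            for (int g = 0; g < m; g++) a = parent[a]; // climb m generations
            eta2[a] += eta[idx] * eta[idx];
        }
        return eta2;
    }

    public static void main(String[] args) {
        // elements 0,1 coarse; 2,3 children of 0 and 4,5 children of 1 (m = 1)
        int[] parent = { -1, -1, 0, 0, 1, 1 };
        int[] fine = { 2, 3, 4, 5 };
        double[] eta = { 3.0, 4.0, 1.0, 2.0 };
        System.out.println(java.util.Arrays.toString(
            accumulate(eta, fine, parent, 1, 6))); // [25.0, 5.0, 0.0, 0.0, 0.0, 0.0]
    }
}
```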
[Figure III.1.6: The vertex marked by a bullet is resolvable in the left patch but not in the right one]
The following algorithm is particularly suited for the marked edge bisection of Section III.1.4. In the framework of Algorithm III.1.3 its parameters are $m = 1$ and $n = 2$, i.e., it constructs the partition of the next level by simultaneously refining and coarsening elements of the current partition. For its description we need some notation:
- An element $K$ of the current partition $T$ has refinement level $\ell$ if it is obtained by subdividing $\ell$ times an element of the coarsest partition.
- Given a triangle $K$ of the current partition $T$ which is obtained by bisecting a parent triangle $K'$, the vertex of $K$ which is not a vertex of $K'$ is called the refinement vertex of $K$.
- A vertex $z \in \mathcal N$ of the current partition $T$ and the corresponding patch $\omega_z$ are called resolvable (cf. Figure III.1.6) if $z$ is the refinement vertex of all elements contained in $\omega_z$ and all elements contained in $\omega_z$ have the same refinement level.
Algorithm III.1.4. Given: a partition $T$, error indicators $\eta_K$ for all elements $K$ of $T$, and parameters $0 < \theta_1 < \theta_2 < 1$.
Sought: subsets $T_c$ and $T_r$ of elements that should be coarsened and refined, respectively.
(1) Set $T_c = \emptyset$, $T_r = \emptyset$ and compute $\eta_{T,\max} = \max_{K\in T} \eta_K$.
(2) For all $K \in T$ check whether $\eta_K \ge \theta_2\,\eta_{T,\max}$. If this is the case, put $K$ into $T_r$.
(3) For all vertices $z \in \mathcal N$ check whether $z$ is resolvable. If this is the case and if $\max_{K\subset\omega_z} \eta_K \le \theta_1\,\eta_{T,\max}$, put all elements contained in $\omega_z$ into $T_c$.
Remark III.1.5. Algorithm III.1.4 obviously is a modification of the maximum strategy of Algorithm III.1.1 (p. 97). A coarsening of elements can also be incorporated in the equilibration strategy of Algorithm III.1.2 (p. 98).
III.1.6. Mesh-smoothing. In this section we describe mesh-smoothing strategies which try to improve the quality of a partition while retaining its topological structure: the vertices of the partition are moved, but the number of elements and their adjacency remain unchanged. All strategies use a process similar to the well-known Gauss-Seidel algorithm to optimize a suitable quality function $q$ over the class of all partitions having the same topological structure; they differ in the choice of the quality function. The strategies of this section do not replace the mesh refinement methods of the previous sections, they complement them. In particular, an improved partition may thus be obtained when a further refinement is impossible due to an exhausted storage.
In order to simplify the presentation, we assume throughout this section that all partitions exclusively consist of triangles.
III.1.6.1. The optimization process. We first describe the optimization process. To this end we assume that we dispose of a quality function $q$ which associates with every element a non-negative number such that a larger value of $q$ indicates a better quality. Given a partition $T$ we want to find an improved partition $\widetilde T$ with the same number of elements and the same adjacency such that
$$\min_{\widetilde K\in\widetilde T} q(\widetilde K) > \min_{K\in T} q(K).$$
To this end we perform several iterations of the following smoothing procedure similar to the Gauss-Seidel iteration:

For every vertex $z$ in the current partition $T$, fix the vertices of $\partial\omega_z$ and find a new vertex $\widetilde z$ inside $\omega_z$ such that
$$\min_{\widetilde K\subset\widetilde\omega_z} q(\widetilde K) > \min_{K\subset\omega_z} q(K).$$

The practical solution of the local optimization problem depends on the choice of the quality function $q$. In what follows we present three possible choices for $q$.
III.1.6.2. A quality function based on geometrical criteria. The first choice is purely based on the geometry of the partitions and tries to obtain a partition which consists of equilateral triangles. To describe this approach, we enumerate the vertices and edges of a given triangle consecutively in counter-clockwise order from 0 to 2 such that edge $i$ is opposite to vertex $i$ (cf. Figures III.1.2 (p. 101) and III.1.3 (p. 102)). Then edge $i$ has the vertices $i+1$ and $i+2$ as its endpoints, where all expressions have to be taken modulo 3. With these notations we define the geometric quality function $q_G$ by
$$q_G(K) = \frac{4\sqrt3\,\mu_2(K)}{\mu_1(E_0)^2 + \mu_1(E_1)^2 + \mu_1(E_2)^2}.$$
Here $\mu_2(K)$ denotes the area of $K$ and $\mu_1(E_i)$ the length of edge $E_i$. The function $q_G$ is normalized such that it attains its maximal value 1 for an equilateral triangle.
To obtain a more explicit representation of $q_G$ and to solve the optimization problem, we denote by $x_0 = (x_{0,1}, x_{0,2})$, $x_1 = (x_{1,1}, x_{1,2})$, and $x_2 = (x_{2,1}, x_{2,2})$ the co-ordinates of the vertices. Then we have
$$\mu_2(K) = \frac12\bigl[(x_{1,1}-x_{0,1})(x_{2,2}-x_{0,2}) - (x_{2,1}-x_{0,1})(x_{1,2}-x_{0,2})\bigr]$$
and
$$\mu_1(E_i)^2 = (x_{i+2,1}-x_{i+1,1})^2 + (x_{i+2,2}-x_{i+1,2})^2$$
for $i = 0, 1, 2$. There are two main possibilities to solve the optimization problem for $q_G$.
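The explicit formulas above translate directly into code. A sketch evaluating $q_G$ from the vertex co-ordinates (names are illustrative):

```java
// Geometric quality q_G(K) = 4*sqrt(3)*mu_2(K) / sum of squared edge
// lengths; it equals 1 for an equilateral triangle and approaches 0
// for degenerate ones.
public class TriangleQuality {
    static double qG(double[] x0, double[] x1, double[] x2) {
        double area = 0.5 * ((x1[0] - x0[0]) * (x2[1] - x0[1])
                           - (x2[0] - x0[0]) * (x1[1] - x0[1]));
        double sum = sq(x1, x2) + sq(x2, x0) + sq(x0, x1);
        return 4.0 * Math.sqrt(3.0) * area / sum;
    }

    static double sq(double[] a, double[] b) { // squared edge length
        double dx = a[0] - b[0], dy = a[1] - b[1];
        return dx * dx + dy * dy;
    }

    public static void main(String[] args) {
        double[] p0 = { 0.0, 0.0 }, p1 = { 1.0, 0.0 };
        double[] eq = { 0.5, Math.sqrt(3.0) / 2.0 }; // equilateral: q_G = 1
        double[] rt = { 0.0, 1.0 };                  // right isosceles: q_G = sqrt(3)/2
        System.out.println(qG(p0, p1, eq));
        System.out.println(qG(p0, p1, rt));
    }
}
```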
In the first approach, we determine a triangle $K_1$ in $\omega_z$ such that
$$q_G(K_1) = \min_{K\subset\omega_z} q_G(K)$$
and start the enumeration of its vertices at the vertex $z$. Then we determine a point $z^*$ such that the points $z^*$, $x_1$, and $x_2$ are the vertices of an equilateral triangle and this enumeration of vertices is in counter-clockwise order. Now we try to find a point $\widetilde z$ approximately solving the optimization problem by a line search on the straight line segment connecting $z$ and $z^*$.
In the second approach, we determine two triangles $K_1$ and $K_2$ in $\omega_z$ such that
$$q_G(K_1) = \min_{K\subset\omega_z} q_G(K) \quad\text{and}\quad q_G(K_2) = \min_{K\subset\omega_z\setminus\{K_1\}} q_G(K).$$
Then we determine the unique point $z^*$ such that the two triangles corresponding to $K_1$ and $K_2$ with $z$ replaced by $z^*$ have equal qualities $q_G$. This point can be computed explicitly from the co-ordinates of $K_1$ and $K_2$, which remain unchanged. If $z^*$ is within $\omega_z$, it is the optimal solution $\widetilde z$ of the optimization problem. Otherwise we again try to find $\widetilde z$ by a line search on the straight line segment connecting $z$ and $z^*$.
III.1.6.3. A quality function based on interpolation. Our second candidate for a quality function is given by
$$q_I(K) = \|\nabla(u_Q - u_L)\|_K^2,$$
where $u_Q$ and $u_L$ denote the quadratic and linear interpolant of $u$, respectively. Using the functions $\psi_E$ of Section I.3.12 (p. 26) we have
$$u_Q - u_L = \sum_{i=0}^{2} d_i\,\psi_{E_i}$$
with
$$d_i = u\bigl(\tfrac12(x_{i+1}+x_{i+2})\bigr) - \tfrac12 u(x_{i+1}) - \tfrac12 u(x_{i+2})$$
for $i = 0, 1, 2$, where again all indices have to be taken modulo 3. Hence,
we have
$$q_I(K) = v^t B v \quad\text{with}\quad v = \begin{pmatrix} d_0 \\ d_1 \\ d_2 \end{pmatrix} \quad\text{and}\quad B_{ij} = \int_K \nabla\psi_{E_i}\cdot\nabla\psi_{E_j}$$
for $i, j = 0, 1, 2$. A straightforward calculation yields
$$B_{ii} = \frac{\mu_1(E_0)^2 + \mu_1(E_1)^2 + \mu_1(E_2)^2}{3\,\mu_2(K)} = \frac{4}{\sqrt3}\,\frac{1}{q_G(K)}$$
for all $i$ and
$$B_{ij} = \frac{2\,(x_{i+2}-x_{i+1})\cdot(x_{j+2}-x_{j+1})}{3\,\mu_2(K)}$$
for $i \ne j$. Since $B$ is spectrally equivalent to its diagonal, we approximate $q_I(K)$ by
$$\widetilde q_I(K) = \frac{1}{q_G(K)} \sum_{i=0}^{2} d_i^2.$$
To obtain an explicit representation of this approximation in terms of the geometrical data of $K$, we assume that the second derivatives of $u$ are constant on
$K$. Denoting by $H_K$ the Hessian matrix of $u$ on $K$, Taylor's formula then yields
$$d_i = -\frac18\,(x_{i+2}-x_{i+1})^t\, H_K\, (x_{i+2}-x_{i+1})$$
for $i = 0, 1, 2$. Hence, with this assumption, the approximate quality function is a rational function with quadratic polynomials in the numerator and denominator. The optimization problem can therefore be solved approximately with a few steps of a damped Newton iteration. Alternatively we may adopt our previous geometrical reasoning with $q_G$ replaced by $q_I$.
III.1.6.4. A quality function based on an error indicator. The third choice of a quality function is given by
$$q_E(K) = \int_K \Bigl| \sum_{i=0}^{2} e_i\,\nabla\psi_{E_i} \Bigr|^2,$$
where the coefficients $e_0$, $e_1$ and $e_2$ are computed from an error indicator $\eta_K$. Once we dispose of these coefficients, the optimization problem for the function $q_E$ may be solved in the same way as for $q_I$.
The computation of the coefficients $e_0$, $e_1$ and $e_2$ is particularly simple for the error indicator $\eta_{N,K}$ of Section II.2.1.3 (p. 44) which is based on the solution of local Neumann problems on the elements. Denoting by $v_K$ the solution of the auxiliary problem, we compute the $e_i$ by solving the least-squares problem
$$\text{minimize}\quad \int_K \Bigl| \nabla v_K - \sum_{i=0}^{2} e_i\,\nabla\psi_{E_i} \Bigr|^2.$$
For the error indicator $\eta_{D,K}$ of Section II.2.1.2 (p. 43) which is based on the solution of an auxiliary discrete Dirichlet problem on the patch $\omega_K$, we may proceed in a similar way and simply replace $v_K$ by the restriction to $K$ of the solution of the auxiliary problem.
For the residual error indicator $\eta_{R,K}$ of Section II.1 (p. 30), finally, we replace the function $\nabla v_K$ by
$$h_K\bigl(f_K + \Delta u_T\bigr)\,\psi_K - \sum_{i=0}^{2} \tfrac12\, h_{E_i}\, J_{E_i}\bigl(n_{E_i}\cdot\nabla u_T\bigr)\,\psi_{E_i}$$
with the obvious modifications for edges on the boundary $\Gamma$.
III.2. Data structures

In this section we briefly describe the data structures required for a Java or C++ implementation of an adaptive finite element algorithm. For simplicity we consider only the two-dimensional case. Note that the data structures are independent of the particular differential equation and apply to all engineering problems which require the approximate solution of partial differential equations. The described data structures are realized in the Java applet ALF (Adaptive Linear Finite elements). It is available at the address
www.rub.de/num1/files/ALFApplet
together with a user guide in pdf-format
www.rub.de/num1/files/ALFUserguide.pdf
III.2.1. Nodes. The class NODE realizes the concept of a node, i.e., of a vertex of a grid. It has three members c, t, and d.
The member c stores the co-ordinates in Euclidean 2-space. It is a double array of length 2.
The member t stores the type of the node. It equals 0 if the node is an interior point of the computational domain. It equals k, k > 0, if the node belongs to the k-th component of the Dirichlet boundary part of the computational domain. It equals -k, k > 0, if the node is on the k-th component of the Neumann boundary.
The member d gives the address of the corresponding degree of freedom. It equals -1 if the corresponding node is not a degree of freedom since, e.g., it lies on the Dirichlet boundary. This member takes into account that not every node actually is a degree of freedom.
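A minimal Java rendering of this description might look as follows (a sketch reconstructed from the text above, not the actual ALF source):

```java
// Sketch of the NODE concept: co-ordinates, boundary type, and
// degree-of-freedom address as described in the text.
public class Node {
    double[] c = new double[2]; // co-ordinates in Euclidean 2-space
    int t; // 0: interior; k > 0: k-th Dirichlet component; -k: k-th Neumann component
    int d; // address of the degree of freedom, -1 if the node has none

    Node(double x, double y, int type, int dof) {
        c[0] = x; c[1] = y; t = type; d = dof;
    }

    boolean isDegreeOfFreedom() { return d >= 0; }
}
```

An interior node carrying degree of freedom 7 would then be created as new Node(0.5, 0.5, 0, 7), while a node on the first Dirichlet component would get t = 1 and d = -1.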
III.2.2. Elements. The class ELEMENT realizes the concept of an element. Its member nv determines the element type, i.e., triangle or quadrilateral. Its members v and e realize the vertex and edge information, respectively. Both are integer arrays of length 4.
The vertices are enumerated consecutively in counter-clockwise order; v[i] gives the global number of the i-th vertex. It is assumed that v[3] = -1 if nv = 3.
The edges are also enumerated consecutively in counter-clockwise order such that the i-th edge has the vertices i+1 mod nv and i+2 mod nv as its endpoints. Thus, in a triangle, edge i is opposite vertex i.
A value e[i] = -1 indicates that the corresponding edge is on a straight part of the boundary. Similarly e[i] = -k-2, k >= 0, indicates that the endpoints of the corresponding edge are on the k-th curved part of the boundary. A value e[i] = j >= 0 indicates that edge i of the current element is adjacent to element number j. Thus the member e describes the neighbourhood relation of elements.
The members p, c, and t realize the grid hierarchy and give the number of the parent, the number of the first child, and the refinement type, respectively. In particular we have

t = 0 if the element is not refined,
t in {1, ..., 4} if the element is refined green,
t = 5 if the element is refined red,
t in {6, ..., 24} if the element is refined blue,
t in {25, ..., 100} if the element is refined purple.
At first sight it may seem strange to keep the information about nodes and elements in different classes. But this approach has several advantages:
- It minimizes the storage requirement. The co-ordinates of a node must be stored only once. If nodes and elements are represented by a common structure, these co-ordinates are stored 4 to 6 times.
- The elements represent the topology of the grid, which is independent of the particular position of the nodes. If nodes and elements are represented by different structures it is much easier to implement mesh-smoothing algorithms which affect the position of the nodes but do not change the mesh topology.
III.2.3. Grid hierarchy. When creating a hierarchy of adaptively refined grids, the nodes are completely hierarchical, i.e., a node of grid $T_i$ is also a node of any grid $T_j$ with $j > i$. Since in general the grids are only partly refined, the elements are not completely hierarchical. Therefore, all elements of all grids are stored.
The information about the different grids is implemented by the class LEVEL. Its members nn, nt, nq, and ne give the number of nodes, triangles, quadrilaterals, and edges, respectively, of a given grid. The members first and last give the addresses of the first element of the current grid and of the first element of the next grid, respectively. The member dof yields the number of degrees of freedom of the corresponding discrete finite element problems.
III.3. Numerical examples

The examples of this section are computed with the demonstration Java applet ALF on a Macintosh G4 PowerBook. The linear systems are solved with a multi-grid V-cycle algorithm in Examples III.3.1 to III.3.3 and a multi-grid W-cycle algorithm in Example III.3.4. In Examples III.3.1 and III.3.2 we use one forward Gauss-Seidel sweep for pre-smoothing and one backward Gauss-Seidel sweep for post-smoothing. In Example III.3.3 the smoother is one symmetric Gauss-Seidel sweep with a downwind re-enumeration of the unknowns. In Example III.3.4 we use two steps of a symmetric Gauss-Seidel algorithm for pre- and post-smoothing. Tables III.3.1 to III.3.4 give for all examples the following quantities:
- L: the number of refinement levels,
- NN: the number of unknowns,
- NT: the number of triangles,
- NQ: the number of quadrilaterals,
- $\varepsilon$: the true relative error $\|u - u_T\|_1 / \|u\|_1$ if the exact solution is known, and the estimated relative error $\eta / \|u_T\|_1$ if the exact solution is unknown,
- q: the efficiency index $\eta / \|u - u_T\|_1$ of the error estimator, provided the exact solution is known.
Example III.3.1. We consider the Poisson equation
$$-\Delta u = 0 \ \text{in } \Omega, \qquad u = g \ \text{on } \Gamma$$
in the L-shaped domain $\Omega = (-1,1)^2 \setminus \bigl([0,1]\times[-1,0]\bigr)$. The boundary data $g$ are chosen such that the exact solution in polar co-ordinates is
$$u = r^{2/3} \sin\bigl(\tfrac{2}{3}\varphi\bigr).$$
The coarsest mesh for a partition into quadrilaterals consists of three squares with sides of length 1. The coarsest triangulation is obtained by dividing each of these squares into two triangles by joining the top-left and bottom-right corners of the square. For both coarsest meshes we first apply uniform refinement until the storage capacity is exhausted. Then we apply an adaptive refinement strategy based on the residual error estimator $\eta_{R,K}$ of Section II.1.9 (p. 38) and the maximum strategy of Algorithm III.1.1 (p. 97). The refinement process is stopped as soon as we obtain a solution with roughly the same relative error as the solution on the finest uniform mesh. The corresponding numbers are given in Table III.3.1. Figures III.3.1 and III.3.2 show the finest meshes obtained by the adaptive process.
Table III.3.1. Comparison of uniform and adaptive refinement for example III.3.1

          triangles           quadrilaterals
          uniform  adaptive   uniform  adaptive
L         5        5          5        5
NN        2945     718        2945     405
NT        6144     1508       0        524
NQ        0        0          3072     175
ε (%)     1.3      1.5        3.6      3.9
q         -        1.23       -        0.605
Example III.3.2. Now we consider the reaction-diffusion equation

−Δu + κ^2 u = f in Ω,
u = 0 on Γ

in the square Ω = (−1, 1)^2. The reaction parameter κ is chosen equal to 100. The right-hand side f is such that the exact solution is

u = tanh(κ(x^2 + y^2 − 1/4)).
Figure III.3.1. Adaptively refined triangulation of level 5 for example III.3.1

Figure III.3.2. Adaptively refined partition into squares of level 5 for example III.3.1

It exhibits an interior layer along the boundary of the circle of radius 1/2 centered at the origin. The coarsest mesh for a partition into squares consists of 4 squares with sides of length 1. The coarsest triangulation again is obtained by dividing each square into two triangles by joining the top-left and bottom-right corners of the square. For the comparison of adaptive and uniform refinement we proceed as in the previous example. In order to take account of the reaction term, the error estimator now is the modified residual estimator η_{R;K} of section II.3.1.3 (p. 64).
Table III.3.2. Comparison of uniform and adaptive refinement for example III.3.2

          triangles           quadrilaterals
          uniform  adaptive   uniform  adaptive
L         5        6          5        6
NN        3969     1443       3969     2650
NT        8192     2900       0        1600
NQ        0        0          4096     1857
ε (%)     3.8      3.5        6.1      4.4
q         -        0.047      -        0.041
Figure III.3.3. Adaptively refined triangulation of level 6 for example III.3.2
Figure III.3.4. Adaptively refined partition into squares of level 6 for example III.3.2

Table III.3.3. Comparison of uniform and adaptive refinement for example III.3.3

          triangles                 quadrilaterals
          adaptive     adaptive     adaptive     adaptive
          excess 0%    excess 20%   excess 0%    excess 20%
L         8            6            9            7
NN        5472         2945         2613         3237
NT        11102        6014         1749         3053
NQ        0            0            1960         1830
ε (%)     0.4          0.4          0.6          1.2

Example III.3.3. Next we consider the convection-diffusion equation

−ε Δu + a · ∇u = 0 in Ω,
u = g on Γ

in the square Ω = (−1, 1)^2. The diffusion parameter is

ε = 1/100,

the convection is

a = (2, 1)^t,

and the boundary condition is

g = 0 on the left and top boundary,
g = 100 on the bottom and right boundary.
The exact solution of this problem is unknown, but it is known that it exhibits an exponential boundary layer at the boundary x = 1, y > 0 and a parabolic interior layer along the line connecting the points (−1, −1) and (1, 0). The coarsest meshes are determined as in example III.3.2. Since the exact solution is unknown, we cannot give the efficiency index q and perform only an adaptive refinement. The error estimator is the one of section II.3.1.3 (p. 64). Since the exponential layer is far stronger than the parabolic one, the maximum strategy of algorithm III.1.1 (p. 97) leads to a refinement preferably close to the boundary x = 1, y > 0 and has difficulties in catching the parabolic interior layer. This is in particular demonstrated by figure III.3.7. We therefore also apply the modified maximum strategy with an excess of 20%, i.e., the 20% of elements with largest error are first refined regularly and the maximum strategy is then applied to the remaining elements.

Figure III.3.5. Adaptively refined triangulation of example III.3.3 with refinement based on the maximum strategy
Figure III.3.6. Adaptively refined triangulation of example III.3.3 with refinement based on the modified maximum strategy with excess of 20%

Figure III.3.7. Adaptively refined partition into squares of example III.3.3 with refinement based on the maximum strategy

Figure III.3.8. Adaptively refined partition into squares of example III.3.3 with refinement based on the modified maximum strategy with excess of 20%

Example III.3.4. Finally we consider a diffusion equation

−div(A grad u) = 1 in Ω,
u = 0 on Γ

in the square Ω = (−1, 1)^2 with a discontinuous diffusion

A = ( 10      90/11 )    in 4x^2 + 16y^2 < 1,
    ( 90/11   10    )

A = ( 1  0 )             in 4x^2 + 16y^2 ≥ 1.
    ( 0  1 )

The exact solution of this problem is not known. Hence we cannot give the efficiency index q. The coarsest meshes are as in examples III.3.2 and III.3.3. The adaptive process is based on the error estimator of section II.3.1.3 (p. 64) and the maximum strategy of algorithm III.1.1 (p. 97).

Table III.3.4. Comparison of uniform and adaptive refinement for example III.3.4

          triangles           quadrilaterals
          uniform  adaptive   uniform  adaptive
L         5        6          5        6
NN        3969     5459       3969     2870
NT        8192     11128      0        1412
NQ        0        0          4096     2227
ε (%)     -        2.5        -        14.6
Figure III.3.9. Adaptively refined triangulation of example III.3.4

Figure III.3.10. Adaptively refined partition into squares of example III.3.4
CHAPTER IV
Solution of the discrete problems

IV.1. Overview

To get an overview of the particularities of the solution of finite element problems, we consider a simple, but instructive model situation: the model problem of section II.1.2 (p. 30) on the unit square (0, 1)^2 (d = 2) or the unit cube (0, 1)^3 (d = 3) discretized by linear elements (section II.1.3 (p. 30) with k = 1) on a mesh that consists of squares (d = 2) or cubes (d = 3) with edges of length h = 1/n. The number of unknowns is

N_h = (n − 1)^d.
The stiffness matrix L_h is symmetric positive definite and sparse; every row contains at most 3^d non-zero elements. The total number of non-zero entries in L_h is

e_h = 3^d N_h.

The ratio of non-zero entries to the total number of entries in L_h is

p_h = e_h / N_h^2 ≈ 3^d N_h^{-1}.

The stiffness matrix is a band matrix with bandwidth

b_h ≈ h^{-(d-1)} ≈ N_h^{1-1/d}.

Therefore the Gaussian elimination, the LR-decomposition or the Cholesky decomposition require

s_h = b_h N_h ≈ N_h^{2-1/d}

bytes for storage and

z_h = b_h^2 N_h ≈ N_h^{3-2/d}

arithmetic operations. These numbers are collected in table IV.1.1. It clearly shows that direct methods are not suited for the solution of large finite element problems, both with respect to the storage requirement and with respect to the computational work. Therefore one usually uses iterative methods for the solution of large finite element problems. Their efficiency is essentially determined by the following considerations:
Table IV.1.1. Storage requirement and arithmetic operations of the Cholesky decomposition applied to the linear finite element discretization of the model problem on (0, 1)^d

d   h      N_h        e_h        b_h        s_h        z_h
2   1/16   225        1.1·10^3   15         3.3·10^3   7.6·10^5
2   1/32   961        4.8·10^3   31         2.9·10^4   2.8·10^7
2   1/64   3.9·10^3   2.0·10^4   63         2.5·10^5   9.9·10^8
2   1/128  1.6·10^4   8.0·10^4   127        2.0·10^6   3.3·10^10
3   1/16   3.3·10^3   2.4·10^4   225        7.6·10^5   1.7·10^8
3   1/32   3.0·10^4   2.1·10^5   961        2.8·10^7   2.8·10^10
3   1/64   2.5·10^5   1.8·10^6   3.9·10^3   9.9·10^8   3.9·10^12
3   1/128  2.0·10^6   1.4·10^7   1.6·10^4   3.3·10^10  5.3·10^14
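Up to constant factors, these growth rates can be checked with a few lines of code. The sketch below evaluates the formulas N_h = (n − 1)^d, e_h = 3^d N_h, b_h ≈ N_h^{1-1/d}, s_h = b_h N_h, and z_h = b_h^2 N_h for given d and n = 1/h; the class and method names are made up, and the printed table additionally contains method-specific constants that this sketch ignores.

```java
// Evaluate the complexity formulas of this section for mesh size h = 1/n:
// N_h = (n-1)^d unknowns, e_h = 3^d N_h non-zero entries,
// b_h ~ N_h^{1-1/d} bandwidth, s_h = b_h N_h storage, z_h = b_h^2 N_h operations.
class CholeskyCost {
    static long unknowns(int d, int n) {
        long N = 1;
        for (int i = 0; i < d; i++) N *= (n - 1);
        return N;
    }
    static long nonZeros(int d, int n) {
        return Math.round(Math.pow(3, d)) * unknowns(d, n);
    }
    static long bandwidth(int d, int n) {
        return Math.round(Math.pow(unknowns(d, n), 1.0 - 1.0 / d));
    }
    static long storage(int d, int n) {
        return bandwidth(d, n) * unknowns(d, n);
    }
    static long operations(int d, int n) {
        return bandwidth(d, n) * bandwidth(d, n) * unknowns(d, n);
    }
}
```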
• The exact solution of the finite element problem is an approximation of the solution of the differential equation, which is the quantity of interest, with an error O(h^k), where k is the polynomial degree of the finite element space. Therefore it is sufficient to compute an approximate solution of the discrete problem which has the same accuracy.
• If the mesh T_1 is a global or local refinement of the mesh T_0, the interpolate of the approximate discrete solution corresponding to T_0 is a good initial guess for any iterative solver for the discrete problem corresponding to T_1.

These considerations lead to the following nested iteration. Here T_0, ..., T_R denotes a sequence of successively (globally or locally) refined meshes with corresponding finite element problems

L_k u_k = f_k,  0 ≤ k ≤ R.
Algorithm IV.1.1. (Nested iteration)
(1) Compute

ũ_0 = u_0 = L_0^{-1} f_0.

(2) For k = 1, ..., R compute an approximate solution ũ_k for u_k = L_k^{-1} f_k by applying m_k iterations of an iterative solver for the problem

L_k u_k = f_k

with starting value I_{k-1,k} ũ_{k-1}, where I_{k-1,k} is a suitable interpolation operator from the mesh T_{k-1} to the mesh T_k.
Usually, the number m_k of iterations in algorithm IV.1.1 is determined by the stopping criterion

‖f_k − L_k ũ_k‖ ≤ ε ‖f_k − L_k (I_{k-1,k} ũ_{k-1})‖.

That is, the residual of the starting value measured in an appropriate norm should be reduced by a factor ε. Typically, ‖·‖ is a weighted Euclidean norm and ε is in the realm 0.05 to 0.1. If the iterative solver has the convergence rate δ_k, the number m_k of iterations is given by

m_k = ⌈ln ε / ln δ_k⌉.
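This formula is easy to evaluate; the following one-method sketch (hypothetical class name) computes the required number of iterations for a given tolerance and convergence rate.

```java
// Number of iterations needed to reduce the residual by the factor eps
// with an iterative solver of convergence rate delta: m = ceil(ln eps / ln delta).
class IterationCount {
    static int iterations(double eps, double delta) {
        return (int) Math.ceil(Math.log(eps) / Math.log(delta));
    }
}
```

For instance, a solver with rate δ = 0.5 needs 4 iterations to gain a factor ε = 0.1, while a rate δ = 0.9 already needs 22.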
Table IV.1.2 gives the number m_k of iterations that the classical Gauß-Seidel algorithm, the conjugate gradient algorithm IV.3.1 (p. 124), and the preconditioned conjugate gradient algorithm IV.3.2 (p. 126) with SSOR-preconditioning IV.3.3 (p. 128) require for reducing an initial residual by the factor ε = 0.1. These algorithms need the following number of operations per unknown:

2d + 1 (Gauß-Seidel),
2d + 6 (CG),
5d + 8 (SSOR-PCG).

Table IV.1.2. Number of iterations required for reducing an initial residual by the factor 0.1

h       Gauß-Seidel   CG    SSOR-PCG
1/16    236           12    4
1/32    954           23    5
1/64    3820          47    7
1/128   15287         94    11
Table IV.1.2 shows that the preconditioned conjugate gradient algorithm with SSOR-preconditioning yields satisfactory results for problems that are not too large. Nevertheless, its computational work is not proportional to the number of unknowns; for a fixed tolerance ε it approximately is of the order N_h^{1+1/(2d)}. The multigrid algorithm IV.4.1 (p. 132) overcomes this drawback. Its convergence rate is independent of the mesh-size. Correspondingly, for a fixed tolerance ε, its computational work is proportional to the number of unknowns. The advantages of the multigrid algorithm are reflected by table IV.1.3.
Table IV.1.3. Arithmetic operations required by the preconditioned conjugate gradient algorithm with SSOR-preconditioning and the V-cycle multigrid algorithm with one Gauß-Seidel step for pre- and post-smoothing applied to the model problem in (0, 1)^d

d   h      PCG-SSOR    multigrid
2   1/16   16'200      11'700
2   1/32   86'490      48'972
2   1/64   500'094     206'988
2   1/128  3'193'542   838'708
3   1/16   310'500     175'500
3   1/32   3'425'965   1'549'132
3   1/64   4.0·10^7    1.3·10^7
3   1/128  5.2·10^8    1.1·10^8
IV.2. Classical iterative solvers

The setting of this and the following section is as follows: We want to solve a linear system of equations

Lu = f

with N unknowns and a symmetric positive definite matrix L. We denote by κ the condition number of L, i.e. the ratio of its largest to its smallest eigenvalue. Moreover we assume that κ ≈ N^{2/d}.
All methods of this section are so-called stationary iterative solvers and have the following structure:
Algorithm IV.2.1. (Stationary iterative solver)
(0) Given: matrix L, right-hand side f, initial guess u_0 and tolerance ε.
Sought: an approximate solution of the linear system of equations Lu = f.
(1) Set i = 0.
(2) If

‖Lu_i − f‖ < ε,

return u_i as approximate solution; stop.
(3) Compute

u_{i+1} = F(u_i; L, f),

increase i by 1 and return to step (2).
Here, u ↦ F(u; L, f) is an affine mapping, the so-called iteration method, which characterises the particular iterative solver. ‖·‖ is any norm on R^N, e.g., the Euclidean norm.
The simplest method is the Richardson iteration. The iteration method is given by

u ↦ u + (1/ω)(f − Lu).

Here, ω is a damping parameter, which has to be of the same order as the largest eigenvalue of L. The convergence rate of the Richardson iteration is (κ − 1)/(κ + 1) ≈ 1 − N^{-2/d}.
The Jacobi iteration is closely related to the Richardson iteration. The iteration method is given by

u ↦ u + D^{-1}(f − Lu).

Here, D is the diagonal of L. The convergence rate again is (κ − 1)/(κ + 1) ≈ 1 − N^{-2/d}. Notice, the Jacobi iteration sweeps through all equations and exactly solves the current equation for the corresponding unknown without modifying subsequent equations.
The Gauß-Seidel iteration is a modification of the Jacobi iteration: Now every update of an unknown is immediately transferred to all subsequent equations. This modification gives rise to the following iteration method:

u ↦ u + ℒ^{-1}(f − Lu).

Here, ℒ is the lower triangular part of L, diagonal included. The convergence rate again is (κ − 1)/(κ + 1) ≈ 1 − N^{-2/d}.
IV.3. Conjugate gradient algorithms

IV.3.1. The conjugate gradient algorithm. The conjugate gradient algorithm is based on the following ideas:
• For symmetric positive definite stiffness matrices L the solution of the linear system of equations Lu = f is equivalent to the minimization of the quadratic functional

J(u) = (1/2) u · (Lu) − f · u.

• Given an approximation v to the solution u of the linear system, the negative gradient

−∇J(v) = f − Lv

of J at v gives the direction of the steepest descent.
• Given an approximation v and a search direction d ≠ 0, J attains its minimum on the line t ↦ v + td at the point

t* = d · (f − Lv) / (d · (Ld)).

• When successively minimizing J in the directions of the negative gradients, the algorithm slows down since the search directions become nearly parallel.
• The algorithm speeds up when choosing the successive search directions L-orthogonal, i.e.

d_i · (L d_{i-1}) = 0

for the search directions of iterations i − 1 and i.
• These L-orthogonal search directions can be computed during the algorithm by a suitable three-term recursion.
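The exact line search in the third item can be checked numerically: for a symmetric positive definite L, J(v + t d) is minimal at t* = d·(f − Lv)/(d·(Ld)). The sketch below (hypothetical helper names) evaluates t* and J for a small system.

```java
// Optimal step length t* = d.(f - L v) / d.(L d) for minimizing
// J(u) = 0.5 u.(L u) - f.u along the line t -> v + t d.
class LineSearch {
    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }
    static double[] mult(double[][] L, double[] v) {
        double[] w = new double[v.length];
        for (int i = 0; i < v.length; i++) w[i] = dot(L[i], v);
        return w;
    }
    static double J(double[][] L, double[] f, double[] u) {
        return 0.5 * dot(u, mult(L, u)) - dot(f, u);
    }
    static double tStar(double[][] L, double[] f, double[] v, double[] d) {
        double[] r = mult(L, v);
        for (int i = 0; i < r.length; i++) r[i] = f[i] - r[i];  // residual f - Lv
        return dot(d, r) / dot(d, mult(L, d));
    }
}
```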
Algorithm IV.3.1. (Conjugate gradient algorithm)
(0) Given: a linear system of equations Lu = f with a symmetric, positive definite matrix L, an initial guess u_0 for the solution, and a tolerance ε > 0.
Sought: an approximate solution of the linear system.
(1) Compute

r_0 = f − Lu_0,
d_0 = r_0,
γ_0 = r_0 · r_0.

Set i = 0.
(2) If

γ_i < ε^2,

return u_i as approximate solution; stop. Otherwise go to step 3.
(3) Compute

s_i = L d_i,
α_i = γ_i / (d_i · s_i),
u_{i+1} = u_i + α_i d_i,
r_{i+1} = r_i − α_i s_i,
γ_{i+1} = r_{i+1} · r_{i+1},
β_i = γ_{i+1} / γ_i,
d_{i+1} = r_{i+1} + β_i d_i.

Increase i by 1 and go to step 2.
The convergence rate of the CG-algorithm is given by

δ = (√κ − 1) / (√κ + 1),

where κ is the condition number of L and equals the ratio of the largest to the smallest eigenvalue of L. For finite element discretizations of elliptic equations of second order, we have κ ≈ h^{-2} and correspondingly δ ≈ 1 − h, where h is the mesh-size.
The following Java methods implement the CG-algorithm.
// CG algorithm
public void cg() {
int iter = 0;
double alpha, beta, gamma;
res = multiply(a, x);
res = add(b, res, 1.0, -1.0);
d = new double[dim];
copy(d, res);
gamma = innerProduct(res, res)/dim;
residuals[0] = Math.sqrt(gamma);
while ( iter < maxit-1 && residuals[iter] > tol ) {
iter++;
b = multiply(a, d);
alpha = innerProduct(d, b)/dim;
alpha = gamma/alpha;
accumulate(x, d, alpha);
accumulate(res, b, -alpha);
beta = gamma;
gamma = innerProduct(res, res)/dim;
residuals[iter] = Math.sqrt(gamma);
beta = gamma/beta;
d = add(res, d, 1.0, beta);
}
} // end of cg
// copy vector v to vector u
public void copy(double[] u, double[] v) {
for( int i = 0; i < dim; i++ )
u[i] = v[i];
} // end of copy
// multiply matrix c with vector y
public double[] multiply(double[][] c, double[] y) {
double[] result = new double[dim];
for( int i = 0; i < dim; i++ )
result[i] = innerProduct(c[i], y);
return result;
} // end of multiply
// return s*u+t*v
public double[] add(double[] u, double[] v, double s, double t) {
double[] w = new double[dim];
for( int i = 0; i < dim; i++ )
w[i] = s*u[i] + t*v[i];
return w;
} // end of add
// inner product of two vectors
public double innerProduct(double[] u, double[] v) {
double prod = 0;
for( int i = 0; i < dim; i++ )
prod += u[i]*v[i];
return prod;
} // end of inner product
// add s times vector v to vector u
public void accumulate(double[] u, double[] v, double s) {
for( int i = 0; i < dim; i++ )
u[i] += s*v[i];
} // end of accumulate
IV.3.2. The preconditioned conjugate gradient algorithm. The idea of the preconditioned conjugate gradient algorithm is the following:
• Instead of the original system

Lu = f

solve the equivalent system

L̃ũ = f̃

with

L̃ = H^{-1} L H^{-t},  f̃ = H^{-1} f,  ũ = H^t u

and an invertible square matrix H.
• Choose the matrix H such that:
  – The condition number of L̃ is much smaller than the one of L.
  – Systems of the form

Cv = d

with

C = H H^t

are much easier to solve than the original system Lu = f.
• Apply the conjugate gradient algorithm to the new system L̃ũ = f̃ and express everything in terms of the original quantities L, f, and u.
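The effect can already be seen for a 2×2 example: with a diagonal scaling H, the transformed matrix L̃ = H^{-1} L H^{-t} has a much smaller condition number than L. The sketch below uses the closed-form eigenvalues of a symmetric 2×2 matrix; the matrix entries are a made-up illustration.

```java
// Condition number of a symmetric 2x2 matrix [[a,b],[b,c]] and its
// improvement under the diagonal scaling H = diag(sqrt(a), sqrt(c)),
// i.e. Ltilde = H^{-1} L H^{-t} (a simple Jacobi-type preconditioner).
class Precondition {
    static double cond2x2(double a, double b, double c) {
        double mean = (a + c) / 2.0;
        double disc = Math.sqrt((a - c) * (a - c) / 4.0 + b * b);
        return (mean + disc) / (mean - disc);  // ratio of the two eigenvalues
    }
    static double condScaled(double a, double b, double c) {
        // H^{-1} L H^{-t} = [[1, b/sqrt(ac)], [b/sqrt(ac), 1]]
        return cond2x2(1.0, b / Math.sqrt(a * c), 1.0);
    }
}
```

For L = [[100, 1], [1, 1]] the condition number drops from roughly 101 to 11/9 after the scaling.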
Algorithm IV.3.2. (Preconditioned conjugate gradient algorithm)
(0) Given: a linear system of equations Lu = f with a symmetric, positive definite matrix L, an approximation C to L, an initial guess u_0 for the solution, and a tolerance ε > 0.
Sought: an approximate solution of the linear system.
(1) Compute

r_0 = f − Lu_0,

solve

C z_0 = r_0,

and compute

d_0 = z_0,
γ_0 = r_0 · z_0.

Set i = 0.
(2) If

γ_i < ε^2,

return u_i as approximate solution; stop. Otherwise go to step 3.
(3) Compute

s_i = L d_i,
α_i = γ_i / (d_i · s_i),
u_{i+1} = u_i + α_i d_i,
r_{i+1} = r_i − α_i s_i,

solve

C z_{i+1} = r_{i+1},

and compute

γ_{i+1} = r_{i+1} · z_{i+1},
β_i = γ_{i+1} / γ_i,
d_{i+1} = z_{i+1} + β_i d_i.

Increase i by 1 and go to step 2.
For the trivial choice C = I, the identity matrix, algorithm IV.3.2 reduces to the conjugate gradient algorithm IV.3.1. For the non-realistic choice C = L, algorithm IV.3.2 stops after one iteration and produces the exact solution.
The convergence rate of the PCG-algorithm is given by

δ = (√κ̃ − 1) / (√κ̃ + 1),

where κ̃ is the condition number of L̃ and equals the ratio of the largest to the smallest eigenvalue of L̃.
Obviously the efficiency of the PCG-algorithm hinges on a good choice of the preconditioning matrix C. It has to satisfy the contradictory goals that L̃ should have a small condition number and that problems of the form Cz = d should be easy to solve. A good compromise is the SSOR-preconditioner. It corresponds to

C = 1/(ω(2 − ω)) (D − ωU^t) D^{-1} (D − ωU),

where D and U denote the diagonal of L and its strictly upper diagonal part, respectively, and where ω ∈ (0, 2) is a relaxation parameter.
The following algorithm realizes the SSOR-preconditioning.

Algorithm IV.3.3. (SSOR-preconditioning)
(0) Given: r and a relaxation parameter ω ∈ (0, 2).
Sought: z = C^{-1} r.
(1) Set z = 0.
(2) For i = 1, ..., N_h compute

z_i = z_i + ω L_ii^{-1} (r_i − Σ_{j=1}^{N_h} L_ij z_j).

(3) For i = N_h, ..., 1 compute

z_i = z_i + ω L_ii^{-1} (r_i − Σ_{j=1}^{N_h} L_ij z_j).
For finite element discretizations of elliptic equations of second order and the SSOR-preconditioning of algorithm IV.3.3, we have κ̃ ≈ h^{-1} and correspondingly δ ≈ 1 − h^{1/2}, where h is the mesh-size.
The following Java methods implement algorithm IV.3.2 with the
preconditioning of algorithm IV.3.3.
// CG algorithm with SSOR preconditioning
public void ssorPcg() {
int iter = 0;
double alpha, beta, gamma;
res = multiply(a, x);
res = add(b, res, 1.0, -1.0);
z = new double[dim];
ssor(z, res); // 1 SSOR iteration with initial value 0
d = new double[dim];
for( int i = 0; i < dim; i++ )
d[i] = z[i];
gamma = innerProduct(res, z)/dim;
residual = Math.sqrt( innerProduct(res, res)/dim );
while ( iter < maxit-1 && residual > tol ) {
iter++;
b = multiply(a, d);
alpha = innerProduct(d, b)/dim;
alpha = gamma/alpha;
accumulate(x, d, alpha);
accumulate(res, b, -alpha);
ssor(z, res); // 1 SSOR iteration with initial value 0
beta = gamma;
gamma = innerProduct(res, z)/dim;
residual = Math.sqrt( innerProduct(res, res)/dim );
beta = gamma/beta;
d = add(z, d, 1.0, beta);
}
} // end of ssorPcg
// 1 SSOR iteration with initial value 0 for u and right-hand side v
public void ssor(double[] u, double[] v) {
double relax = 1.5;
for ( int i = 0; i < dim; i++ )
u[i] = 0.0;
for ( int i = 0; i < dim; i++ )
u[i] += relax*(v[i] - innerProduct(a[i], u))/a[i][i];
for ( int i = dim-1; i >= 0; i-- )
u[i] += relax*(v[i] - innerProduct(a[i], u))/a[i][i];
} // end of ssor
IV.3.3. Non-symmetric and indefinite problems. The CG- and the PCG-algorithms IV.3.1 and IV.3.2 can only be applied to problems with a symmetric positive definite stiffness matrix, i.e., to scalar linear elliptic equations without convection and the displacement formulation of the equations of linearized elasticity. Scalar linear elliptic equations with convection (though the convection may be small) and mixed formulations of the equations of linearized elasticity lead to non-symmetric or indefinite stiffness matrices. For these problems algorithms IV.3.1 and IV.3.2 break down.
There are several possible remedies for this difficulty. An obvious one is to consider the equivalent normal equations

L^t L u = L^t f,

which have a symmetric positive definite matrix. This simple device, however, cannot be recommended, since passing to the normal equations squares the condition number and thus doubles the number of iterations. A much better alternative is the bi-conjugate gradient algorithm. It tries to solve simultaneously the original problem Lu = f and its adjoint or conjugate problem L^t v = L^t f.
Algorithm IV.3.4. (Stabilized bi-conjugate gradient algorithm Bi-CG-stab)
(0) Given: a linear system of equations Lx = b, an initial guess x_0 for the solution, and a tolerance ε > 0.
Sought: an approximate solution of the linear system.
(1) Compute

r_0 = b − Lx_0,

and set

r̄_0 = r_0,  v_{-1} = 0,  p_{-1} = 0,
α_{-1} = 1,  ρ_{-1} = 1,  ω_{-1} = 1.

Set i = 0.
(2) If

r_i · r_i < ε^2,

return x_i as approximate solution; stop. Otherwise go to step 3.
(3) Compute

ρ_i = r̄_0 · r_i,
β_{i-1} = (ρ_i α_{i-1}) / (ρ_{i-1} ω_{i-1}).

If

|ρ_{i-1} ω_{i-1}| < ε̃,

there is a possible break-down; stop. Otherwise compute

p_i = r_i + β_{i-1} (p_{i-1} − ω_{i-1} v_{i-1}),
v_i = L p_i,
α_i = ρ_i / (r̄_0 · v_i).

If

|r̄_0 · v_i| < ε̃,

there is a possible break-down; stop. Otherwise compute

s_i = r_i − α_i v_i,
t_i = L s_i,
ω_i = (t_i · s_i) / (t_i · t_i),
x_{i+1} = x_i + α_i p_i + ω_i s_i,
r_{i+1} = s_i − ω_i t_i.

Increase i by 1 and go to step 2.
The following Java method implements the Bi-CG-stab algorithm.
Note that it performs an automatic restart upon break-down at most
MAXRESTART times.
// pre-conditioned stabilized bi-conjugate gradient method
public void pBiCgStab( SMatrix a, double[] b ) {
set(ab, 0.0, dim);
a.mvmult(ab, x);
add(res, b, ab, 1.0, -1.0, dim);
set(s, 0.0, dim);
set(rf, 0.0, dim);
double alpha, beta, gamma, rho, omega;
it = 0;
copy(rf, res, dim);
int restarts = -1; // restart in case of breakdown
while ( it < maxit && norm(res) > tolerance &&
restarts < MAXRESTART ) {
restarts++;
rho = 1.0, alpha = 1.0, omega = 1.0;
set(d, 0.0, dim);
set(v, 0.0, dim);
while ( it < maxit && norm(res) > tolerance ) {
gamma = rho*omega;
if( Math.abs( gamma ) < EPS ) break;
// break-down do restart
rho = innerProduct(rf, res, dim);
beta = rho*alpha/gamma;
acc(d, v, -omega, dim);
add(d, res, d, 1.0, beta, dim);
a.mvmult(v, d);
gamma = innerProduct(rf, v, dim);
if( Math.abs( gamma ) < EPS ) break;
// break-down do restart
alpha = rho/gamma;
add(s, res, v, 1.0, -alpha, dim);
a.mvmult(ab, s);
gamma = innerProduct(ab, ab, dim);
omega = innerProduct(ab, s, dim)/gamma;
acc(x, d, alpha, dim);
acc(x, s, omega, dim);
add(res, s, ab, 1, -omega, dim);
it++;
}
}
} // end of pBiCgStab
IV.4. Multigrid algorithms
Multigrid algorithms are based on the following observations:
• Classical iterative methods such as the Gauß-Seidel algorithm quickly reduce highly oscillatory error components.
• Classical iterative methods such as the Gauß-Seidel algorithm on the other hand are very poor in reducing slowly oscillatory error components.
• Slowly oscillating error components can well be resolved on coarser meshes with fewer unknowns.
IV.4.1. The multigrid algorithm. The multigrid algorithm is based on a sequence of meshes T_0, ..., T_R, which are obtained by successive local or global refinement, and associated discrete problems L_k u_k = f_k, k = 0, ..., R, corresponding to a partial differential equation. The finest mesh T_R corresponds to the problem that we actually want to solve.
The multigrid algorithm has three ingredients:
• a smoothing operator M_k, which should be easy to evaluate and which at the same time should give a reasonable approximation to L_k^{-1},
• a restriction operator R_{k,k-1}, which maps functions on a fine mesh T_k to the next coarser mesh T_{k-1},
• a prolongation operator I_{k-1,k}, which maps functions from a coarse mesh T_{k-1} to the next finer mesh T_k.

For a concrete multigrid algorithm these ingredients must be specified. This will be done in the next sections. Here, we discuss the general form of the algorithm and its properties.
Algorithm IV.4.1. (MG(k, μ, ν_1, ν_2, L_k, f_k, u_k): one iteration of the multigrid algorithm on mesh T_k)
(0) Given: the level number k of the actual mesh, parameters μ, ν_1, and ν_2, the stiffness matrix L_k of the actual discrete problem, the actual right-hand side f_k, and an initial guess u_k.
Sought: an improved approximate solution u_k.
(1) If k = 0, compute

u_0 = L_0^{-1} f_0;

stop. Otherwise go to step 2.
(2) (Pre-smoothing) Perform ν_1 steps of the iterative procedure

u_k = u_k + M_k (f_k − L_k u_k).

(3) (Coarse grid correction)
(a) Compute

f_{k-1} = R_{k,k-1} (f_k − L_k u_k)

and set

u_{k-1} = 0.

(b) Perform μ iterations of MG(k − 1, μ, ν_1, ν_2, L_{k-1}, f_{k-1}, u_{k-1}) and denote the result by u_{k-1}.
(c) Compute

u_k = u_k + I_{k-1,k} u_{k-1}.

(4) (Post-smoothing) Perform ν_2 steps of the iterative procedure

u_k = u_k + M_k (f_k − L_k u_k).
The following Java method implements the multigrid method with μ = 1. Note that the recursion has been resolved.
// one MG cycle
// non-recursive version
public void mgcycle( int lv, SMatrix a ) {
int level = lv;
count[level] = 1;
while( count[level] > 0 ) {
while( level > 1 ) {
smooth( a, 1 ); // presmoothing
a.mgResidue( res, x, bb ); // compute residue
restrict( a ); // restrict
level--;
count[level] = cycle;
}
smooth( a, 0 ); // solve on coarsest grid
count[level]--;
boolean prolongate = true;
while( level < lv && prolongate ) {
level++;
prolongateAndAdd( a ); // prolongate to finer grid
smooth( a, -1 ); // postsmoothing
count[level]--;
if( count[level] > 0 )
prolongate = false;
}
}
} // end of mgcycle
The used variables and methods have the following meaning:

lv: number of actual level, corresponds to k,
a: stiffness matrix on actual level, corresponds to L_k,
x: actual approximate solution, corresponds to u_k,
bb: actual right-hand side, corresponds to f_k,
res: actual residual, corresponds to f_k − L_k u_k,
smooth: perform the smoothing,
mgResidue: compute the residual,
restrict: perform the restriction,
prolongateAndAdd: compute the prolongation and add the result to the current approximation.
Figure IV.4.1. Schematic presentation of a multigrid algorithm with V-cycle and three grids. The labels have the following meaning: S smoothing, R restriction, P prolongation, E exact solution.
Remark IV.4.2. (1) The parameter μ determines the complexity of the algorithm. Popular choices are μ = 1, called V-cycle, and μ = 2, called W-cycle. Figure IV.4.1 gives a schematic presentation of the multigrid algorithm for the case μ = 1 and R = 2 (three meshes). Here, S denotes smoothing, R restriction, P prolongation, and E exact solution.
(2) The number of smoothing steps per multigrid iteration, i.e. the parameters ν_1 and ν_2, should not be chosen too large. A good choice for positive definite problems such as the Poisson equation is ν_1 = ν_2 = 1. For indefinite problems such as mixed formulations of the equations of linearized elasticity, a good choice is ν_1 = ν_2 = 2.
(3) If μ ≤ 2, one can prove that the computational work of one multigrid iteration is proportional to the number of unknowns of the actual discrete problem.
(4) Under suitable conditions on the smoothing algorithm, which is determined by the matrix M_k, one can prove that the convergence rate of the multigrid algorithm is independent of the mesh-size, i.e., it does not deteriorate when refining the mesh. These conditions will be discussed in the next section. In practice one observes convergence rates of 0.1 to 0.5 for positive definite problems such as the Poisson equation and of 0.3 to 0.7 for indefinite problems such as mixed formulations of the equations of linearized elasticity.
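The complexity claim in (3) can be made quantitative: one multigrid cycle costs W_k = μ W_{k-1} + c N_k, and with N_k ≈ 2^d N_{k-1} the resulting geometric series stays bounded by a fixed multiple of N_R whenever μ < 2^d. The sketch below (made-up class name, arbitrary constants) evaluates this recursion.

```java
// Work of one multigrid cycle: W_k = mu * W_{k-1} + c * N_k with
// N_k = 2^d * N_{k-1}. For mu < 2^d the total work is O(N_R).
class MgWork {
    static double work(int mu, int d, int levels, double n0, double c) {
        double N = n0, W = n0;            // exact solve on the coarsest grid
        for (int k = 1; k <= levels; k++) {
            N *= Math.pow(2, d);          // unknowns grow by 2^d per level
            W = mu * W + c * N;           // mu recursive calls plus smoothing work
        }
        return W / N;                     // work per unknown on the finest grid
    }
}
```

In 2D the work per unknown tends to 4/3·c for the V-cycle (μ = 1) and 2·c for the W-cycle (μ = 2), but grows linearly in the number of levels for μ = 2^d = 4.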
IV.4.2. Smoothing. The symmetric Gauss-Seidel algorithm is the most popular smoothing algorithm for positive definite problems such as the Poisson equation. It corresponds to the choice

M_k = [(D_k − U_k^t) D_k^{-1} (D_k − U_k)]^{-1},

where D_k and U_k denote the diagonal and the strictly upper diagonal part of L_k, respectively. The following Java method implements the Gauss-Seidel algorithm for smoothing.
// one Gauss-Seidel sweep (x and b are defined on all levels)
// dir = 0: forward and backward
// dir = 1: only forward
// dir = -1; only backward
public void mgSGS( double[] x, double[] b, int dir ) {
double y;
if ( dir >= 0 ) { // forward sweep
for ( int i = dim[ lv-1 ]; i < dim[lv]; i++ ) {
y = b[i];
for ( int j = r[i]; j < r[ i+1 ]; j++ )
y -= m[j]*x[ dim[lv-1]+c[j] ];
x[i] += y/m[ d[i] ];
}
}
if ( dir <= 0 ) { // backward sweep
for ( int i = dim[lv] - 1; i >= dim[ lv-1 ]; i-- ) {
y = b[i];
for ( int j = r[i]; j < r[ i+1 ]; j++ )
y -= m[j]*x[ dim[lv-1]+c[j] ];
x[i] += y/m[ d[i] ];
}
}
} // end of mgSGS
The variables have the following meaning:

lv: number of actual level, corresponds to k in the multigrid algorithm,
dim: dimensions of the discrete problems on the various levels,
m: stiffness matrix on all levels, corresponds to the L_k's,
x: approximate solution on all levels, corresponds to the u_k's,
b: right-hand side on all levels, corresponds to the f_k's.
For non-symmetric or indefinite problems such as scalar linear elliptic equations with convection or mixed formulations of the equations of linearized elasticity, the most popular smoothing algorithm is the squared Jacobi iteration. This is the Jacobi iteration applied to the squared system L_k^t L_k u_k = L_k^t f_k and corresponds to the choice

M_k = ω^{-2} L_k^t

with a suitable damping parameter ω satisfying ω > 0 and ω = O(h_K^{-2}).
The following Java method implements this algorithm:
// one squared Richardson iteration (x and b are defined on all levels)
// res is an auxiliary array to store the residual
public void mgSR( double[] x, double[] b, double[] res, double om ) {
copy( res, b, 0, dim[lv-1], dim[lv] );
for( int i = 0; i < dim[lv] - dim[ lv-1 ]; i++ )
for( int j = r[ dim[lv-1]+i ]; j < r[ dim[lv-1]+i+1 ]; j++ ) {
res[i] -= m[j]*x[ dim[lv-1]+c[j] ];
}
for( int i = 0; i < dim[lv] - dim[ lv-1 ]; i++ )
for( int j = r[ dim[lv-1]+i ]; j < r[ dim[lv-1]+i+1 ]; j++ )
x[ dim[lv-1]+c[j] ] += m[j]*res[i]/om;
} // end of mgSR
The variable om is the damping parameter ω. The other variables and methods are as explained before.
IV.4.3. Prolongation. Since the partition T_k of level k always is a refinement of the partition T_{k-1} of level k − 1, the corresponding finite element spaces are nested, i.e., finite element functions corresponding to level k − 1 are contained in the finite element space corresponding to level k. Therefore, the values of a coarse-grid function corresponding to level k − 1 at the nodal points corresponding to level k are obtained by evaluating the nodal basis functions corresponding to T_{k-1} at the requested points. This defines the interpolation operator I_{k-1,k}.
Figures III.1.2 (p. 101) and III.1.3 (p. 102) show various partitions of a triangle and of a square, respectively. The numbers outside the element indicate the enumeration of the element vertices and edges. Thus, e.g., edge 2 of the triangle has the vertices 0 and 1 as its endpoints. The numbers +0, +1 etc. inside the elements indicate the enumeration of the child elements. The remaining numbers inside the elements give the enumeration of the vertices of the child elements.
136 IV. SOLUTION OF THE DISCRETE PROBLEMS
Example IV.4.3. Consider a piecewise constant approximation, i.e. S^{0,-1}(T). The nodal points are the barycentres of the elements. Every element in T_{k-1} is subdivided into several smaller elements in T_k. The nodal value of a coarse-grid function at the barycentre of a child element in T_k then is its nodal value at the barycentre of the parent element in T_{k-1}.
Example IV.4.4. Consider a piecewise linear approximation, i.e. S^{1,0}(T). The nodal points are the vertices of the elements. The refinement introduces new vertices at the midpoints of some edges of the parent element and possibly, when using quadrilaterals, at the barycentre of the parent element. The nodal value at the midpoint of an edge is the average of the nodal values at the endpoints of the edge. Thus, e.g., the value at vertex 1 of child +0 is the average of the values at vertices 0 and 1 of the parent element. Similarly, the nodal value at the barycentre of the parent element is the average of the nodal values at the four element vertices.
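The midpoint-averaging rule of Example IV.4.4 can be sketched for a single triangle. The class name and the node ordering of the refined triangle used below are illustrative assumptions, not taken from the notes: the three parent vertices come first, followed by the midpoints of the edges opposite vertices 0, 1 and 2.

```java
// Sketch of the prolongation I_{k-1,k} for piecewise linear elements on one
// triangle: new nodal values at edge midpoints are the averages of the
// values at the edge endpoints. Names and node ordering are illustrative.
public class Prolongation {

    // coarse = values at the three vertices 0, 1, 2 of the parent triangle;
    // returns values at the six nodes of the refined triangle: the three
    // old vertices, then the midpoints of edges (1,2), (2,0) and (0,1)
    public static double[] prolongate(double[] coarse) {
        double[] fine = new double[6];
        for (int v = 0; v < 3; v++) fine[v] = coarse[v]; // old vertices keep their values
        fine[3] = 0.5 * (coarse[1] + coarse[2]);         // midpoint of edge (1,2)
        fine[4] = 0.5 * (coarse[2] + coarse[0]);         // midpoint of edge (2,0)
        fine[5] = 0.5 * (coarse[0] + coarse[1]);         // midpoint of edge (0,1)
        return fine;
    }

    public static void main(String[] args) {
        double[] fine = prolongate(new double[] {1.0, 3.0, 5.0});
        System.out.println(fine[3] + " " + fine[4] + " " + fine[5]); // prints 4.0 3.0 2.0
    }
}
```

For quadrilaterals one additional value, the average of the four vertex values, would be assigned to the barycentre in the same fashion.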
IV.4.4. Restriction. The restriction is computed by expressing the nodal basis functions corresponding to the coarse partition T_{k-1} in terms of the nodal basis functions corresponding to the fine partition T_k and inserting this expression in the variational formulation. This results in a lumping of the right-hand side vector which, in a certain sense, is the transpose of the interpolation.
Example IV.4.5. Consider a piecewise constant approximation, i.e. S^{0,-1}(T). The nodal shape function of a parent element is the sum of the nodal shape functions of the child elements. Correspondingly, the components of the right-hand side vector corresponding to the child elements are all added and associated with the parent element.
Example IV.4.6. Consider a piecewise linear approximation, i.e. S^{1,0}(T). The nodal shape function corresponding to a vertex of a parent triangle takes the value 1 at this vertex, the value 1/2 at the midpoints of the two edges sharing the given vertex and the value 0 on the remaining edges. If we label the current vertex by a and the midpoints of the two edges emanating from a by m_1 and m_2, this results in the following formula for the restriction on a triangle

R_{k,k-1}φ(a) = φ(a) + (1/2)(φ(m_1) + φ(m_2)).

When considering a quadrilateral, we must take into account that the nodal shape functions take the value 1/4 at the barycentre b of the parent quadrilateral. Therefore the restriction on a quadrilateral is given by the formula

R_{k,k-1}φ(a) = φ(a) + (1/2)(φ(m_1) + φ(m_2)) + (1/4)φ(b).
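The restriction formulae above can be sketched per coarse vertex as follows. The class and method names are illustrative assumptions, and phiA, phiM1, phiM2, phiB stand for the fine-grid right-hand side contributions at a, m_1, m_2 and b.

```java
// Sketch of the restriction R_{k,k-1} at one coarse vertex a for piecewise
// linear elements; the weights 1/2 and 1/4 are the values of the coarse
// nodal shape function at the fine nodes. Names are illustrative.
public class Restriction {

    // triangle: vertex value plus half the values at the adjacent midpoints
    public static double restrictTriangle(double phiA, double phiM1, double phiM2) {
        return phiA + 0.5 * (phiM1 + phiM2);
    }

    // quadrilateral: additionally a quarter of the value at the barycentre
    public static double restrictQuadrilateral(double phiA, double phiM1,
                                               double phiM2, double phiB) {
        return phiA + 0.5 * (phiM1 + phiM2) + 0.25 * phiB;
    }

    public static void main(String[] args) {
        System.out.println(restrictTriangle(1.0, 2.0, 4.0));           // prints 4.0
        System.out.println(restrictQuadrilateral(1.0, 2.0, 4.0, 8.0)); // prints 6.0
    }
}
```

Note that the weights are exactly those used by the prolongation of Example IV.4.4, reflecting that the restriction is, in a certain sense, the transpose of the interpolation.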
Remark IV.4.7. An efficient implementation of the prolongation and restriction loops through all elements and performs the prolongation or restriction element-wise. This process is similar to the usual element-wise assembly of the stiffness matrix and the load vector.
Bibliography
[1] Mark Ainsworth and J. Tinsley Oden, A posteriori error estimation in finite element analysis, Wiley, New York, 2000.
[2] D. Braess, Finite elements, second ed., Cambridge University Press, Cambridge, 2001. Theory, fast solvers, and applications in solid mechanics; translated from the 1992 German edition by Larry L. Schumaker.
[3] R. Verfürth, A review of a posteriori error estimation and adaptive mesh-refinement techniques, Teubner-Wiley, Stuttgart, 1996.
[4] R. Verfürth, A review of a posteriori error estimation techniques for elasticity problems, Advances in adaptive computational methods in mechanics (Cachan, 1997), Stud. Appl. Mech., vol. 47, Elsevier, Amsterdam, 1998, pp. 257-274.
[5] R. Verfürth, A review of a posteriori error estimation techniques for elasticity problems, Comput. Methods Appl. Mech. Engrg. 176 (1999), no. 1-4, 419-440. New advances in computational methods (Cachan, 1997).
Index
Δ Laplace operator, 15
‖·‖ L^2-norm, 12
‖·‖_{H(div;Ω)} norm of H(div; Ω), 66
· inner product, 15
∇ gradient, 15
‖·‖_k Sobolev norm, 17
‖·‖_{1/2,Γ} trace norm, 18
|α|_1 ℓ^1-norm, 21
|·|_k Sobolev norm, 17
|α|_∞ ℓ^∞-norm, 21
(·, ·)_T, 52
: dyadic product, 16
x^α, 21
Ā closure of A, 16
ℰ_T faces of T, 27
𝒩_T vertices of T, 23
T partition, 19
C_0^∞(Ω) smooth functions, 16
∂^{α_1+···+α_d}/∂x_1^{α_1}···∂x_d^{α_d} partial derivative, 17
ℰ_K faces of K, 19
ℰ_T faces of T, 20
ℰ_{T,Γ} boundary faces, 20
ℰ_{T,Γ_D} faces on the Dirichlet boundary, 20
ℰ_{T,Γ_N} faces on the Neumann boundary, 20
ℰ_{T,Ω} interior faces, 20
Γ boundary of Ω, 15
Γ_D Dirichlet boundary, 15
Γ_N Neumann boundary, 15
H(div; Ω), 66
H^1_0(0, 1) Sobolev space, 13
H^1_D(Ω) Sobolev space, 18
H^1(0, 1) Sobolev space, 12
H^1_0(Ω) Sobolev space, 17
H^{1/2}(Γ) trace space, 18
H^k(Ω) Sobolev space, 17
J, 80
I_T quasi-interpolation operator, 25
I_n, 80
K element, 19
L^2(0, 1) Lebesgue space, 12
𝒩_E vertices of E, 20
𝒩_K vertices of K, 19
𝒩_T vertices of T, 20
𝒩_{T,Γ} boundary vertices, 20
𝒩_{T,Γ_D} vertices on the Dirichlet boundary, 20
𝒩_{T,Γ_N} vertices on the Neumann boundary, 20
𝒩_{T,Ω} interior vertices, 20
Ω domain, 15
R_E(u_T), 32
R_K, 73
R_K(u_T), 32
RT_0(K) lowest order Raviart-Thomas space on K, 66, 72
RT_0(T) lowest order Raviart-Thomas space, 66
S^{k,-1} finite element space, 22
S^{k,0} finite element space, 14, 22
S^{k,0}_D finite element space, 22
S^{k,0}_0 finite element space, 22
S^{k,0}_0(T) finite element space, 14
T partition, 14
T_n, 80
V_K, 44
Ṽ_K, 43
V_x, 41
X_n, 81
a_{K,E} vertex of K opposite to face E, 58
C_T shape parameter, 19
curl curl-operator, 67
div divergence, 15
η_I, 84
η_{D,K}, 43, 75
η_{D,x}, 41
η_H, 50
η_{N,K}, 45, 74
η_{R,K}, 38, 70, 73
η_Z, 53
η_{Z,K}, 53
η_E(·), 73
θ_{K,E} vector field in trace equality, 58
h_E diameter of E, 20
h_K diameter of K, 19, 20
J_E(·) jump, 28
κ condition of a matrix, 122
λ, 69
λ_x nodal basis function, 23
μ, 69
n_E normal vector, 28
ω_E elements adjacent to E, 20
ω̃_E elements sharing a vertex with E, 20
ω_K elements sharing a face with K, 20
ω̃_K elements sharing a vertex with K, 20
ω_x elements sharing the vertex x, 20, 23
|ω_x| area or volume of ω_x, 26
π_n, 81
ψ_E face bubble function, 27
ψ_K element bubble function, 26
P_1, 41
ρ_K diameter of the largest ball inscribed into K, 19
supp support, 16
τ_n, 80
tr, 69
a posteriori error estimate, 67
a posteriori error estimator, 34
admissibility, 19
advective flux, 89
advective numerical flux, 93
affine equivalence, 19
asymptotically exact estimator, 60
BDMS element, 72
Bi-CG-stab algorithm, 129
blue element, 101
body load, 69
Burgers equation, 89
CG algorithm, 124
characteristic equation, 87
coarse grid correction, 132
coarsening strategy, 104
compensated compactness, 95
condition, 122
conjugate gradient algorithm, 124
Crank-Nicolson scheme, 82
criss-cross grid, 61
curl operator, 67
damping parameter, 123
deformation tensor, 15, 69
degree condition, 81
Dirichlet boundary, 30
discontinuous Galerkin method, 95
displacement, 69
displacement formulation, 69
divergence, 15
dual nite volume mesh, 89
dyadic product, 16
edge bubble function, 27
edge residual, 32
efficiency index, 60
efficient, 34
efficient estimator, 60
elasticity tensor, 69
element, 19
element bubble function, 26
element residual, 32
energy function, 14
equilibration strategy, 97
Euler equations, 90
Euler-Lagrange equation, 69
face bubble function, 27
finite element space, 14
finite volume method, 89, 91
flux, 89
Friedrichs inequality, 18
Galerkin orthogonality, 32
Gauss-Seidel algorithm, 134
Gauß-Seidel algorithm, 105
Gauß-Seidel iteration, 123
general adaptive algorithm, 7
geometric quality function, 106
gradient, 15
green element, 101
hanging node, 97, 101
Hellinger-Reissner principle, 71
Helmholtz decomposition, 67
Hessian matrix, 108
hierarchical a posteriori error estimator, 50
hierarchical basis, 25
implicit Euler scheme, 82
initial value, 89
inner product, 15
irregular refinement, 97
iteration method, 123
Jacobi iteration, 123
L^2-representation of the residual, 32
Lamé parameters, 69
Laplace operator, 15
Lebesgue space, 12
linear elements, 14
linear interpolant, 107
locking phenomenon, 70
longest edge bisection, 103
L^2-norm, 12
marking strategy, 97
mass, 89
material derivative, 87
maximum strategy, 97
mesh-smoothing strategy, 105
method of characteristics, 87
method of lines, 79
MG algorithm, 132
minimum, 13, 14
mixed finite element approximation, 67
multigrid algorithm, 132
Navier-Stokes equations, 90
nearly incompressible material, 70
nested iteration, 120
Neumann boundary, 30
nodal basis, 14
nodal basis function, 15
nodal shape function, 23
non-degeneracy, 81
numerical flux, 91, 93
partition, 14, 19
PCG algorithm, 126
PEERS element, 72
Poincaré inequality, 18
Poisson equation, 30
post-smoothing, 132
pre-smoothing, 132
preconditioned conjugate gradient algorithm, 126
prolongation operator, 132
purple element, 101
quadratic elements, 14
quadratic interpolant, 107
quadrature formula, 14
quality function, 105
quasi-interpolation operator, 25
Raviart-Thomas space, 58, 66, 72
red element, 100
reference cube, 22
reference simplex, 22
refinement level, 105
refinement rule, 97
refinement vertex, 105
regular refinement, 97
reliable, 34
residual, 31
residual a posteriori error estimator, 38, 73
resolvable patch, 105
restriction operator, 132
Richardson iteration, 123
rigid body motions, 73
Rothe's method, 79
saturation assumption, 47
shape parameter, 19
shape-regularity, 19
skew symmetric part, 69
smoothing operator, 132
smoothing procedure, 106
Sobolev space, 12, 13, 17
source, 89
space-time nite elements, 80
SSOR-preconditioning, 128
stabilized bi-conjugate gradient algorithm, 129
stationary iterative solver, 122
Steger-Warming scheme, 94
stiffness matrix, 14
strain tensor, 69
streamline upwind Petrov-Galerkin discretization, 64
strengthened Cauchy-Schwarz inequality, 47
stress tensor, 69
Sturm-Liouville problem, 11
SUPG discretization, 64
support, 16
symmetric gradient, 69
system in divergence form, 89
tangential component, 73
Taylor's formula, 108
test function, 14
θ-scheme, 82
total energy, 69
total variation, 95
trace, 18
trace space, 18
transition condition, 81
trial function, 14
unit tensor, 15
V-cycle, 133
van Leer scheme, 94
variational formulation, 66
viscous ux, 89
viscous numerical ux, 93
W-cycle, 133
weak derivative, 12
weakly differentiable, 12