Numerical Solution Algorithms For Compressible Flows: Hrvoje Jasak

Numerical Solution Algorithms for Compressible Flows


Lecture Notes

Hrvoje Jasak
Faculty of Mechanical Engineering
and Naval Architecture
University of Zagreb, Croatia

Academic Year 2006-2007

Course prepared for the Aerospace Engineering Program


Tempus NUSIC Project JEP-18085-2003
© 2006 Hrvoje Jasak, Wikki Ltd. All rights reserved.

Contents

I Introduction to Modern CFD

1 Introduction

2 Introduction: CFD in Aeronautical Applications
  2.1 Modern Aircraft Design and CFD
  2.2 Scope of Computational Efforts
  2.3 Finite Volume or Finite Element?

3 CFD in Automotive Applications

II The Finite Volume Method

4 Mesh Handling
  4.1 Introduction
  4.2 Complex Geometry Requirements
  4.3 Mesh Structure and Organisation
  4.4 Manual Meshing: Airfoils
  4.5 Adaptive Mesh Refinement
  4.6 Dynamic Mesh Handling

5 Transport Equation in the Standard Form
  5.1 Introduction
  5.2 Scalar Transport Equation in the Standard Form
    5.2.1 Reynolds Transport Theorem
    5.2.2 Diffusive Transport
  5.3 Initial and Boundary Conditions
  5.4 Physical Bounds in Solution Variables
  5.5 Complex Equations: Introducing Non-Linearity
  5.6 Inter-Equation Coupling

6 Polyhedral Finite Volume Method
  6.1 Introduction
  6.2 Properties of a Discretisation Method
  6.3 Discretisation of the Scalar Transport Equation
  6.4 Face Addressing
  6.5 Operator Discretisation
    6.5.1 Temporal Derivative
    6.5.2 Second Derivative in Time
    6.5.3 Evaluation of the Gradient
    6.5.4 Convection Term
    6.5.5 Diffusion Term
    6.5.6 Source and Sink Terms
  6.6 Numerical Boundary Conditions
  6.7 Time-Marching Approach
  6.8 Equation Discretisation
  6.9 Convection Differencing Schemes
  6.10 Examples

7 Algebraic Linear System and Linear Solver Technology
  7.1 Structure and Formulation of the Linear System
  7.2 Matrix Storage Formats
  7.3 Linear Solver Technology
    7.3.1 Direct Solver on Sparse Matrices
    7.3.2 Simple Iterative Solvers
    7.3.3 Algebraic Multigrid
  7.4 Parallelisation and Vectorisation

8 Solution Methods for Coupled Equation Sets
  8.1 Examining the Coupling in Equation Sets
  8.2 Examples of Systems of Simultaneous Equations
  8.3 Solution Strategy for Coupled Sets
    8.3.1 Segregated Approach
    8.3.2 Fully Coupled Approach
  8.4 Matrix Structure for Coupled Algorithms
  8.5 Coupling in Model Equation Sets
  8.6 Special Coupling Algorithms

III Numerical Simulation of Fluid Flows

9 Governing Equations of Fluid Flow
  9.1 Compressible Navier-Stokes Equations
  9.2 Flow Classification based on Flow Speed
  9.3 Steady-State or Transient
  9.4 Incompressible Formulation
  9.5 Inviscid Formulation
  9.6 Potential Flow Formulation
  9.7 Turbulent Flow Approximations
    9.7.1 Direct Numerical Simulation
    9.7.2 Reynolds Averaging Approach
    9.7.3 Large Eddy Simulation

10 Pressure-Velocity Coupling
  10.1 Nature of Pressure-Velocity Coupling
  10.2 Density-Based Block Solver
  10.3 Pressure-Based Block Solver
    10.3.1 Gradient and Divergence Operator
    10.3.2 Block Solution Techniques for a Pressure-Based Solver
  10.4 Segregated Pressure-Based Solver
    10.4.1 Derivation of the Pressure Equation
    10.4.2 SIMPLE Algorithm and Related Methods
    10.4.3 PISO Algorithm
    10.4.4 Pressure Checkerboarding Problem
    10.4.5 Staggered and Collocated Variable Arrangement
    10.4.6 Pressure Boundary Conditions and Global Continuity

11 Compressible Pressure-Based Solver
  11.1 Handling Compressibility Effects in Pressure-Based Solvers
  11.2 Derivation of the Pressure Equation in Compressible Flows
  11.3 Pressure-Velocity-Energy Coupling
  11.4 Additional Coupled Equations
  11.5 Comparison of Pressure-Based and Density-Based Solvers

12 Turbulence Modelling for Aeronautical Applications
  12.1 Nature and Importance of Turbulence
  12.2 Direct Numerical Simulation of Turbulence
  12.3 Reynolds-Averaged Turbulence Models
    12.3.1 Eddy Viscosity Models
    12.3.2 Reynolds Transport Models
    12.3.3 Near-Wall Effects
    12.3.4 Transient RANS Simulations
  12.4 Large Eddy Simulation
  12.5 Choosing a Turbulence Model
    12.5.1 Turbulence Models in Airfoil Simulations
    12.5.2 Turbulence Models in Bluff-Body Aerodynamics
  12.6 Future of Turbulence Modelling in Industrial Applications

13 Large-Scale Computations
  13.1 Background
    13.1.1 Computer Power in Engineering Applications
  13.2 Classification of Computer Platforms
  13.3 Domain Decomposition Approach
    13.3.1 Components
    13.3.2 Parallel Algorithms

14 Fluid-Structure Interaction
  14.1 Scope of Simulations
  14.2 Coupling Approach
  14.3 Discretisation of FSI Systems
  14.4 Examples

Part I
Introduction to Modern CFD

Chapter 1
Introduction
Computational Fluid Dynamics
Definition of CFD, from Versteeg and Malalasekera: An Introduction
to Computational Fluid Dynamics
Computational Fluid Dynamics or CFD is the analysis of systems involving fluid flow, heat transfer and associated phenomena
such as chemical reactions by means of computer-based simulation.
CFD is also a subset of Computational Continuum Mechanics: fundamentally identical numerical simulation technology is used for many sets of
similar partial differential equations
Numerical stress analysis
Electromagnetics, including low- and high-frequency phenomena
Weather prediction and global oceanic/atmosphere circulation models
Large scale systems: galactic dynamics and star formation
Complex heat and mass transfer systems
Fluid-structure interaction and similar coupled systems
In all cases, equations are very similar: capturing conservation of mass,
momentum, energy and associated transport phenomena

Chapter 2
Introduction: CFD in Aeronautical Applications

2.1 Modern Aircraft Design and CFD

In this section we will explore the role and history of Computational Fluid Dynamics in the aerospace industry. Problems of aerospace design led the
technological push for a long period in the 20th century, dictating areas of research and,
together with nuclear research, expanding the use of numerical modelling.
Even today, aerospace and related technology (e.g. rocket design) is considered
sufficiently sensitive that world powers restrict the access of other governments to the latest designs; however, major parts of the technology may be considered more than
50 years old. Example: Chuck Yeager and the first supersonic flight, 1947.
At NASA, the new push towards manned space exploration and reach-out
towards a manned Mars mission involves mainly fluid dynamics challenges. The
mission requirements beyond Earth's orbit are more or less settled.
New work in aircraft design concentrates on optimising the existing technology,
with very few revolutionary new ideas. The main area of work still follows the traditional
functional decomposition of an airplane: wings for lift, rudder for
steering, body as useful volume. However, new simulation techniques allow us to
revisit functionally interesting alternatives: the flying wing configuration was
revived in the B-2 bomber some 50 years after the first attempt.
Introduction
Aerospace industry is the first and most prevalent in the use of numerical
techniques, including Computational Fluid Dynamics (CFD)
Early beginning of CFD in early 1960s
First successes came to prominence in the 1970s

The creation of the CFD-service industry started in the 1980s
The CFD industry expanded significantly in the 1990s
First fully computer-based design process for external aerodynamics
in a commercial aircraft: Airbus A380 in the 2000s
In most phases of the process, it was the aerospace industry driving
CFD development to answer its needs
Early adoption of numerical modelling in aerospace applications has brought
with it some interesting consequences: living with expensive computer hardware
and limited memory space led to simplified modelling techniques and carefully
tuned solution algorithms for the set of problems under consideration. Example:
algebraic and 1-equation turbulence models for airfoil calculations, e.g. Baldwin-Lomax (algebraic) or Spalart-Allmaras (1-equation).
Aerospace Industry and CFD
Use of CFD is no longer in question: definitely used throughout the design
process
Questions on fidelity and accuracy: can we get sufficiently reliable results?

Roll-out of CFD continues with more complex requirements, increase of
computer power and applicability of new methods (optimisation)
Some problems still raise the question of how much is gained for the extra cost: how large
is the difference in result quality between steady 2-D RANS and 3-D LES
for single airfoil design?

Challenges in aircraft design moving elsewhere: systems integration, control components (e.g. electro-hydraulics), packaging, computer support and
battlefield integration, advanced materials (single-crystal turbine blades)
etc.
The only truly revolutionary new technology on its way to (military) aircraft
is the scramjet engine: an air-breathing jet engine without a compressor or a turbine, where shock management in supersonic flow is used to create the necessary
compression.
State of the market
Boeing and Airbus totally dominate the commercial airliner market. A number of smaller players operate on the edges and in the business/regional jet market:
ATR, Gulfstream, Raytheon etc.
The military situation is a bit more diverse: BAE Systems, Lockheed Martin,
Sukhoi and a number of smaller manufacturers
I also count missile systems and aircraft engine manufacturers (General
Electric, Rolls-Royce, Pratt & Whitney)
NASA: in its latest budget statement, it claims that all of its critical problems
are associated with managing fluid flow: the space bit in the middle is not
nearly that difficult.
High-speed car aerodynamics: highly specialised, very rich, with very clear
requirements. Often bundled with aerospace: wings are turned upside-down, creating down-force instead of lift; concerns about drag are very different from those in the standard car industry. However, the aerodynamics problem is
much more complex than in aircraft. Example: proximity to the ground brings
important boundary layer effects, and the designer is trying to organise a much more complex
flow pattern
Aircraft design also includes other flow physics and auxiliary component
simulations
In order to understand the requirements of various uses of CFD, let us introduce a simple flow classification, based on how tightly the flow
field is managed.
My flow classification
Smooth flows: engineering machinery specifically organises the flow for
maximum stability and efficiency. Design conditions are clearly defined and
their variation is relatively small. Fractional changes in flow characteristics have profound performance effects (detached flow, small recirculations,
turbulence). Example: aircraft at cruising speed, turbo-machinery blades,
yacht design
Rough flows: flow regime is uncertain, the main object of design is not
flow management, but flow may still have critical effect on performance.
Example: electronics cooling, passenger compartment comfort in aircraft,
swimsuit of Olympic swimmers
In aerospace, we mostly deal with smooth flows

Scope of Simulations
Traditionally, experimental studies in aerospace are important, but full-scale models are more and more out of the question. This creates ideal
scope for numerical studies
Questions we look to answer with numerical simulation techniques range
from simple lift and drag studies to extremely complex physical problems:
stall characteristics, stability in manoeuvres, sensitivity and robust design,
optimisation, aero-acoustic noise. A number of new techniques stem from
use of CFD in aerospace and are still spreading through the rest of CFD
and industry
The baseline physics involved is relatively simple: compressible Newtonian flows of an ideal gas
... complications are easy to add: incompressible to hypersonic flow regime

Speed Range      Mach Number
low subsonic     < 0.3
high subsonic    0.3-0.6
transonic        0.6-1.1
supersonic       1-5
hypersonic       > 5
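
The table can be turned into a small look-up routine. The following is a minimal sketch and not part of the original notes: the function names, the sea-level temperature of 288.15 K and the air properties (gamma = 1.4, R = 287.05 J/(kg K)) are assumptions made for illustration. The tabulated transonic and supersonic ranges overlap between Ma = 1 and Ma = 1.1; the sketch arbitrarily assigns that band to the transonic regime.

```python
import math
from bisect import bisect_right

# Upper Mach-number bound of each regime, read from the table above.
# Assumption: the overlap Ma = 1.0-1.1 is treated as transonic.
_UPPER_BOUNDS = [0.3, 0.6, 1.1, 5.0]
_REGIMES = [
    "low subsonic",
    "high subsonic",
    "transonic",
    "supersonic",
    "hypersonic",
]


def speed_of_sound(temperature=288.15, gamma=1.4, gas_constant=287.05):
    """Speed of sound of an ideal gas, a = sqrt(gamma * R * T), in m/s."""
    return math.sqrt(gamma * gas_constant * temperature)


def mach_number(speed, temperature=288.15):
    """Mach number for a flight speed in m/s at the given temperature in K."""
    return speed / speed_of_sound(temperature)


def speed_range(mach):
    """Return the name of the speed range for a given Mach number."""
    if mach < 0.0:
        raise ValueError("Mach number must be non-negative")
    return _REGIMES[bisect_right(_UPPER_BOUNDS, mach)]


if __name__ == "__main__":
    for u in (50.0, 150.0, 250.0, 600.0, 2000.0):
        ma = mach_number(u)
        print(f"u = {u:7.1f} m/s  Ma = {ma:5.2f}  {speed_range(ma)}")
```

The regime boundaries are conventions rather than sharp physical limits, so the bounds list is kept separate from the logic and can be adjusted.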

[Figure: flight speed u (m/s, logarithmic axis from 0.01 to 1000) plotted against log(Re) (axis from 0 to 9), locating dust, insects, airships, gliders, model airplanes, general aviation and jet aircraft]
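
For orientation, the horizontal axis of the figure is the logarithm of the Reynolds number Re = u L / nu. The following is a minimal sketch, assuming a kinematic viscosity of air of about 1.5e-5 m^2/s; the speeds and characteristic lengths are purely illustrative and are not read from the figure.

```python
import math

# Assumed kinematic viscosity of air near room temperature, m^2/s.
NU_AIR = 1.5e-5


def reynolds_number(speed, length, nu=NU_AIR):
    """Reynolds number Re = u * L / nu for speed u (m/s) and length L (m)."""
    if nu <= 0.0:
        raise ValueError("kinematic viscosity must be positive")
    return speed * length / nu


if __name__ == "__main__":
    # Hypothetical (speed in m/s, characteristic length in m) pairs.
    examples = {
        "insect": (1.0, 0.01),
        "glider": (25.0, 1.0),
        "jet aircraft": (250.0, 5.0),
    }
    for name, (u, length) in examples.items():
        re = reynolds_number(u, length)
        print(f"{name:12s} u = {u:6.1f} m/s  log10(Re) = {math.log10(re):.1f}")
```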

Even in simplest flows, we do not have an easy job: turbulence complicates the situation immensely! The problem of turbulence modelling for
engineering applications is still unsolved; however, the physics is straightforward and well understood
Away from the baseline, physics can get considerably more complex: combustion, de-icing, multi-phase flow etc.
There is significant penetration of general purpose CFD tools into the
aerospace companies, but this is still considered a massive untapped market
from the commercial CFD point of view. It is unclear whether general purpose
tools will be sufficiently good to do the job.
Numerical simulation software
You don't do CFD without computers! Early efforts with pieces of paper
and rooms of people date from the UK Meteorological Office, running large
scale weather forecasting simulations
In the last 10 years, CFD performance and use have been coming together
Computer power is a cheap commodity. Massively parallel computers
are commonplace today and can be easily handled in software
In aerospace, understanding the physics is typically not a problem
Numerical methods cleaned up of systemic errors and gross failures
Sufficient experience in research departments
Validation against trusted experimental data
Understanding of simplifications and assumptions
In other industries, roll-out of numerical simulation tools limited by experience. Phases of integration of CFD in the design process:

1. Research and development departments: validation and assessment
of capabilities. Typically involves detailed study of old designs or
production pieces and comparison with available measured data.
2. Pre-design: experimenting with early prototypes and new ideas away
from the current development line
3. Design and pre-production: new product development.
4. Production: optimisation of existing components and incremental development of the running design
In aerospace design, it is no longer sufficient to make a plane fly
Economy, fuel consumption
Government regulations: noise and pollution levels. Example: noise
pollution caused by the supersonic shock wave on the ground killed
supersonic flight! Simulation objective: dissipate the shock between
the plane and the ground
Passenger comfort. Includes both oscillatory and non-oscillatory flows
around the aircraft, as well as cabin heating and air-conditioning. Example: Boeing 747-300 with a wiggly tail
Military application requirements: agile manoeuvring system and unstable aerodynamic configurations
In some cases, aerodynamic design does not dominate: instead, it
is necessary to make a bad aerodynamic shape fly. Example: F-117
stealth bomber. This is also a good example of what happens when the
numerical simulation software (in this case, simulation of the electromagnetic signature) cannot handle the traditional engineering shape
of an aircraft. Note that the F-117 is an old aircraft: scheduled to be
retired from the US Air Force by the end of 2008.
The process of performing a CFD simulation has evolved through the years, with
maturing numerical simulation tools and transition of the design work from the
drafting board to a Computer-Aided Design (CAD) system. This is still ongoing:
over the last few years, providers of CAD solutions have started talking about
Product Lifecycle Management (PLM) solutions, moving the complete cycle under
a single IT-based system. Even in a relatively modern field like CFD, legacy
practices still act as a stumbling block. Example: traditionally, a meshing tool
and CFD solver are two separate software components; add to this the problem of
transferring geometrical data from a CAD package into a mesher and attempting
to run an optimisation simulation in this manner. From the moves in the market,
it is expected that software convergence may happen over the next 5-10 years (one
generation of CFD software tools). Note that the similar problem in structural
analysis has already been overcome: compared to fluid flow, the physics involved in
structural simulation is significantly simpler.
Phases of a CFD Simulation
Description of the geometry. Airfoil curve data, CAD surface or anywhere in between. External aerodynamics = geometry of interest located
in a large domain (atmosphere)
Extraction of the fluid domain. In cases where a CAD description is
given, a considerable amount of clean-up may be required. This is not easily
done, no reliable automatic tools.
Mesh generation. Based on the given fluid domain, a computational mesh
is created. Tools range from manual (points and cells), through semi-automatic
(block splitting, template geometries, surface wrapping, i.e. adaptation of a
mesh template to a given surface) to fully automatic (tetrahedral and hexahedral/polyhedral automatic mesh generators). Mesh generation is the
most demanding and time-consuming process today. There is a significant push towards
automatic tools. In spite of automatic tools, there is room for engineering judgement, as a quality solution can be obtained more cheaply by constructing a quality mesh. A good mesh takes into account what the solution
should look like.
Physics setup. Select the governing equations and specify the material
properties and boundary conditions involved. Second level of engineering
judgement: how much does the knowledge of detailed material behaviour
improve the final result. Example: specific heat capacity of water as a
function of temperature; thermal expansion coefficient of water as a function
of temperature T .
Boundary condition setup. This includes both the location and type of
boundary conditions used. The role of boundary conditions is to include
the influence of the environment on the solution. In big box cases, this is
easier than in other engineering simulations
Solver setup and simulation. Choice of discretisation parameters and
numerical solution procedure: differencing schemes, relaxation parameters,
multigrid, convergence tolerance etc.
Data post-processing and analysis of results. Not always straightforward.
Integral studies. In simple lift and drag studies, we could be looking
at a small number of integral properties.

Flow organisation, where global characteristics of the flow are controlled to achieve stability or a desired pattern
Management of detailed flow structure. Example: remove the vortex
depositing dirt on a part of the windshield
Sensitivity and robust design studies. Usually cannot be seen in results
without experience or require specialised simulations.
Advanced visualisation tools are a part of the game: they provide a way of
managing the wealth of data.
20 years ago, leading CFD tools were developed at Universities, centred
around strong research groups and attracting significant funding from the industry. As a response to deployment problems, large aerospace companies developed
their own research teams and in-house expertise.
Today, CFD software development at Universities is winding down significantly: the components a good research platform requires are substantial and
very few groups can afford to finance the effort (there is little research value and
few publishable results in writing a new CFD code based on known technology). The majority of
groups rely on commercial CFD software to do their research.
In-house software development in large companies suffers a similar fate: the
work that can be done by commercial software is migrated to commercial CFD
codes and sometimes even outsourced. Apart from financial pressures, this is
related to the software development, maintenance and validation work required to
keep in-house codes working and up-to-date with technology.
The above pushes large-scale software work to commercial organisations, which
have grown from small companies in the late 1980s and early 1990s to large organisations. The current trend towards large packages that can satisfy all simulation needs
of all customers under the same hood acts as a counter-weight for this state
of affairs. Specialised software for specialised needs and technology components
giving competitive advantage will be kept separate.
CFD Software Development
Small experimental codes: playing around with physics and numerical methods
In-house general CFD solver development
In-house custom-written software for specific purposes: e.g. wing-nacelle engine system, turbine blade optimisation, simulation of unstable manoeuvres
in military jets, calculation of directional derivatives and solution stability,
matching computations with measured data sets etc.
Complex and tuned panel method codes

Simplified physics, e.g. potential flow and boundary layer codes


Hooked-up mesh generation and parametrisation
Special purpose codes: sensitivity, aero-acoustics etc.
In-house development kept secret: competitive advantage. Example:
Pratt & Whitney material properties databases
Government-sponsored (National Labs) developments
General-purpose CFD packages: from a fridge to a stealth plane
University research codes; public-domain software
Write-your-own CFD solver
Software getting increasingly complex: you need a PhD to join the game
Market situation: Aerospace CFD
Aerospace atypical for the general CFD picture: early adopter with lots of
experience in-house and specific tools targeted to applications
In-house codes extremely important and integrated into the design process.
However, currently approaching vintage status
Example: Boeing is dominated by multi-block structured solvers, which currently hinders development. Airbus came in later and developed unstructured solvers in-house, gaining a massive competitive advantage
There are problems with in-house codes: development effort more complex,
people with knowledge move on, process of acceptance and validation very
long
Simulation software needs to become more user-friendly and closer to the
CAE line. This implies extra work apart from raw solver capability which
is not easily handled in-house.
Additional CAD-related requirements and cost of keeping CFD development teams in-house opens the room for commercial general-purpose CFD
packages
There also exists a number of consortium or government-sponsored codes.
Example: NASA (USA), DLR (Germany)

Remaining Challenges
Mesh generation, especially parallel mesh generation
Handling massively parallel simulations
Integration into the CAD-based design process
Fluid-structure interaction and aeroacoustics
On the cusp between two generations of general-purpose CFD solvers: procedural programming, Fortran and C against object orientation
The push for bigger, faster, more accurate simulations in external aerodynamics not so strong in the aerospace market: meshes are already sufficiently large. Also, extensive experience of the required size of the model,
mesh resolution and locally fine meshes from the days when computer power
was expensive
In aircraft engine design, the opposite is the case. ASC Project (Advanced
Simulation and Computing), US Dept of Energy, Los Alamos, Livermore,
Sandia, Stanford University and other partners
http://www.stanford.edu/group/cits/research/index.html
http://www.llnl.gov/PAO/news/asc/
Tip-to-toe simulation of a turbo-fan aircraft engine, including fan,
turbo compressor, combustion chambers and turbine. Preferred modelling technique: Large Eddy Simulation
Integrated Multicode Simulation Framework
As a part of the project, the world's biggest parallel computers have been
built:
ASC Red, Sandia, 1996
ASC Blue, Livermore, Los Alamos, 1998
ASC White, Livermore, 2001
ASC Q, Los Alamos, 2003
ASC Red Storm, Sandia, 2004, 40 TeraOps
ASC Purple, Livermore, 2005, 100 TeraOps
ASC Blue Gene, Livermore, Los Alamos: 130 000 CPUs and 360
TeraFlops performance.
For comparison, ASC Linux, a 960-node Linux box with 1920 processors
and 3.8 TB, produces peak performance of 9.2 TeraFlops
The idea of doing a complete engine is somewhat abandoned: not
enough power for LES on compressor or turbine. Using combined
RANS/LES simulation approach with coupling on interfaces.

2.2 Scope of Computational Efforts

The level and fidelity of numerical simulation is tailored to the design process:
it will cover everything from preliminary design tools running in 1-2 seconds to
full transient CFD studies for complex physics simulations. The use of analytical
and pedestrian methods in early design phases cannot be ignored: laying out
the initial set-up of a jet engine compressor is done using precisely the techniques
taught in University turbomachinery courses. Once the basic design is laid down,
more detailed tools will be used to satisfy design requirements and optimise the
performance.
Aerodynamic Drag
Drag varies with the velocity squared: major influence at aerospace speeds.
Small improvements in drag lead to considerable advances:
A 15% drag reduction on the Airbus A340-300B would yield a
12% fuel saving, other parameters being constant.
(Mertens, 1998)
Chasing drag improvements in highly optimised shapes is only of marginal
interest

Shape              Cd
Sphere             0.47
Half sphere        0.42
Cone               0.50
Cube               1.05
Angled cube        0.80
Long cylinder      0.82
Short cylinder     1.15
Streamlined body   0.04
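
The quadratic dependence on velocity follows from the standard drag relation F = 1/2 rho u^2 Cd A. The following is a minimal sketch using the drag coefficients listed above; the air density, flow speed and frontal area are illustrative assumptions and not data from these notes.

```python
# Drag coefficients as listed in the notes above.
DRAG_COEFFICIENTS = {
    "sphere": 0.47,
    "half sphere": 0.42,
    "cone": 0.50,
    "cube": 1.05,
    "angled cube": 0.80,
    "long cylinder": 0.82,
    "short cylinder": 1.15,
    "streamlined body": 0.04,
}


def drag_force(cd, density, speed, area):
    """Drag force in N from F = 0.5 * rho * u^2 * Cd * A (SI units)."""
    return 0.5 * density * speed**2 * cd * area


if __name__ == "__main__":
    rho = 1.225   # kg/m^3, assumed sea-level air density
    area = 1.0    # m^2, assumed frontal area, identical for all shapes
    speed = 50.0  # m/s, assumed flow speed
    for shape, cd in sorted(DRAG_COEFFICIENTS.items(), key=lambda kv: kv[1]):
        force = drag_force(cd, rho, speed, area)
        print(f"{shape:17s} Cd = {cd:4.2f}  F = {force:7.1f} N")
```

At a fixed speed and frontal area the force scales linearly with Cd, which is why the streamlined body and the short cylinder differ by more than an order of magnitude; doubling the speed multiplies every entry by four.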

Simulations include functional subset cases, e.g. airfoils, wings, tails configuration, nacelle-to-wing assembly, but also full aircraft models
Subjects of interest include shock-boundary layer interaction: the effect of
shocks on standard turbulence model predictions is still in question.

High-Lift Aerodynamics
High-lift wing configuration very important: lower take-off and landing
speed, higher pay-load etc.
Study of multi-element airfoil configuration: high flow curvature, flow separation, wakes from upstream elements, laminar-to-turbulent boundary layer
transition etc.
High-lift devices added to wings include flaps and slats (common), but also
leading edge extensions, vortex generators and blown flaps

The subject of control is boundary layer management and flow stability
(avoiding stall)
Looking at Formula 1 aerodynamics, many similar devices can be found


Unsteady Aerodynamics
In most cases, aerodynamic flows are considered steady-state: flight at cruising speed, steady-state lift-off configuration etc.
Unsteady effects are sometimes critical, both in oscillatory and non-oscillatory
regime
Oscillatory instability: dynamic stall on helicopter rotor blades in forward
flight; vortex shedding behind bluff bodies
Non-oscillatory flows: flow separation at a high angle of attack. Turbulence effects are critical for accurate modelling
Unsteady transonic effects, moving or oscillating shock studies: significant
effect on the performance, especially in cases of high-speed helicopter rotor
blades
Unsteady aerodynamics is closely related to aero-elasticity. Sources of unsteadiness are mechanically generated: flutter
Rotary Aerodynamics
Simulation of helicopter rotor blades usually considered a specialised area
of research: special assumptions and modelling regime
Study of dynamic stall, blade-vortex interaction, blade-to-blade interaction,
blade tip effects and transonic flow effects
Similar effects, but at lower speeds can be found in other devices, e.g. wind
turbines, propeller design, turbo-machinery
High-Speed Aerodynamics
At high speed, the equation of state and ideal gas assumptions break down.
In other aspects, the flow is becoming easier to handle. Generally refers to
speed of Ma = 5 and above
For high speed, and due to the real gas effects we speak of aerothermodynamics rather than aerodynamics.
Regimes of hypersonic flow: separation is done based on the choice of equation of state

Perfect gas. Flow regime still Mach number independent, but there
are problems with adiabatic wall conditions
Two-temperature ideal gas. Rotational and vibrational motion
of the molecules needs to be separated and leads to two-temperature
models. Used in supersonic nozzle design
Dissociated gas. Multi-molecular gases begin to dissociate at the
bow shock of the body.
Ionised gas. The ionised electron population of the stagnated flow
becomes significant, and the electrons must be modelled separately:
electron temperature. Effect important at speeds of 10-12 km/s
Rudder and Steering Diagrams
In automated steering/targeting systems, the aircraft/missile is controlled
by a computer: given target or flight path
Automatic control systems rely on the diagrams showing the response on
steering commands: in practice, large look-up tables or fitted functional
data. Consider a case of a rotating missile with 2-4 control surfaces.
The steering data created by computation: combinations of control configurations with lift, drag, pitch, yaw orientation and force response. This
typically involves 5-10 thousand simulations, done automatically on massively parallel computers. Automatic mesh generation, massive parallelism
and controlled accuracy are essential.
http://people.nas.nasa.gov/~aftosmis/home.html
Internal Flows and Auxiliary Devices
Internal flows: incompressible, low speed, aerodynamic forces typically of
no consequence
Example: passenger compartment comfort, heating, cooling and ventilation. Closer to standard CFD and usually handled by general-purpose
CFD packages
Stability and Robust Design
Stability analysis takes into account the effects of uncertainty (noise) in the
input parameters. Example: how much will the lift coefficient on the airfoil
change with a 5% change in the angle of attack?
Away from stall point: lift is stable to small change in conditions


At stall: catastrophic change


What about a NACA 0012 (symmetric airfoil profile) at zero angle of
attack?
Stability of the solution to small perturbations can be examined in different
ways:
Lots of simulations: detailed analysis, lots of work
Special numerical techniques: forward derivatives, adjoint equations
(continuous and discrete), Proper Orthogonal Decomposition methods
All of the above are extensively used in aerospace simulations. However,
looking at results is not easy: need to understand the meaning
Robust design studies
Under normal circumstances, looking to maximise the performance of
a device in absolute terms. Example: maximum lift in multi-element
airfoils
In reality, requirements are different: consider aircraft landing in a
storm, where angle of attack is not constant. Thus, the optimisation
process should account for uncertainty of the input parameters and
provide stable performance across the range.
Such effects typically lead to different optimisation results: envelope
of performance instead of maximum lift
Matching of computations with experimental data in combined experimental and numerical studies. Example: unknown flow pattern at the entry
of the jet engine combustor, but measured pressure and temperature data
available at the outlet.
Fluid-Structure Interaction
The first step in modelling is to choose the domain of interest. In simple
situations, this will cover only a single material or a single governing law.
Unfortunately, this is not always the case
Example: wing flutter
Aerodynamic forces from fluid flow determine the load on the wing.
Wing itself is an elastic structure and deforms under load
Deflection of the elastic wing changes the flow geometry: a new solution produces different surface load
Interaction between the two may be stable or unstable: flutter


Fluid-structure simulations involve both the fluid and solid domain. Care
must be given to the coupling methods and stability of the algorithm

2.3 Finite Volume or Finite Element?

Two sets of numerical techniques handling computational continuum mechanics
dominate the field: the Finite Volume Method (FVM) and the Finite Element
Method (FEM). One can clearly show that both are based on the same principles and are
closely mathematically related. Various variants and generalisations can also be
devised, but so far their impact has been limited. Some deserve a mention:
Discontinuous Galerkin discretisation provides a common framework for the
FVM and FEM. It combines the conservative flux formulation which is a
basis of the FVM with the elemental shape function and a weak formulation of the FEM. One interesting use would be a formal higher-order
extension beyond second-order integrals. So far, the most important use is
the generalisation of mathematical machinery underpinning both methods
Lattice Gas and Lattice Boltzmann methods claim to simulate the flow
equations from basic principles of molecular dynamics instead of using the
continuum equations. Clearly, averaging over a sufficient number of lattice
operations will yield the original PDEs and produce the required solution.
Attractions of this method follow from simplifications of lattice operations
to very primitive accuracy (e.g. 3 velocity levels) and simplifications in
complex geometry handling
Numerical Techniques in Aerospace Simulations: Spatial Techniques
Finite Difference Method (FDM): really appropriate only for structured
meshes; no conservation properties. Not used commercially. Important use
of FDM is in aero-acoustic simulations, where high-order discretisation is
essential (e.g. 6th order in space and 10th order in time). Problems with
high-order boundary conditions.
Finite Volume Method: dominates the fluid simulation arena
Finite Element Method. No particular reason why it cannot be used; however, the bulk of the numerical method development has targeted the FVM. As a
result, some techniques and solution methodology are not suitable for fluid flow.
I do not know any FEM fluid flow aerospace solvers, but FEM dominates
the structural analysis arena
Discontinuous Galerkin: a formal unification of the FEM and FVM ideas.
Strongly conservative and consistent, but extensions are still impractical
(control of matrix properties, solution techniques etc.). Consider it work-in-progress


Monte Carlo Methods: extensively used in low-density high-speed aerodynamics (Space Shuttle re-entry). Techniques are specialised for high efficiency
Spectral techniques: special purposes only. Extremely efficient and accurate
for box in a box and cyclic matching simulations, e.g. DNS
Handling Temporal Variation
Steady state: no temporal discretisation required
Time domain: bulk of transient flow simulations
Frequency domain: special purposes. Example: in turbo-machinery simulations, it is possible to extract the dominant frequencies. Instead of solving
a time-dependent problem, a series of steady simulations is set up, each for
a selected frequency (effects of the temporal derivative now convert into
a source/sink term). The time-dependent behaviour is recovered from the
combination of frequency solutions.
Simplified Flow Solvers in Industrial Use
It is not always necessary to run a full Navier-Stokes solver to obtain usable
results. Also, the simulation time is sometimes critical: approximate result
now.
Panel method. Combination of sources, sinks, doublets and vortex elements used to assemble a zero streamline form which represents the
body. Extremely fast and capable of producing indicative solutions with
experience.
http://www.engapplets.vt.edu/fluids/vpm/
Potential Flow Solvers. Incompressible formulation considered too basic. However, the compressible potential formulation, or even a transient
compressible potential can be very useful. The main effect missing in the
simplified form is the viscosity effect in the boundary layer: effective change
of shape for the potential region. Potential flow solver can be used to accelerate the solution to steady-state for more complex solver: initialisation
of the solution
Potential Flow with Boundary Layer Correction. Here, a combination of the compressible potential and boundary layer correction takes into
account the near-wall effect: the geometry is corrected for displacement
thickness in the boundary layer


Euler Flow Solver. Neglects the viscous effects but the compressibility
physics can be handled in full.

Chapter 3
CFD in Automotive Applications
CFD Methodology
Numerous automotive components involve fluid flow and require optimisation. This opens a wide area of potential CFD use in the automotive
industry
CFD approaches the problem of fluid flow from fundamental equations:
no problem-specific or industry-specific simplification
A critical step involves complex geometry handling: it is essential to
capture real geometrical features of the engineering component under consideration
Traditional applications involve incompressible turbulent flow of Newtonian
fluids
While most people think of automotive CFD in terms of external aerodynamics simulations, reality of industrial CFD use is significantly different

Automotive CFD Today


In numbers of users in automotive companies, CFD today is second only to
CAD packages
In some areas, CFD replaces experiments


Engine coolant jackets


Under-hood thermal management
Passenger compartment comfort

In comparison with CFD, experimental studies are expensive, carry limited
information and it is difficult to achieve sufficient turn-over
The biggest obstacle is validation: can CFD results be trusted?

In other areas, CFD is insufficiently accurate for complete design studies


Required accuracy is beyond the current state of physical modelling
(especially turbulence modelling)
Simulation cost is prohibitive or turn-around is too slow
Flow physics is too complex: incomplete modelling or insufficient understanding of detailed physical processes
In some cases, combined 1-D/3-D studies capture the physics without
resorting to complete 3-D study
Examples:
Prediction of the lift and drag coefficient on a car body
In-cylinder simulations in an internal combustion engine
Complete internal combustion engine system: air intake, turbo-charger,
engine ports and valves, in-cylinder flow, exhaust and gas after-treatment
CFD can still contribute: parametric study (trends), reduced experimental
work etc.
Numerical modelling is particularly useful in understanding the flow or
looking for qualitative improvements: e.g. optimisation of vehicle soiling
pattern on windows
Examples of External Aerodynamics Simulations


CFD is used across the industry, at various levels of sophistication

Impact of simulations and reliance on numerical methods is greatest in areas
that were not studied in detail beforehand

Considerable use in cases where it is difficult to quantify the results in
simple terms like the lift and drag coefficient

Flow organisation, stability and optimisation


Detailed look at the flow field, especially in complex geometry
Optimisation of secondary effects: fuel-air mixture preparation


CFD Capabilities in 1980s: Early Adoption in Aerospace Industry


Historically, early efforts in CFD involve simplified equations and simulations relevant for aerospace industry
Experience in achieving best results with limited computational resources:
attention given to solution acceleration techniques
Application-specific physical models
Linearised potential equations, Hess and Smith, Douglas Aircraft 1966
3-D panel codes developed by Boeing, Lockheed, Douglas and others
in 1968
Specific turbulence models for aerospace flows, e.g. Baldwin-Lomax
Coupled boundary layer-potential flow solver, Euler flow solver
Capabilities beyond steady-state compressible flow were very limited


Early Automotive CFD Simulations


First efforts aimed at simplified external aerodynamics (1985-1988)
. . . but airfoil assumptions are not necessarily applicable
Joint numerical and experimental studies: validation of numerical techniques and simulation tools, qualitative results, analysis of flow patterns
and similar

It is quickly recognised that the needs of automotive industry and (potential) capabilities of CFD solvers are well beyond contemporary experimental
work
Focus of early numerical work is on performance-critical components: internal combustion engines and external aerodynamics
Geometry and flow conditions are simplified to help with simulation set-up
Example: Intake Valve and Manifold
2-D steady-state incompressible turbulent fluid flow
Axi-symmetric geometry with a straight intake manifold and fixed valve lift


Simulation by Peric, Imperial College London 1985

Automotive CFD in the 1990s: Expanding Computer Power and Validated Models


Numerical modelling is moving towards product design
Improvements in computer performance: reduced hardware cost, Moore's
law
Improved physical modelling and numerics: fundamental problems
with flow, turbulence and discretisation are resolved
Sufficient validation and experience accumulated over 10 years


Notable improvement in geometrical handling: realistic 3-D geometry


Graphical post-processing tools and animations: easier solution analysis
Mesh generation for complex geometry is a bottle-neck: need better tools

Expansion of Automotive CFD


Increase in computer performance drives the expansion of CFD into new
areas by reducing simulation turn-over time
Massively parallel computers provide the equivalent of the largest supercomputers
at prices affordable in industrial environment (1000s of CPUs)


Physical Modelling
New physical models quickly find their use, e.g. free surface flows
Looking at more complex systems in transient mode and in 3-D: simulation
of a multi-cylinder engine, with dynamic effects in the intake and exhaust
system
Computing power brings in new areas of simulation and physical modelling
paradigms. Example: Large Eddy Simulation (LES) of turbulent flows
Integration into a CAE Environment
Computer-Aided Design software is the basis of automotive industry
Historically, mesh generation and CFD software are developed separately
and outside of CAD environment, but the work flow is CAD based!
Current trend looks to seamlessly include CFD capabilities in CAD
Summary: Automotive CFD Today
CFD is successfully used across automotive product development
Initial landing target of external aerodynamics and in-cylinder engine
simulation still not reached (!): sufficient accuracy difficult to achieve
Lessons Learned
The success of CFD in automotive simulation is based on providing industry needs rather than choosing problems we may simulate: find a critical
broken process and offer a solution
Numerical simulation tools will be adopted only when they fit the product
development process: robust, accurate and validated solver, rapid turn-over
Experimental and numerical work complement each other even if sufficient
accuracy for predictive simulations cannot be achieved
Validation of simulation results: understanding experimental set-up
Parametric studies: speeding up experimental turn-over
True impact of simulation tools is beyond the obvious uses: industry will
drive the research effort to answer its needs

Part II
The Finite Volume Method

Chapter 4
Mesh Handling
4.1 Introduction

When presenting a continuum mechanics problem for computer simulation, one
needs to establish not only the mathematical model but also the computational
domain. While the choice of physics is relatively general, numerical description
of the domain of interest is considerably more complex. Looking at the area of
external aerodynamics in aerospace, compressible Navier-Stokes equations for an
ideal gas will typically suffice, while the wealth of geometrical shapes defies even
basic classification.
In most cases, shape of the spatial domain is of primary interest: capturing it
in all relevant detail is essential. In transient simulations, handling the temporal
axis is considerably simpler. Due to uni-directional nature of interaction, it is
sufficient to split the time interval into a finite number of time-steps and march
the solution forward in time.
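The time-marching idea can be illustrated with a minimal sketch. The model problem (a scalar decay equation), the explicit Euler update and the function name march_in_time are assumptions chosen for illustration only; they are not taken from the notes.

```python
import math


def march_in_time(phi0, decay_rate, t_end, n_steps):
    """Explicit (forward Euler) time marching for d(phi)/dt = -decay_rate * phi.

    Illustrative model problem only: the interval [0, t_end] is split into
    n_steps equal time-steps and the solution is advanced one step at a time.
    """
    dt = t_end / n_steps
    phi = phi0
    history = [phi]
    for _ in range(n_steps):
        # The new value depends only on the old one: information travels
        # forward in time, never backwards.
        phi = phi + dt * (-decay_rate * phi)
        history.append(phi)
    return history


coarse = march_in_time(1.0, 2.0, 1.0, 10)
fine = march_in_time(1.0, 2.0, 1.0, 1000)
exact = math.exp(-2.0)  # analytical solution at t = 1 for comparison
```

Comparing coarse[-1] and fine[-1] with exact shows the error shrinking as the number of time-steps grows.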
It quickly becomes clear that fidelity of geometrical description of an engineering object plays an important role. For example, in a heat exchanger, it is
necessary to capture active surface area with some precision in order to correctly
calculate the total heat transfer. At the same time, it is a question of engineering
judgement to decide which geometrical features are important for the result and
which may be omitted.
A computational mesh splits the space into a finite number of elements (cells,
control volumes or similar), bounded by faces and supported by points. Computational locations are located in the cells or on the points in a regular manner.
The idea of mesh support is to discretise the governing equations over each cell
and handle cell-to-cell interaction. Some mesh validity criteria follow directly
from the above:
Computational cells should not overlap;
Computational cells should completely fill the domain of interest.
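A simple check related to these two criteria can be sketched for a 2-D mesh of polygonal cells. The data layout (a point list plus cells given as counter-clockwise point labels) and the function names are hypothetical. Note that positive cell areas summing to the domain area is a necessary condition only: it does not by itself prove that cells do not overlap.

```python
def polygon_area(vertices):
    """Signed area of a 2-D polygon (shoelace formula).

    Positive when the vertices are ordered counter-clockwise.
    """
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return 0.5 * area


def check_mesh_fills_domain(points, cells, domain_area, tol=1e-12):
    """Return (all cell areas positive, cell areas sum to the domain area)."""
    areas = [polygon_area([points[p] for p in cell]) for cell in cells]
    all_positive = all(a > 0.0 for a in areas)
    fills_domain = abs(sum(areas) - domain_area) < tol
    return all_positive, fills_domain


# Unit square split into four triangles around its centre point
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
cells = [[0, 1, 4], [1, 2, 4], [2, 3, 4], [3, 0, 4]]
good = check_mesh_fills_domain(points, cells, 1.0)

# Dropping one cell leaves a hole: the second check fails
holed = check_mesh_fills_domain(points, cells[:3], 1.0)
```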


Every discretisation method brings its own mesh validity criteria and measures
of mesh quality. In general terms, a mesh that is visually pleasing is also likely to
support a quality solution. Our second concern is the interaction between the
mesh resolution and (known or implied) solution characteristics. Features such
as shocks, boundary layers and mixing planes require higher resolution than a
far field section of the domain. Construction of a quality mesh is usually a
question of experience and use of quality mesh generation tools. An ideal mesh
would be the one uniformly distributing the discretisation error in the solution
volume and producing a user-independent (or, more precisely, user-experience-independent) result. The quest for fast and robust automatic mesh generators
iteratively sensitised to the solution is still ongoing.

4.2 Complex Geometry Requirements

Computational Mesh
A computational mesh represents a description of spatial domain in the
simulation: external shape of the domain and highlighted regions of interest,
with increased mesh resolution
Mesh-less methods are possible (though not popular): the issue of describing the domain of interest to the computer still remains
Mesh generation is the current bottle-neck in CFD simulations. Fully automatic mesh generators are getting better and are routinely used. At the
same time, requirements on rapid and high-quality meshing and massively
increased mesh size are becoming a problem
Routinely used mesh size today
Small mesh for model experimentation and quick games: 100 to 50k
cells. Fast turn-around and qualitative results. Note that a number of
flow organisation problems may be solved on this mesh resolution
2-D geometry: 10k to 1m cells. Low-Re turbulent simulations may
require more, due to near-wall mesh resolution requirements
3-D geometry: 50k to several million cells
Complex geometry, 3D, industrial size, 100k to 10-50 million cells.
Varies considerably depending on geometry and physics, steady/transient
flow etc.
Large Eddy Simulation (LES) 3-D, transient, 1-10 million cells. LES
requires very long transient runs and averaging (20-50k time steps),
which keeps the mesh resolution down


Full car aerodynamics, Formula 1: 20-200 million cells for routine use.
Large simulations under discussion: 1 billion cells!
On very large meshes, problems with the current generation of CFD software become a limiting factor: missing parallel mesh generation, data file
read/write, post-processing of results, hardware and software prices
Handling Complex Geometry
In aerospace applications, geometrical information is usually available before the simulation. In general, this is not the case: for simple applications,
a mesh may be the only available description of the geometry
Domain description is much easier in 2-D: real complications can only be
seen in 3-D meshes
Geometrical data formats
2-D boundary shape: airfoils. Usually a detailed map of xy locations
on the surface. Sometimes defined as curve data
http://www.ae.uiuc.edu/m-selig/ads/coord_database.html
Stereo Lithographic Surface (STL): a surface is represented by a set
of triangular facets. Resolution can be automatically adjusted to capture the surface curvature or control points. Creation of STL usually
available from CAD packages
Native CAD description: Initial Graphics Exchange Specification (IGES),
solid model etc. In most cases, the surface is represented by Non-Uniform Rational B-Splines or approximated by quadric surfaces. Typically, both are too expensive for the manipulations required in mesh
generation and either avoided or simplified
Geometry clean-up. Very rarely is the CAD description built specifically
for CFD: in most cases, CAD surfaces (wing, body, nacelle) are assembled
from various sources, with varying quality and imperfect matching. Surface
clean-up is time-consuming and not trivial. In some cases, the mesh generator may be less sensitive to errors in surface description, which simplifies
the clean-up
Feature removal. CAD description or STL surface may contain a level
of detail too fine to be captured by the desired mesh size, causing trouble
with 3-D mesh generation. Feature removal creates an approximation of
the original geometry with the desired level of detail


Surface Mesh Generation


In cases where the surface description is not discrete, a surface mesh may
be created first
STL surface is already a mesh. It may be necessary to additionally split
the surface for easier imposition of boundary conditions: inlet, outlet, symmetry plane etc.
Surface mesh is usually triangular or quadrilateral. There are potential
issues with capturing surface curvature: surface mesh will be considered
sufficiently fine
Volume Mesh Generation
The main role of the volume mesh is to capture the 3-D geometry
The cells should not overlap and should completely fill the computational
domain. Additionally, some convexness criteria (FVM) or a library of predefined cell shapes (FEM) is included.
Computational mesh defines the location and distribution of solution points
(vertices, cells etc.). Thus, filling the domain with the mesh is not sufficient
- ideally some aspects of the solution should be taken into account.
A-priori knowledge of the solution is useful in mesh generation. Trying to
locate the regions of high mesh resolution (fine mesh) to capture critical
parts of the solution: shocks, boundary layers and similar
Quality of the mesh is critical for a good solution and is not measured only
in mesh resolution
Mesh quality measures depend on the discretisation method
Cell aspect ratio
Non-orthogonality
Skewness
Cell distortion from ideal shape
. . . etc.
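Two of the measures listed above can be sketched as follows. The definitions used here are assumptions: non-orthogonality is taken as the angle between the face normal and the vector connecting the two cell centres (one common finite volume definition), and aspect ratio is the longest-to-shortest edge ratio of a box-shaped cell. Other discretisation methods may define both differently.

```python
import math


def non_orthogonality_deg(centre_p, centre_n, face_normal):
    """Angle in degrees between the face normal and the vector joining
    the centres of the two cells sharing the face (assumed definition)."""
    d = [b - a for a, b in zip(centre_p, centre_n)]
    dot = sum(di * ni for di, ni in zip(d, face_normal))
    mag_d = math.sqrt(sum(di * di for di in d))
    mag_n = math.sqrt(sum(ni * ni for ni in face_normal))
    # Clamp against round-off before taking the arc cosine
    cos_theta = max(-1.0, min(1.0, dot / (mag_d * mag_n)))
    return math.degrees(math.acos(cos_theta))


def box_aspect_ratio(dx, dy, dz):
    """Ratio of the longest to the shortest edge of a box-shaped cell."""
    edges = (dx, dy, dz)
    return max(edges) / min(edges)


# Orthogonal arrangement: centre-to-centre vector aligned with the normal
aligned = non_orthogonality_deg((0.0, 0.0), (1.0, 0.0), (1.0, 0.0))

# Sheared arrangement: neighbour centre displaced along the face
sheared = non_orthogonality_deg((0.0, 0.0), (1.0, 1.0), (1.0, 0.0))

# Thin near-wall cell, typical of a boundary layer mesh
thin_cell = box_aspect_ratio(1.0, 0.01, 1.0)
```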

4.3 Mesh Structure and Organisation

Influence of Mesh Structure


Some numerical solution techniques require specific mesh types. Example:
Cartesian meshes for high-order finite difference method
Supported mesh structure may severely limit the use of a chosen discretisation method
With mesh generation as a bottle-neck, it makes sense to generalise the
solver to be extremely flexible on the meshing side, simplifying the most
difficult part of the simulation process
Cartesian Mesh
x-y-z mesh aligned with the coordinate system. May be defined by 2
points and resolution in 3 directions
Mesh addressing (cells to neighbour cells, cells to points, points to neighbour
points etc.) can be calculated on the fly given the mesh dimension
Simple to define, efficient and can be used with any type of discretisation
Severe limitation on the geometry that can be handled: a box within a box
Extensions may include blocked-out cells or staircase boundaries
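The statement that Cartesian mesh addressing can be calculated on the fly can be sketched as below. The cell numbering (x index running fastest) and all function names are assumptions made for this example; any consistent ordering would serve equally well.

```python
def cell_index(i, j, k, nx, ny, nz):
    """Linear cell label from (i, j, k), with i running fastest (assumed ordering)."""
    return i + nx * (j + ny * k)


def cell_ijk(c, nx, ny, nz):
    """Inverse of cell_index: recover (i, j, k) from the linear label."""
    return c % nx, (c // nx) % ny, c // (nx * ny)


def cell_neighbours(c, nx, ny, nz):
    """Face neighbours of cell c, computed from the mesh dimensions alone."""
    i, j, k = cell_ijk(c, nx, ny, nz)
    result = []
    for di, dj, dk in ((-1, 0, 0), (1, 0, 0), (0, -1, 0),
                       (0, 1, 0), (0, 0, -1), (0, 0, 1)):
        ii, jj, kk = i + di, j + dj, k + dk
        if 0 <= ii < nx and 0 <= jj < ny and 0 <= kk < nz:
            result.append(cell_index(ii, jj, kk, nx, ny, nz))
    return result


def cell_centre(c, p_min, p_max, nx, ny, nz):
    """Cell centre from the two corner points of the box and the resolution."""
    n = (nx, ny, nz)
    ijk = cell_ijk(c, nx, ny, nz)
    return tuple(p_min[d] + (ijk[d] + 0.5) * (p_max[d] - p_min[d]) / n[d]
                 for d in range(3))


# A 4 x 3 x 2 mesh of the box (0, 0, 0) to (4, 3, 2)
interior = cell_neighbours(cell_index(1, 1, 0, 4, 3, 2), 4, 3, 2)
corner = cell_neighbours(0, 4, 3, 2)
centre = cell_centre(5, (0.0, 0.0, 0.0), (4.0, 3.0, 2.0), 4, 3, 2)
```

No connectivity is stored anywhere: two corner points and three integers define the whole mesh, which is exactly what is lost once the geometry is no longer a box.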

Structured Body-Fitted Mesh


Body-fitted meshes originate from the non-orthogonal curvilinear coordinate system approach. The case-specific coordinate system is created to fit
the boundary


The mesh is hexahedral and regularly connected. Real geometry can be
captured but with insufficient control over local mesh resolution
The use of contravariant coordinates for the solution vectors was quickly
abandoned

Multi-Block Mesh
Mesh created as a combination of multiple body-fitted blocks. All blocks
and cells are still hexahedral
In FVM, special coding is done on block interfaces, where the mesh connectivity cannot be implicitly established
Much more control over mesh grading and local resolution. However, mesh
generation in 3-D for relatively complex shapes is still hard and time-consuming: meshes need to match


Unstructured Shape-Consistent Mesh


At this stage, all meshes are hand-built. A complex 3-D mesh could take
2-3 months to construct
Block connectivity above introduces the concept of storing mesh connectivity rather than calculating it: unstructured mesh
Loose definition of connectivity allows more freedom: hexahedral and degenerate hexahedral meshes: prisms, pyramids, wedges etc. allow easier
meshing


From the numerical simulation point of view, this is a major step forward.
Geometries of industrial interest can now be tackled with a detailed description, which satisfies the design engineer
At this stage, numerical simulation in an industrial setting really takes off.
Handling airfoils and single wing or even wing-fuselage assembly is not too
difficult. Hand-built meshes for a complete aircraft are still quite difficult

Tetrahedral and Hybrid Tet-Hex Meshes


Tetrahedral meshes are not good from the numerics point of view
. . . but they could be generated automatically!
If a solver can support tetrahedral meshes, mesh generation time for complex geometry reduces from weeks to hours.
Great saving in mesh generation effort, faster turn-around of simulations
and geometrical variation, mesh sensitivity studies can be performed on
realistic geometries
Tetrahedra are particularly poor in boundary layers close to walls. A hybrid mesh is built by creating a layered hexahedral mesh next to the wall.
The rest of the domain is filled with tetrahedra. A combined tet-hex mesh
is a great improvement in quality
On the negative side, cell count for a tetrahedral mesh of equivalent resolution is higher than for hexahedra. A part of the price is paid in lower
accuracy of the solver on tetrahedra: limited neighbourhood connectivity.

Tetrahedral mesh generation techniques


Advancing front method: starting from the boundary triangulation,
insert tetrahedra from the live front using priority lists
Delaunay triangulation: point insertion and re-triangulation. The initial mesh is created by triangulating the boundary. New points are
added in a way which improves the quality of the most distorted triangles and creates a convex hull around each point

Overset and Chimera Meshes


Used for cases where a simple solver is used for complex cases or parts of
geometry move relative to each other
Each part is meshed in a simple manner and over-set on a background mesh.
In regions of overlap, special discretisation practices couple the solution
Chimera approach is numerically problematic: issues of coupling, conservation and accuracy in overlap regions.


Polyhedral Mesh Support


In spite of automatic generation techniques, tetrahedral meshes are not of
sufficient quality for industrial use. On the other hand, automatic hexahedral mesh generation has proven to be extremely challenging
Finite Volume discretisation is not actually dependent on the cell shape:
unlike FEM, there are no pre-defined shape functions and transformation
tensors. This brings the possibility of polyhedral mesh support
Finite Volume discretisation algorithm is reformulated into loops over cells
and faces (still doing the same job)
Polyhedral meshes are considerably better than tetrahedra, can be manipulated to be predominantly hexahedral, orthogonal and regular and can be
created automatically
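The reformulation into loops over faces can be sketched as follows. The face addressing used here (an owner cell and an optional neighbour cell per face, with the flux counted positive when leaving the owner) is an assumed convention for illustration. The point is that the loop never asks how many faces a cell has or what shape it is.

```python
def net_flux_per_cell(n_cells, owner, neighbour, face_flux):
    """Sum face fluxes into cells with a single loop over faces.

    owner[f]     : cell on the side the face normal points away from
    neighbour[f] : cell on the other side, or None for a boundary face
    face_flux[f] : flux through face f, positive when leaving the owner
    """
    net = [0.0] * n_cells
    for f, flux in enumerate(face_flux):
        net[owner[f]] += flux          # leaves the owner cell ...
        if neighbour[f] is not None:
            net[neighbour[f]] -= flux  # ... and enters the neighbour
    return net


# Three cells in a row with uniform left-to-right flow.
# Face 0 is the inlet (its normal points out of cell 0, against the flow),
# faces 1 and 2 are internal, face 3 is the outlet.
owner = [0, 0, 1, 2]
neighbour = [None, 1, 2, None]
face_flux = [-2.0, 2.0, 2.0, 2.0]
net = net_flux_per_cell(3, owner, neighbour, face_flux)
```

Each internal face contributes equal and opposite amounts to its two cells, so the sum over all cells reduces to the sum of boundary fluxes: this is the conservation property of the method, independent of cell shape.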

4.4 Manual Meshing: Airfoils

Mesh Structure for 2-D Airfoils


Manual meshing of airfoil profiles really belongs to the past; it is still indicative to show how mesh handling governs the use of CFD
O-mesh: NACA0012 example
C-mesh: NACA32012 example, prettier in raeProfile
H-mesh


Hybrid mesh structure: triangular mesh with prismatic layers: twoElement


Adapting to the geometry: transfinite mapping techniques
Adapting to the solution: shock capturing with r-refinement
Meshing multi-element airfoil configurations
Mesh Generation by Partial Differential Equation
Transfinite mapping operation can be viewed as a solution of the Laplace
equation. Thus, a mesh can be created by solving an equation
Mesh grading can be controlled by sizing functions: Laplace equation with
variable coefficients
An equivalent formulation exists for controlling mesh orthogonality
This approach to mesh generation is useful in parametric studies, where
a large number of similar geometries needs to be simulated. An initial
template mesh is built and adjusted to the correct shape
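Mesh generation by solving an equation can be sketched in its simplest form: the boundary points of a structured 2-D mesh are fixed and the interior point coordinates are obtained from a discrete Laplace equation with uniform coefficients, solved here by Gauss-Seidel sweeps. The unit-square geometry, the poor initial guess and the function names are assumptions for illustration. Sizing functions and orthogonality control, mentioned above, are not included.

```python
def make_square_mesh(n):
    """Point coordinates x[j][i], y[j][i] on an n x n structured mesh.

    Boundary points are placed uniformly on the unit square; interior
    points start from a deliberately poor guess (collapsed to the centre).
    """
    x = [[0.5] * n for _ in range(n)]
    y = [[0.5] * n for _ in range(n)]
    for m in range(n):
        s = m / (n - 1)
        x[0][m], y[0][m] = s, 0.0          # bottom boundary
        x[n - 1][m], y[n - 1][m] = s, 1.0  # top boundary
        x[m][0], y[m][0] = 0.0, s          # left boundary
        x[m][n - 1], y[m][n - 1] = 1.0, s  # right boundary
    return x, y


def laplace_smooth(x, y, n_sweeps=200):
    """Gauss-Seidel sweeps of the discrete Laplace equation for the
    interior point coordinates; boundary points are left untouched."""
    nj, ni = len(x), len(x[0])
    for _ in range(n_sweeps):
        for j in range(1, nj - 1):
            for i in range(1, ni - 1):
                x[j][i] = 0.25 * (x[j][i - 1] + x[j][i + 1]
                                  + x[j - 1][i] + x[j + 1][i])
                y[j][i] = 0.25 * (y[j][i - 1] + y[j][i + 1]
                                  + y[j - 1][i] + y[j + 1][i])
    return x, y


x, y = make_square_mesh(5)
x, y = laplace_smooth(x, y)
```

For this geometry the converged interior points form a uniform grid. Moving the boundary points to a new shape and repeating the sweeps adjusts the interior to match, which is the template-mesh use described above.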


Polyhedral Mesh Generation


Tessellated mesh
The Delaunay triangulation algorithm introduces points on proximity
rules. During the creation of the mesh, a dual mesh of convex polyhedra is created and can be extracted by a post-processing operation
Interaction of the tessellated mesh and the boundary needs to be recovered after polyhedral mesh assembly
Local control of mesh size achieved in the same way as in tetrahedral
meshes


[Figure: Voronoi vertex V1 and mesh points P1 and Pi, with X and Y axes]


Cut hexahedral and cut polyhedral mesh


Most of mesh generation is straightforward: filling space with non-overlapping cells. Even close to boundaries, it is easy to build a high
quality layered structure
Problematic parts of mesh generation are related to interaction of advancing generation surfaces or boundary interaction in complex corners
or regions where the mesh resolution does not match the level of detail
on the boundary description.
Cut cell technology creates a rough background mesh, either
uniform hexahedral or capturing major features of the geometry. The
mesh inside of the domain is kept and the one interacting with the
boundary surface is adjusted or cut by the surface
In some cases, the background mesh resolution can be automatically
adjusted around the surface to match the local resolution requirements
Meshes are good quality and can be generated rapidly. Prismatic
boundary layers may also be added. In some cases, background mesh
adjustment or concave cell corrections are required.

Examples
3rd AIAA CFD Drag Prediction Workshop
http://aaac.larc.nasa.gov/tsab/cfdlarc/aiaa-dpw/

4.5 Adaptive Mesh Refinement

From the above examples it can be seen how the structure and quality of the
mesh influences the solution. In first approximation, the number and distribution of computational points determines our picture of the solution even in the
absence of computational errors. In places where the solution varies rapidly or
complex physical processes occur, it is advisable to locally increase the density
of computational points.
Putting the resolution requirement on a firmer basis, ona may postulate that
every discretisation method aimed at continuum mechanics postulates a local
variation of the solution between the computational points. A largest source
of discretisation error is a discrepancy between the postulated and actual field
variation. Grouping computational points closer together relaxes the difference
between the prescribed and actual variation in the solution, reducing the discretisation error.
Mesh Resolution
Mesh structure specifies where the computational points are located. Discretisation practice postulates the shape of solution between the computational points, which is the main source of discretisation error
A sensible meshing strategy requires high resolution in regions of interest
instead of uniformly distributing points in the domain. This implies some
knowledge of the solution during mesh generation.
The same can be achieved in an iterative way
1. Create initial mesh and initial solution
2. Examine the solution from the point of view of accuracy or resolution
in regions of interest
3. Based on the available solution, adjust mesh resolution in order to
improve the solution in the selected parts of the domain
4. Repeat until sufficient accuracy is achieved or computer resources are
exhausted
Performing mesh improvement by hand is tedious and time-consuming. For
an automatic procedure, two questions need to be answered:
Where to refine the mesh (adjust resolution)?
How to change the mesh to achieve the required accuracy?
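The iterative procedure above can be sketched on a 1-D mesh. This is only a minimal illustration under stated assumptions: the analytical profile stands in for a real solver, and the jump-based indicator, the tolerance and all names are hypothetical choices made for the example.

```python
import math

def profile(x):
    """Hypothetical 'solution' with a steep front near x = 0.5 (stands in for a solver)."""
    return math.tanh(50.0 * (x - 0.5))

def refine(points, tol, max_cells=200):
    """Repeat: examine the solution, split cells where the indicator is too large."""
    while len(points) - 1 < max_cells:      # stop when resources are exhausted
        new_points = [points[0]]
        changed = False
        for a, b in zip(points, points[1:]):
            # Assumed indicator: jump of the solution across the cell
            if abs(profile(b) - profile(a)) > tol:
                new_points.append(0.5 * (a + b))   # h-refinement: add a point
                changed = True
            new_points.append(b)
        points = new_points
        if not changed:                     # sufficient resolution everywhere
            break
    return points

coarse = [i / 10 for i in range(11)]
fine = refine(coarse, tol=0.2)
print(len(coarse) - 1, "cells ->", len(fine) - 1, "cells")
```

The new points cluster around the front, while the smooth parts of the domain keep the coarse spacing.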
Types of Mesh Refinement
Global refinement: mesh sensitivity studies
h-refinement: introducing new computational points in regions of interest
r-refinement: re-organise the existing points such that more points fall into
the region of interest

p-refinement: enriching the space of shape functions in order to capture the solution more closely
Mesh refinement cannot be done indiscriminately: locally refined meshes typically introduce increased mesh-induced errors as well. The trick is to locate the regions of poor mesh away from the regions of interest

Error- or Indicator-Driven Adaptivity
In strongly shocked flows, it is relatively easy to identify regions of interest: shocks, boundary layers, contact discontinuities. In more complex situations, or in the presence of flow features of different strength, this is much more difficult. Mesh-induced discretisation errors (poor mesh quality or insufficient resolution) also need to be taken into account.
A region of interest can usually be recognised by high gradients: a rapidly varying solution
Error indicators: highlight regions of interest. Example: magnitude of the second pressure gradient, Mach number distribution etc.
Error estimates: apart from the spatial information (error distribution), they provide guidance on the absolute error level
Adjusting to Original Boundary Shape
Traditionally, mesh adaptation was a part of the CFD solver instead of
mesh generator. In cases where the refinement algorithm resorts to cell
splitting, we may end up with a faceted surface representation instead of a
smooth surface, which compromises the results.
Solution: geometrical description of the boundary needs to be available from
the solver instead of trying to recover the data from the original (coarse)
mesh
A further step is related to the specification of boundary conditions. In, for example, wind tunnel simulations, the velocity and turbulence at the inlet plane are known from the measured data and interpolated onto the inlet patch of the mesh. Ideally, the boundary condition should be associated with space or with the boundary description, avoiding problems with interpolation. This leads to issues of CAD integration, which is beyond our scope.
Examples of Automatic Meshing and Adaptivity
Supersonic flow, h-refinement

[Figure: supersonic flow, h-refinement example; scale from 0.0 to 3.0.]

4.6 Dynamic Mesh Handling

Many relevant simulations in continuum mechanics involve the cases where the
shape of computational domain changes during the simulation, either in a manner prescribed up front or as a function of the solution. As we will show later,
handling such cases generalises the discretisation practice to some form of Arbitrary Lagrangian-Eulerian practice, combining the view from the Lagrangian
and Eulerian reference frames. This is usually termed dynamic mesh handling,
coming in a number of different guises.
From the point of view of mesh handling, we can recognise two distinct situations:
Mesh deformation, where the structure and connectivity of the mesh
remains unchanged, but the position of points supporting its shape changes.
Mesh deformation is characterised by the fact that the number of points,
faces, cells and boundary faces remains constant, as does the connectivity
between the shapes;
In a topologically changing mesh, the number of points, faces and cells
or their connectivity varies during the simulation.
It will be shown that standard discretisation methods handle cases of mesh deformation without loss of accuracy, while topological changes may (depending on the algorithm) involve solution re-mapping, with associated interpolation or data redistribution errors. Thus, mesh deformation is usually preferred, unless it implies excessive mesh-induced discretisation errors.
Typical examples of dynamic mesh handling in aerospace applications include moving flap and slat simulation, aircraft landing, bomb or missile release (opening of the ordnance bay), multi-stage turbomachinery simulations with rotor-stator interaction etc.
Moving Deforming Mesh
There exist cases where the shape of the domain varies during the calculation. Boundary motion may be prescribed in advance as a part of the case
setup or be a part of the solution itself
Internal mesh influences mainly the discretisation error: it is the external shape of the domain which carries the major influence. A moving deforming mesh algorithm will allow the domain to change its shape during the simulation while preserving the validity of the mesh
Shape changes are performed by point motion: the connectivity and structure of the mesh remain unchanged
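A minimal 1-D sketch of mesh deformation by point motion is given below. It assumes the simplest possible motion rule (boundary displacement distributed linearly over the interior points); the function name and the rule are hypothetical and are not the motion solver discussed in these notes.

```python
def deform(points, boundary_displacement):
    """Move the right boundary; interior points follow linearly (assumed rule)."""
    x0, x1 = points[0], points[-1]
    return [x + boundary_displacement * (x - x0) / (x1 - x0) for x in points]

points = [0.0, 0.25, 0.5, 0.75, 1.0]
# Connectivity: each cell is defined by the indices of its two points
cells = [(i, i + 1) for i in range(len(points) - 1)]

moved = deform(points, boundary_displacement=0.2)

# The same connectivity is used before and after the motion; only positions change
volumes = [moved[b] - moved[a] for a, b in cells]
print(moved, volumes)
```

The mesh stays valid as long as all cell volumes remain positive, which is the check a real deformation algorithm also has to satisfy.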
Topological Mesh Changes
In cases of extreme shape change, moving deforming mesh is not sufficiently
flexible: deforming the mesh to accommodate extreme boundary deformation would introduce high discretisation errors
Mesh motion can be accommodated by adding or removing computational
cells to accommodate the boundary deformation. This is associated with
higher discretisation errors and complications in the algorithm, but is sometimes essential
Common types of topological changes:
Attach/detach boundary
Cell layer addition/removal
Sliding interface

Typically, a combination of several topological changes will be used together to achieve more complex mesh changes
Example: in-cylinder simulations in internal combustion engines

Chapter 5
Transport Equation in the Standard Form

5.1 Introduction

The importance of a scalar transport equation in the standard form lies in the fact that it contains typical forms of rate-of-change, transport and volume source/sink terms present in continuum governing laws. These include convective transport, based on the convective velocity field, gradient-driven diffusive transport, rate-of-change terms and localised volume sources and sinks. Understanding the behaviour of various terms and their interaction will help the reader comprehend even the most complex physical models.
Governing equations of physical interest regularly take the form of the scalar transport equation. The derivation and modelling rationale is straightforward: the rate of change and convection terms follow directly from the Reynolds Transport Theorem, while the diffusive transport is the simplest gradient-based model of surface sources and sinks. A good example of generalisation of the scalar transport equation is the density-based compressible flow solver, often written as a scalar transport of a composite variable $[\rho, \rho\mathbf{u}, \rho E]$.
In what follows, we will offer a brief overview of the background and derivation
of the scalar transport equation, its initial and boundary conditions and various
often-encountered generalisations.

5.2 Scalar Transport Equation in the Standard Form

Background
Scalar transport equation in the standard form will be our model for discretisation. Conservation laws governing continuum mechanics adhere to the standard form, which makes them a good example
Standard form is not the only one available: modelled equations may be more complex or some source/sink terms can be recognised as transport. This leads to other forms, but the basics are still the same
Moving away from physics, almost identical equations can be found in other
areas: for example financial modelling
The common factor for all equations under consideration is the same set
of operators: temporal derivative, gradient, divergence, Laplacian, curl, as
well as various source and sink terms
Nomenclature
Scalar, vector and tensor represent a property at a point. In the equations under consideration, we will need tensors only up to second order
Scalars in lowercase: $a$
Vectors in bold: $\mathbf{a} = a_i$
Tensors in bold capitals: $\mathbf{A} = A_{ij}$
All vectors will be written in the global Cartesian coordinate system and in 3-D space
Inner and outer product of vectors and tensors. Vector notation will be used: feel free to shadow it in the Einstein notation in your notes and I will help
Product of a scalar and a vector: $a\mathbf{b} = a\,b_i$
Inner vector product, producing a scalar: $\mathbf{a} \cdot \mathbf{b} = a_i b_i$
Outer vector product, producing a second rank tensor: $\mathbf{a}\mathbf{b} = a_i b_j$
Inner product of a vector and a tensor (mind the index)
product from the left: $\mathbf{a} \cdot \mathbf{C} = a_i C_{ij}$
product from the right: $\mathbf{C} \cdot \mathbf{a} = a_j C_{ij}$
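The index conventions above can be checked with a few lines of plain Python. This is only an illustrative sketch: the function names are hypothetical and vectors are stored as simple lists of three components.

```python
def inner(a, b):
    """a . b = a_i b_i (scalar)"""
    return sum(ai * bi for ai, bi in zip(a, b))

def outer(a, b):
    """(a b)_ij = a_i b_j (second rank tensor)"""
    return [[ai * bj for bj in b] for ai in a]

def dot_left(a, C):
    """(a . C)_j = a_i C_ij: the sum runs over the FIRST tensor index"""
    return [sum(a[i] * C[i][j] for i in range(3)) for j in range(3)]

def dot_right(C, a):
    """(C . a)_i = a_j C_ij: the sum runs over the SECOND tensor index"""
    return [sum(C[i][j] * a[j] for j in range(3)) for i in range(3)]

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
C = outer(a, b)

print(inner(a, b))      # 32.0
print(dot_left(a, C))   # (a . a) b = [56.0, 70.0, 84.0]
print(dot_right(C, a))  # a (b . a) = [32.0, 64.0, 96.0]
```

For a non-symmetric tensor the two products differ, which is why the side of the multiplication matters.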
Field algebra
Continuum mechanics deals with field variables: according to the continuity assumption, a variable (e.g. pressure) is defined at each point in space for each moment in time
I will use $\phi$ as the name for the generic variable
From the field definition $\phi = \phi(\mathbf{x}, t)$, which means that we can define the spatial and temporal derivative
Divergence and gradient
For convenience, we need to define the gradient operator to extract the spatial component of the derivative as a vector. Formally this would be
$$\nabla = \frac{\partial}{\partial \mathbf{x}} = \frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k} \tag{5.1}$$
Thus, for a scalar $\phi$, $\nabla\phi$ is a vector
$$\nabla\phi = \frac{\partial \phi}{\partial x}\mathbf{i} + \frac{\partial \phi}{\partial y}\mathbf{j} + \frac{\partial \phi}{\partial z}\mathbf{k} \tag{5.2}$$
If we imagine $\phi$ defined in a 2-D space as a 2-D surface, for each point the gradient vector points in the direction of the steepest ascent, i.e. up the slope
For vector and tensor fields, we define the inner and outer product with the gradient operator. Please pay attention to the definition of the gradient: multiplication from the left!
Gradient operator for a vector $\mathbf{u}$ creates a second rank tensor
$$\nabla\mathbf{u} = \frac{\partial}{\partial x_i} u_j = \frac{\partial u_j}{\partial x_i} \tag{5.3}$$
Divergence operator for a vector $\mathbf{u}$ creates a scalar
$$\nabla \cdot \mathbf{u} = \frac{\partial u_i}{\partial x_i} \tag{5.4}$$

5.2.1 Reynolds Transport Theorem

Reynolds Transport Theorem is a mathematical derivation of the relationship between the Lagrangian and Eulerian analysis framework. It is essential to recognise that it involves no simplifications or modelling, but it establishes the basis for the Eulerian view of the continuum.
It is sometimes tempting to dwell on the interaction of Lagrangian particles and look at generalising their behaviour to the continuum level: after all, matter itself is composed of discrete particles rather than field variables. In fact, classical physics has already covered this in the kinetic theory of gases, where the continuum behaviour and transition scales are established from basic principles. However, scales of engineering interest today are sufficiently removed from the mean free path to warrant the use of continuum mechanics in most engineering disciplines for some time to come.


Reynolds Transport Theorem


Reynolds transport theorem is a first step to assembling the standard transport equation
Examine a region of space: a Control Volume (CV)

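[Figure: control volume with surface element dS, outward normal n, inflow and outflow.]

The rate of change of a general property $\phi$ in the system is equal to the rate of change of $\phi$ in the control volume plus the rate of net outflow of $\phi$ through the surface of the control volume.
Mathematically:
$$\frac{d}{dt}\int_{V_m} \phi \, dV = \int_{V_m} \frac{\partial \phi}{\partial t}\, dV + \oint_{S_m} \phi\, (\mathbf{n} \cdot \mathbf{u})\, dS \tag{5.5}$$
Converting the surface integral into a volume integral using Gauss' theorem gives
$$\frac{d}{dt}\int_{V} \phi \, dV = \int_{V} \left[ \frac{\partial \phi}{\partial t} + \nabla \cdot (\phi \mathbf{u}) \right] dV \tag{5.6}$$
Here $\mathbf{u}$ represents the convective velocity: flux going in is negative ($\mathbf{u} \cdot \mathbf{n} < 0$). The convective velocity in general terms can be considered as a coordinate transformation.
$\mathbf{u}$ is also a function of space and time: our coordinate transformation is not trivial. Examples: solid body motion, solid rotation, cases where $\mathbf{u}$ is not divergence-free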
Sources and Sinks
Apart from convection (above), we can have local sources and sinks of $\phi$.
Volume source: distributed through the volume, e.g. gravity
Surface source: acts on the external surface S, e.g. heating. Typically modelled using gradient-based models

[Figure: control volume with volume source Qv, surface source qs on surface element dS, outward normal n, inflow and outflow.]

$$\frac{d}{dt}\int_{V} \phi \, dV = \int_{V} q_v \, dV - \oint_{S} (\mathbf{n} \cdot \mathbf{q}_s)\, dS \tag{5.7}$$
$$\frac{\partial \phi}{\partial t} + \nabla \cdot (\phi \mathbf{u}) = q_v - \nabla \cdot \mathbf{q}_s \tag{5.8}$$

5.2.2 Diffusive Transport

Gradient-based transport plays a very different role from the Reynolds Transport Theorem terms derived above. One should keep in mind that diffusion is a physical model for the behaviour of surface terms rather than a result of direct mathematical manipulation. However, its generality and special mathematical properties run much deeper. Gradient-based transport is observed regularly in many physical phenomena, from conductive heat transfer to equilibration of species concentration. It can be seen as the effect of molecular dynamics on the macro-scale, in the presence of sufficient scale separation.
Diffusive Transport
Gradient-based transport is a model for surface source/sink terms
Consider a case where $\phi$ is a concentration of a scalar variable in a closed domain. Diffusive transport says that $\phi$ will be transported from regions of high concentration to regions of low concentration until the concentration is uniform everywhere.
Taking into account that $\nabla\phi$ points up the concentration slope, and that the transport will be in the opposite direction, we can define the following diffusion model
$$\mathbf{q}_s = -\gamma \nabla \phi, \tag{5.9}$$
where $\gamma$ is the diffusivity.
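The behaviour described above (transport from high to low concentration in a closed domain until the field is uniform) can be reproduced with a short 1-D sketch. The explicit time stepping, the mesh and all parameter values are assumptions made for the illustration; they are not the discretisation developed later in these notes.

```python
def diffuse(phi, gamma, dx, dt, steps):
    """Explicit 1-D diffusion in a closed domain (zero flux on both ends)."""
    phi = list(phi)
    n = len(phi)
    for _ in range(steps):
        # Face fluxes q_s = -gamma * dphi/dx; boundary faces carry no flux
        flux = [0.0] + [-gamma * (phi[i + 1] - phi[i]) / dx
                        for i in range(n - 1)] + [0.0]
        # Cell balance: change = -(flux out - flux in) / dx
        phi = [phi[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n)]
    return phi

phi0 = [1.0] * 5 + [0.0] * 5          # high concentration on the left half
# gamma*dt/dx^2 = 0.4, below the explicit stability limit of 0.5
phi = diffuse(phi0, gamma=1.0, dx=0.1, dt=0.004, steps=2000)

print(sum(phi0), sum(phi))            # total amount is conserved
print(min(phi), max(phi))             # field has become uniform (0.5)
```

Because the fluxes are evaluated on faces and shared by neighbouring cells, whatever leaves one cell enters the next, so the total amount of the transported quantity does not change.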

Generic Transport Equation
Assembling the above yields the transport equation in the standard form
$$\underbrace{\frac{\partial \phi}{\partial t}}_{\text{temporal derivative}} + \underbrace{\nabla \cdot (\phi \mathbf{u})}_{\text{convection term}} - \underbrace{\nabla \cdot (\gamma \nabla \phi)}_{\text{diffusion term}} = \underbrace{q_v}_{\text{source term}} \tag{5.10}$$
Temporal derivative represents inertia of the system
Convection term represents the convective transport by the prescribed velocity field (coordinate transformation). The term is hyperbolic in nature: information comes from the vicinity, defined by the direction of the convection velocity
Diffusion term represents gradient transport. This is an elliptic term: every point in the domain feels the influence of every other point instantaneously
Sources and sinks account for non-transport effects: local volume production and destruction of $\phi$
Conservation Equations
As promised, conservation equations in continuum mechanics follow the
above form
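Conservation of mass: continuity equation
$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) = 0 \tag{5.11}$$
Conservation of linear momentum
$$\frac{\partial (\rho \mathbf{u})}{\partial t} + \nabla \cdot (\rho \mathbf{u} \mathbf{u}) = \rho \mathbf{g} + \nabla \cdot \boldsymbol{\sigma} \tag{5.12}$$
Energy conservation equation
$$\frac{\partial (\rho e)}{\partial t} + \nabla \cdot (\rho e \mathbf{u}) = \rho \mathbf{g} \cdot \mathbf{u} + \nabla \cdot (\boldsymbol{\sigma} \cdot \mathbf{u}) - \nabla \cdot \mathbf{q} + \rho Q \tag{5.13}$$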

5.3 Initial and Boundary Conditions

The role of boundary conditions is to isolate the system under consideration from
the external environment. Location and type of boundary conditions depends on
our knowledge about flow and physical conditions and their influence on the
solution. Boundary conditions can be classified as numerical and physical
boundary conditions.
Numerical boundary conditions can be considered at the equation level. Main
types are the fixed value or Dirichlet condition, zero (Neumann) or fixed gradient
condition (flux condition) and a mixed or Robin condition.
Physical boundary conditions are related to the model under consideration and involve combinations of the individual equations under consideration. This reduces to numerical boundary conditions on the various equations, where the fixed value, flux or their combination are updated in a physically meaningful sense.
Examples of physical boundary conditions in fluid flows are flow inlets and
outlets, wall, symmetry planes and far field conditions.
Boundary Conditions
The role of boundary conditions is to isolate the system from the rest of
the Universe. Without them, we would have to model everything
Position of boundaries and specified condition requires engineering judgement. Badly placed boundaries will compromise the solution or cause numerical problems. Example: locating an outlet boundary across a recirculation zone.
Incorporating the knowledge of boundary conditions from experimental studies or other sources into a simulation is not trivial: it is not sufficient to pick up some arbitrary data and force it on a simulation. Choices need to be based on physical understanding of the system
Numerical Boundary Conditions
Dirichlet condition: fixed boundary value of $\phi$
Neumann: zero gradient or no flux condition: $\mathbf{n} \cdot \mathbf{q}_s = 0$
Fixed gradient or fixed flux condition: $\mathbf{n} \cdot \mathbf{q}_s = q_b$. Generalisation of the Neumann condition
Mixed condition: linear combination of the value and gradient condition
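One way to see that the mixed condition contains the others as special cases is to evaluate the boundary face value from the near-wall cell value. The sketch below is an assumption-laden illustration: the blending factor `value_fraction`, the other parameter names and the one-sided linear extrapolation are hypothetical choices, not a definition taken from these notes.

```python
def boundary_value(phi_P, delta, value_fraction, ref_value=0.0, ref_gradient=0.0):
    """Face value as a blend of a fixed value and a fixed (normal) gradient.

    phi_P          value in the cell next to the boundary
    delta          distance from the cell centre to the boundary face
    value_fraction 1 -> Dirichlet, 0 -> fixed gradient (Neumann if gradient is 0)
    """
    from_gradient = phi_P + ref_gradient * delta   # linear extrapolation to the face
    return value_fraction * ref_value + (1.0 - value_fraction) * from_gradient

dirichlet = boundary_value(2.0, 0.1, 1.0, ref_value=5.0)       # fixed value
zero_grad = boundary_value(2.0, 0.1, 0.0)                      # zero gradient
fixed_grad = boundary_value(2.0, 0.1, 0.0, ref_gradient=3.0)   # fixed gradient
mixed = boundary_value(2.0, 0.1, 0.5, ref_value=5.0)           # half and half

print(dirichlet, zero_grad, fixed_grad, mixed)
```

With this form a single implementation covers all four numerical conditions listed above.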


More Numerical Conditions


More numerical conditions, related to simplifications in the shape or size of
the computational domain. The idea is to limit or decrease the size of the
computational domain (saving on the cell count) by using the properties of
the solution and boundary conditions
Symmetry plane. In cases where the geometry and boundary conditions
are symmetric and the flow is steady (or the equation is linear in the symmetrical direction), only a section of the problem may be modelled. The
simplification will not work if the expected flow pattern is not symmetric
as well: manoeuvring aircraft, cross-wind etc.

Cyclic and periodic conditions. In cases of repeating geometry (e.g. tube bundle heat exchangers) or fully developed conditions, the size of the domain can be reduced by modelling only a representative segment of the geometry. In order to account for periodicity, a self-coupled condition can be set up on the boundary. In special cases, a jump condition can be specified for variables that do not exhibit cyclic behaviour. Example: pressure in fully developed channel flow

[Figure: examples of domain reduction using symmetry planes; labels: wall, symmetry plane, developed inlet profile, outlet, dimensions 0.5D and H.]

Implicit implementation of the condition (depending on the current value) improves the numerical properties of the condition
A more general (re-mapping) form of the condition can also be specified, but not in the implicit form

Physical Boundary Conditions


Currently, we are dealing with a passive transport of a scalar variable:
physical meaning of the boundary condition is trivial
In case of coupled equation sets or a clear physical meaning, it is useful to
associate physically meaningful names to the sets of boundary conditions
for individual equations. Examples
Subsonic velocity inlet: fixed value velocity, zero gradient pressure,
fixed temperature
Supersonic outlet: all variables zero gradient


Heated wall: fixed value velocity, zero gradient pressure, fixed gradient
temperature (fixed heat flux)
Initial Condition
Boundary conditions are only a part of problem specification. Initial conditions specify the variation of each solution variable in space at the start of the simulation. In some cases, this may be irrelevant:
Steady-state simulation result should not depend on the initial condition
In oscillatory transient cases (e.g. vortex shedding), the initial condition is irrelevant
...but in other simulations it is essential: relaxation problems
Initial field should in principle satisfy the governing equation and physical bounds. Importance of this will depend on the robustness of the algorithm. Example: initialise the flow simulations using the potential flow solver to satisfy continuity. In practice, robust solvers only care about physical bounds

5.4 Physical Bounds in Solution Variables

An important property of physical variables is their natural bounds. Examples here would include kinetic energy, which always remains positive; species concentration, bounded between 0 and 1 (100 %) and many others.
Physical bounds may be implied from the nature of the variable, but also from
the differential equations governing the system. A good test of understanding of
the equation systems involves the analysis of boundedness from the source and
sink terms and their interaction.
Enforcing Physical Bounds
When transport equations are assembled, they represent real physical properties. A set of equations under consideration relies on the fact that physical variables obey certain bounds: if the bounds are violated, the system exhibits unrealistic behaviour
Examples of variables with physical bounds
Negative density value: $-3\,\mathrm{kg/m^3}$
Negative absolute temperature
Negative kinetic energy (or turbulent kinetic energy)
Concentration value below zero or above one: two-phase flow, using a scalar concentration $\alpha$ to indicate the presence of fluid A
$$\alpha = 1.05, \quad \rho_1 = 1\,\mathrm{kg/m^3}, \quad \rho_2 = 1000\,\mathrm{kg/m^3}$$
$$\rho = \alpha \rho_1 + (1 - \alpha) \rho_2 = 1.05 \cdot 1 + (1 - 1.05) \cdot 1000 = 1.05 - 0.05 \cdot 1000 = 1.05 - 50 = -48.95\,\mathrm{kg/m^3}$$
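The arithmetic of the two-phase example is easy to reproduce. In the short sketch below the function name and default densities are taken from the example above; the clipping of the concentration is only the crudest possible remedy, included here as an assumption for illustration and not as the approach advocated in these notes.

```python
def mixture_density(alpha, rho1=1.0, rho2=1000.0):
    """rho = alpha*rho1 + (1 - alpha)*rho2 for a concentration alpha of fluid 1."""
    return alpha * rho1 + (1.0 - alpha) * rho2

# A 5 % overshoot in concentration produces a negative density
unbounded = mixture_density(1.05)

# Illustration only: clip the concentration to its physical range [0, 1]
bounded = mixture_density(min(1.0, max(0.0, 1.05)))

print(unbounded, bounded)
```

The large density ratio is what makes a small overshoot so damaging: the error in concentration is multiplied by the density of the heavier fluid.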
Physical bounds on solution variables are easily established. However, our task is not only to recognise this in the original equations but to enforce it during the iterative solution process. If at any stage we obtain a locally negative density, the convergence of the iterative algorithm will be disrupted: this is not trivial
For vector and tensor variables, the physical bounds are not as straightforward and may be more difficult to enforce
Diffusion coefficient and stability. An example of how the iterative process breaks down is a case of negative diffusion introducing positive feed-back in the system. The diffusion model:
$$\mathbf{q}_s = -\gamma \nabla \phi,$$
assumes a positive value of $\gamma$: the gradient transport will act to decrease the maximum value of $\phi$ in the domain and tend towards the uniform distribution. For negative $\gamma$, the process is reversed and $\phi$ is accumulated at the location of highest $\phi$, which tends to infinity in an unstable manner. If you encounter cases where $\gamma$ is genuinely negative (e.g. financial modelling equations), there is still a way to solve them: marching in time backwards!
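The positive feed-back caused by a negative diffusion coefficient shows up immediately in a numerical experiment. The sketch below uses an explicit 1-D scheme with assumed mesh and time-step values; it only illustrates the sign effect and is not meant as a recommended discretisation.

```python
def step(phi, gamma, dx, dt):
    """One explicit step of 1-D diffusion with zero-flux ends."""
    n = len(phi)
    flux = [0.0] + [-gamma * (phi[i + 1] - phi[i]) / dx
                    for i in range(n - 1)] + [0.0]
    return [phi[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n)]

def peak_after(gamma, steps=50, n=21, dx=0.1, dt=0.004):
    """Largest magnitude in the field after diffusing a single unit spike."""
    phi = [0.0] * n
    phi[n // 2] = 1.0
    for _ in range(steps):
        phi = step(phi, gamma, dx, dt)
    return max(abs(p) for p in phi)

print(peak_after(gamma=1.0))    # spike is smoothed out: peak drops below 1
print(peak_after(gamma=-1.0))   # spike is amplified: peak grows without bound
```

With positive diffusivity the maximum can only decrease; with the sign reversed, every step feeds the peak and the solution diverges.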
Bounding source and sink terms. Looking at a scalar variable with bounds, e.g. $0 \le \phi \le 1$, governed by a generic transport equation, a sanity check can be performed on the volumetric source term: as $\phi$ approaches its bounds, the value of $q_v$ must tend to zero. This is how the form of the differential equation preserves the sanity of the variable; the same property needs to be achieved in the discretised form of the equation
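As an illustration of a source term that respects the bounds, consider a hypothetical source of the form $q_v = C\,\phi\,(1 - \phi)$, which vanishes at both $\phi = 0$ and $\phi = 1$. The form of the source, the rate constant and the time step are assumptions chosen for the sketch; transport terms are left out so that only the source acts.

```python
def source(phi, rate=1.0):
    """Hypothetical bounded source: tends to zero as phi approaches 0 or 1."""
    return rate * phi * (1.0 - phi)

phi, dt = 0.01, 0.1
history = [phi]
for _ in range(200):
    phi = phi + dt * source(phi)   # source-only update, no transport
    history.append(phi)

print(min(history), max(history))  # stays inside [0, 1]
```

The variable approaches its upper bound but never crosses it, because the source switches itself off there.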


Examples
Convection-dominated problems
Diffusion problems
Negative diffusion coefficient
Convection-diffusion and Peclet number
Source and sink terms: preserving the boundedness

5.5 Complex Equations: Introducing Non-Linearity

Scalar transport equation in its standard form represents a relatively simple physical system, including convective and diffusive transport and linearised source and sink terms. The equation is sufficiently easy to fathom to provide a number of analytical solutions (e.g. line source in cross flow) but does not capture the richness and complexity of many real-life phenomena.
We shall now look at a series of seemingly simple modifications to the form of various terms and their effect. An important property of good discretisation is to enforce physical bounds on all relevant variables not only on convergence but also on intermediate solutions of the iterative process.
Vector and Tensor Transport
A transport equation for a vector or tensor quantity is very similar to the scalar form: $\phi$ becomes $\mathbf{d}$. However, having $\mathbf{d}$ as a transported variable allows the introduction of some interesting new terms
Variable convected by itself: $\nabla \cdot (\mathbf{d}\,\mathbf{d})$
Laplace transpose: $\nabla \cdot (\nabla \mathbf{d})^T$
Divergence (trace): $\mathbf{I}\, \nabla \cdot \mathbf{d}$
The tricky terms will introduce non-linearity or inter-component coupling and produce interesting solutions
For now, we can consider the question of coupling: are the components of the transported vector coupled or decoupled?

Multiple Convection or Diffusion Terms
Some equations can contain multiple transport terms, sometimes disguised as sources or sinks. Recognising the real nature of the term is critical in its correct numerical treatment
$$\frac{\partial (\rho b)}{\partial t} + \nabla \cdot (\rho \mathbf{u}\, b) = -\rho S_t |\nabla b| \tag{5.14}$$
Multiple diffusion terms can appear in the same variable or in a different one, e.g. $\nabla \cdot (\gamma \nabla b)$ in the equation for $\phi$. Diffusion terms in the same variable can be combined into a single term; the ones in a different variable require special treatment.
Non-Linear Transport
The non-linearity in convection, $\nabla \cdot (\mathbf{u}\,\mathbf{u})$, is the most interesting term in the Navier-Stokes equations. The complete wealth of interaction in incompressible flows stems from this term. This includes all turbulent interaction: in nature, this is an inertial effect
In compressible flows, additional effects related to inter-equation coupling appear: shocks, contact discontinuities.
Another form of non-linearity introduces the diffusion coefficient $\gamma$ as a direct or indirect function of the solution: much less interesting
Non-Linear Source and Sink Terms
As mentioned before, for bounded scalar variables, source and sink terms need to tend to zero as $\phi$ approaches its bounds. Therefore, cases where $q_v$ is a function of $\phi$ are a rule rather than an exception
$q_v = q_v(\phi)$ usually leads to the decomposition of the term into a source and a sink. This strictly only makes sense when $\phi$ is bounded below by zero and has no upper bound, but it is instructive. The linearisation is only first-order, i.e. $q_u$ and $q_p$ can still depend on $\phi$.
$$q_v = q_u - q_p\, \phi, \tag{5.15}$$
where both $q_u \ge 0$ and $q_p \ge 0$. This kind of linearisation also follows from numerical considerations and will be re-visited later.

5.6 Inter-Equation Coupling

True complexity of physical processes in engineering is rarely seen from single transport equations, be they linear or non-linear. It is the interaction between multiple physical phenomena that represents a true challenge. Consider for example how the simplicity of a mass continuity equation enforces constraints on the momentum transfer in incompressible fluid flow. Adding to that the dependence of material properties on state variables further increases the complexity.
Many sets of coupled differential equations stem not only from basic physical principles, but from the need to describe very complex physical systems in simpler terms. Good examples would include combustion and turbulence models. Here, the complete physics of interest may or may not be understood, but is too complex to be captured in its entirety. Therefore, a modeller chooses some representative variables (e.g. turbulent length scale, eddy turn-over time, laminar flame speed) and incorporates their interaction in a set of coupled partial differential equations.
Coupled Equations Sets
Inter-equation coupling introduces additional complexity: a set of physical
phenomena which depend on each other.
Complexity, strength of coupling and non-linearity vary widely, to the level of inability to handle certain models numerically. The most difficult ones involve separation of scales, where the fastest interaction (e.g. chemical reaction) occurs at time-scales several orders of magnitude faster than the slowest (e.g. turbulent fluid flow)
Example: Two Coupled Scalar Equations
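$k-\epsilon$ model of turbulence:
$k$: turbulence kinetic energy
$\epsilon$: dissipation of turbulence kinetic energy
$\mathbf{u}$: velocity. Consider it fixed for the moment
$C_\mu$, $C_1$, $C_2$: model coefficients.
$k$-equation:
$$\frac{\partial k}{\partial t} + \nabla \cdot (\mathbf{u}\, k) - \nabla \cdot (\nu_t \nabla k) = G - \epsilon, \tag{5.16}$$
where
$$\nu_t = C_\mu \frac{k^2}{\epsilon} \tag{5.17}$$
and
$$G = \nu_t\, [\nabla \mathbf{u} + (\nabla \mathbf{u})^T] : \nabla \mathbf{u}. \tag{5.18}$$
$\epsilon$-equation:
$$\frac{\partial \epsilon}{\partial t} + \nabla \cdot (\mathbf{u}\, \epsilon) - \nabla \cdot (\nu_t \nabla \epsilon) = C_1 G \frac{\epsilon}{k} - C_2 \frac{\epsilon^2}{k}, \tag{5.19}$$
The coupling looks very complex but is benign: the most critical part is the treatment of sink terms to preserve the boundedness during an iterative solution sequence. Example: sink term in the $k$-equation
$$\epsilon = \frac{\epsilon^{old}}{k^{old}}\, k^{new} \tag{5.20}$$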
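The effect of the sink treatment in Eqn. (5.20) can be seen in a zero-dimensional sketch where transport and generation are switched off ($G = 0$), so that only the sink terms act. The reduced system, the time-stepping form, the value $C_2 = 1.92$ and the function names are all assumptions made for this illustration.

```python
C2 = 1.92  # commonly quoted coefficient value, assumed here

def step_linearised(k, eps, dt):
    """Sinks written as (eps_old / k_old) * new value, as in Eqn. (5.20)."""
    ratio = eps / k
    k_new = k / (1.0 + dt * ratio)             # (k_new - k)/dt = -ratio * k_new
    eps_new = eps / (1.0 + dt * C2 * ratio)    # (eps_new - eps)/dt = -C2 * ratio * eps_new
    return k_new, eps_new

def step_explicit_k(k, eps, dt):
    """Sink evaluated entirely from old values: (k_new - k)/dt = -eps."""
    return k - dt * eps

k, eps, dt = 1.0, 10.0, 0.5        # deliberately large time step
explicit_k = step_explicit_k(k, eps, dt)

history = []
for _ in range(20):
    k, eps = step_linearised(k, eps, dt)
    history.append((k, eps))

print(explicit_k)                  # negative: bound violated in one step
print(history[0], history[-1])     # both variables stay positive
```

Writing the sink as a positive coefficient multiplying the new value guarantees a positive result for any time-step size, which is exactly the boundedness property asked for in the text.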

Exercise: Examine the boundedness of the above equations for $k$ and $\epsilon$, given a prescribed velocity field, by analysing the various source and sink terms.

Chapter 6
Polyhedral Finite Volume Method

6.1 Introduction

In this chapter we will lay out the HJ HERE!!!

6.2 Properties of a Discretisation Method

Discretisation
Generic transport equation can very rarely be solved analytically: this is
why we resort to numerical methods
Discretisation is a process of representing the differential equation we wish
to solve by a set of algebraic expressions of equivalent properties (typically
a matrix)
Two forms of discretisation operators. We shall use a divergence operator
as an example.
Calculus. Given a vector field $\mathbf{u}$, produce a scalar field of $\nabla \cdot \mathbf{u}$
Method. For a given divergence operator $\nabla \cdot$, create a set of matrix coefficients that represent $\nabla \cdot \mathbf{u}$ for any given $\mathbf{u}$
The Calculus form can be easily obtained from the Method (by evaluating the expression), but this is not computationally efficient
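The difference between the two forms can be sketched for a 1-D periodic mesh, where the divergence reduces to a derivative. The central-difference stencil, the periodic mesh and the function names are assumptions made for the illustration and are not the Finite Volume operators developed in this chapter.

```python
import math

def divergence_matrix(n, dx):
    """'Method' form: matrix coefficients representing d/dx for ANY field u."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        D[i][(i + 1) % n] = 0.5 / dx     # coefficient of the right neighbour
        D[i][(i - 1) % n] = -0.5 / dx    # coefficient of the left neighbour
    return D

def divergence_field(u, dx):
    """'Calculus' form: evaluate d/dx of a GIVEN field directly."""
    n = len(u)
    return [(u[(i + 1) % n] - u[(i - 1) % n]) / (2.0 * dx) for i in range(n)]

n = 16
dx = 2.0 * math.pi / n
u = [math.sin(i * dx) for i in range(n)]

D = divergence_matrix(n, dx)
# Calculus result obtained from the Method form: a full matrix-vector product
from_matrix = [sum(D[i][j] * u[j] for j in range(n)) for i in range(n)]
direct = divergence_field(u, dx)

print(max(abs(a - b) for a, b in zip(from_matrix, direct)))
```

Both routes give the same numbers, but the matrix route spends most of its effort multiplying by zeros, which is the inefficiency mentioned above; the matrix form earns its keep when the field is unknown and has to be solved for.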
Properties
A discretised form of the equation needs to consistently represent the original equation
1. Consistency: when the mesh spacing tends to zero, the discretisation should become exact
2. Stability: a solution method is stable if it does not magnify the errors that
appear during the numerical solution process
3. Convergence: the solution of the discretised equations should tend to the
exact solution of the differential equation as the mesh spacing tends to zero
4. Conservation: at steady-state and in the absence of sources and sinks the
amount of a conserved quantity leaving the system is equal to the amount
entering it
5. Boundedness: for variables that possess physical (sanity) bounds, boundedness should be preserved in the discretised form
6. Realisability: discretised version of the model should be such that solutions obtained numerically are physically realistic
7. Accuracy: produce the best possible solution on a given mesh

6.3 Discretisation of the Scalar Transport Equation

We shall now review the technique of second-order Finite Volume discretisation on polyhedral meshes. After specifying the spatial and temporal distribution, we shall visit a number of operators and present their explicit and implicit form.
Discretisation Methodology
1. We shall assemble the discretisation on a per-operator basis: visit each
operator in turn and describe a strategy for evaluating the term explicitly
and discretising it
2. Describe space and time: a computational mesh for the spatial domain
and time-steps covering the time interval
3. Postulate spatial and temporal variation of $\phi$ required for a discrete representation of field data
4. Integrate the operator over a cell
5. Use the spatial and temporal variation to interpret the operator in discrete
terms


Representation of a Field Variable


Equations we operate on work on fields: before we start, we need a discrete
representation of the field
Main solution variable will be stored in cell centroid: collocated cell-centred
finite volume method. Boundary data will be stored on face centres of
boundary faces
For some purposes, e.g. face flux, different data is required: in this case it will be a field over all faces in the mesh
Spatial variation can be used for interpolation in general: post-processing
tools typically use point-based data.
Nomenclature: Computational Cell
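[Figure: polyhedral computational cell P with volume $V_P$ and position vector $\mathbf{r}_P$, neighbour cell N, face f, face area vector $\mathbf{s}_f$ and delta vector $\mathbf{d}_f$]
The figure shows a convex polyhedral cell bounded by a set of convex polygons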
Cell volume is denoted by $V_P$
Point P is the computational point located at the cell centroid $\mathbf{x}_P$. The definition of the centroid reads:
$$\int_{V_P} (\mathbf{x} - \mathbf{x}_P) \, dV = \mathbf{0}. \tag{6.1}$$
For the cell, there is one neighbouring cell across each face. The neighbour cell and its cell centre will be marked with N.
Delta vector for the face f is defined as
$$\mathbf{d}_f = \overline{PN} \tag{6.2}$$


The face centre f is defined in the equivalent manner, using the centroid rule:
$$\int_{S_f} (\mathbf{x} - \mathbf{x}_f) \, dS = \mathbf{0}. \tag{6.3}$$
Face area vector $\mathbf{s}_f$ is a surface normal vector whose magnitude is equal to the area of the face. The face is numerically never flat, so the face centroid and area are calculated from the integrals.
$$\mathbf{s}_f = \int_{S_f} \mathbf{n} \, dS. \tag{6.4}$$

The fact that the face centroid does not necessarily lie on the plane of the face is not worrying: we are dealing with surface-integrated quantities. However, we shall require the cell centroid to lie within the cell
In practice, cell volume and face area are calculated by decomposition into triangles and pyramids
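The triangle decomposition of a face can be sketched as follows. This is a minimal illustration, not the implementation used by any particular code: the names `Vec3`, `FaceGeometry` and `faceGeometry` are hypothetical, the face is assumed to be given as an ordered list of points, and the triangles are assumed to fan out from the average of the points. The area vector is the sum of the triangle area vectors (equation 6.4) and the centroid is the area-weighted average of the triangle centroids (equation 6.3).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(double s, Vec3 a) { return {s*a.x, s*a.y, s*a.z}; }
Vec3 cross(Vec3 a, Vec3 b)
{
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
double mag(Vec3 a) { return std::sqrt(a.x*a.x + a.y*a.y + a.z*a.z); }

// Hypothetical container for the two face quantities of interest
struct FaceGeometry { Vec3 centre; Vec3 area; };

// Face centroid and area vector from a decomposition into triangles.
// Assumption: points are ordered around the face perimeter.
FaceGeometry faceGeometry(const std::vector<Vec3>& points)
{
    const std::size_t n = points.size();
    assert(n >= 3);

    // Apex of all triangles: plain average of the face points
    Vec3 estimate{0, 0, 0};
    for (const Vec3& p : points) estimate = estimate + p;
    estimate = (1.0/n)*estimate;

    Vec3 sumArea{0, 0, 0};            // sum of triangle area vectors
    Vec3 sumWeightedCentre{0, 0, 0};  // sum of |area| * triangle centroid
    double sumMag = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        const Vec3 p0 = points[i];
        const Vec3 p1 = points[(i + 1) % n];
        const Vec3 triArea = 0.5*cross(p1 - p0, estimate - p0);
        const Vec3 triCentre = (1.0/3.0)*(p0 + p1 + estimate);
        const double w = mag(triArea);
        sumArea = sumArea + triArea;
        sumWeightedCentre = sumWeightedCentre + w*triCentre;
        sumMag += w;
    }
    assert(sumMag > 0);  // reject degenerate faces
    return {(1.0/sumMag)*sumWeightedCentre, sumArea};
}
```

Cell volumes can be treated in the same spirit, with pyramids built from each face and an estimated cell centre.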
Types of faces in a mesh
Internal face, between two cells
Boundary face, adjacent to one cell only and pointing outwards of
the computational domain
When operating on a single cell, assume that all face area vectors $\mathbf{s}_f$ point outwards of cell P
Spatial and Temporal Variation
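Postulating spatial variation of $\phi$: second order discretisation in space
$$\phi(\mathbf{x}) = \phi_P + (\mathbf{x} - \mathbf{x}_P) \cdot (\nabla \phi)_P \tag{6.5}$$
This expression is given for each individual cell. Here, $\phi_P = \phi(\mathbf{x}_P)$.
Postulating linear variation in time: second order in time
$$\phi(t + \Delta t) = \phi^t + \Delta t \left( \frac{\partial \phi}{\partial t} \right)^t \tag{6.6}$$
where $\phi^t = \phi(t)$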


Polyhedral Mesh Support


In FVM, we have specified the shape function without reference to the actual cell shape (tetrahedron, prism, brick, wedge). The variation is always linear. Doing polyhedral Finite Volume should be straightforward!
In contrast, FEM specifies various forms of shape function for various shapes and provides options for higher order elements. Example: 27-node brick. However, I am not aware of a possible FEM formulation which is shape independent
Volume and Surface Integrals
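Discretisation is based on the integral form of the transport equation over each cell
$$\int_V \frac{\partial \phi}{\partial t} \, dV + \oint_S \phi \, (\mathbf{n} \cdot \mathbf{u}) \, dS - \oint_S \gamma \, (\mathbf{n} \cdot \nabla \phi) \, dS = \int_V q_v \, dV \tag{6.7}$$
Each term contains a volume or surface integral. Evaluate the integrals, using the prescribed variation in space
Volume integral
$$\int_V \phi \, dV = \int_V \left[ \phi_P + (\mathbf{x} - \mathbf{x}_P) \cdot (\nabla \phi)_P \right] dV = \phi_P \int_V dV + (\nabla \phi)_P \cdot \int_V (\mathbf{x} - \mathbf{x}_P) \, dV = \phi_P V_P$$
Surface integral splits into a sum over faces and evaluates in the same manner
$$\oint_S \mathbf{n} \phi \, dS = \sum_f \int_{S_f} \mathbf{n} \phi \, dS = \sum_f \int_{S_f} \mathbf{n} \left[ \phi_f + (\mathbf{x} - \mathbf{x}_f) \cdot (\nabla \phi)_f \right] dS = \sum_f \mathbf{s}_f \phi_f$$
The above integrals show how the assumption of linear variation of $\phi$ and the selection of P and f in the centroid eliminate the second part of the integral and create second-order discretisation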


6.4 Face Addressing

Software Organisation
Assuming that $\phi_f$ depends on the values of $\phi$ in the two cells around the face, P and N, let us attempt to calculate a surface integral for the complete mesh. Attention will be given to how the mesh structure influences the algorithm
Structured mesh. Introducing compass notation: East, West, North, South
The index of E, W, N and S can be calculated from the index n of P: n + 1, n - 1, n + colDim, n - colDim
Looping structure
Option 1: For all cells, visit East, West, North, South and sum up the values. Not too good: each face value is calculated twice. Also, poor optimisation for vector computers: we want to do a relatively short operation for lots and lots of cells
Option 2:
For all cells, do East face and add to P and E
For all cells, do North face and add to P and N
Better, but stumbles on the boundary. Nasty tricks, like zero-volume boundary cells on the W and S side of the domain.
OK, I can do a box. How about implementing a boundary condition on E, W, N and S? Ugly!
Block-structured mesh. Same kind of looping as above


On connections between blocks, the connectivity is no longer regular, e.g. on the right side I can get a N cell of another block
Solution: repeat the code for discretisation and boundary conditions
for all possible block-to-block connections
Repeated code is very bad for your health: needs to be changed consistently,
much more scope for errors, boring and difficult to keep running properly.
Tetrahedral mesh. Similar to structured mesh.
A critical difference to above is that in a tetrahedral mesh we cannot
calculate the neighbouring indices because the mesh is irregular. Thus,
cell-to-cell connectivity needs to be calculated during mesh generation
or at the beginning of the simulation and stored. Example: for each
tetrahedron, store 4 indices of neighbour cells across 4 faces in order.

Unstructured mesh. We can treat a block structured mesh in the same manner: forget about blocks and store neighbour indices for each cell. Much better: no code duplication.
Mixed cell types. When mixed types are present, we will re-use the unstructured mesh idea, but with holes: a tetrahedron only has 4 neighbours and a brick has got six
Option 1: For all cells, visit all neighbours. Whoops: short loop inside a long loop AND all face values calculated twice
Option 2:
For all neighbours, up to max number of neighbours
For all cells
. . . do the work if there is a neighbour
Works, but not too happy: I have to check if the neighbour is present


Face Addressing
Thinking about the above, all I want to do is to visit all internal faces and then all boundary faces. For an internal face, do the operation and put the result into the two cells around the face
Orient the face from P to N: add to P and subtract from N (because, seen from N, the face area vector points the wrong way)
Addressing is slightly different: for each internal face, record the left and right (owner and neighbour) cell index. The owner will be the first one in the cell list
Much cleaner, compact addressing, fast and efficient (some cache hit issues are hidden but we can work on that)
Most importantly, it no longer matters how many faces there are in the cell: nothing special is required for polyhedral cells
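The face loop described above can be sketched in a few lines. This is an illustration under stated assumptions: the function name `sumFaceValues` is hypothetical, internal faces are assumed to be numbered before boundary faces, and `owner` is assumed to cover all faces while `neighbour` covers internal faces only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a face field into cells using owner-neighbour addressing.
// Assumed layout: faces [0, neighbour.size()) are internal,
// faces [neighbour.size(), owner.size()) are boundary faces.
std::vector<double> sumFaceValues(
    std::size_t nCells,
    const std::vector<int>& owner,
    const std::vector<int>& neighbour,
    const std::vector<double>& faceValue)
{
    assert(owner.size() == faceValue.size());
    assert(neighbour.size() <= owner.size());

    std::vector<double> sum(nCells, 0.0);

    // Internal faces: add to owner, subtract from neighbour,
    // because the face area vector points out of the owner cell
    for (std::size_t f = 0; f < neighbour.size(); ++f)
    {
        sum[owner[f]] += faceValue[f];
        sum[neighbour[f]] -= faceValue[f];
    }

    // Boundary faces: one adjacent cell only, pointing outwards
    for (std::size_t f = neighbour.size(); f < owner.size(); ++f)
    {
        sum[owner[f]] += faceValue[f];
    }

    return sum;
}
```

Note that the loop never asks how many faces a cell has, which is the point made above about polyhedral cells.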
Gauss theorem
Gauss theorem is a tool we will use for handling the volume integrals of divergence and gradient operators
Divergence form
$$\int_{V_P} \nabla \cdot \mathbf{a} \, dV = \oint_{\partial V_P} d\mathbf{s} \cdot \mathbf{a} \tag{6.8}$$
Gradient form
$$\int_{V_P} \nabla \phi \, dV = \oint_{\partial V_P} d\mathbf{s} \, \phi \tag{6.9}$$
Note how the face area vector operates from the same side as the gradient operator: this fits with our definition of the gradient of a vector field
In the rest of the analysis, we shall look at the problem face by face. A diagram of a face is given below for 2-D. Working with vectors will ensure no changes are required when we need to switch from 2-D to 3-D.
A non-orthogonal case will be considered: vectors $\mathbf{d}$ and $\mathbf{s}$ are not parallel
[Figure: face f of cell P with face area vector $\mathbf{s}$]


6.5 Operator Discretisation

In the following section we will look at the discrete representation of various operators. Operators which do not interact can be looked at in isolation and will be considered in the increased order of complexity.

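6.5.1 Temporal Derivative

Time derivative captures the rate-of-change of $\phi$. We only need to handle the volume integral.
Using the prescribed temporal variation in a point, defining time-step size $\Delta t$
$t_{new} = t_{old} + \Delta t$, defining time levels $\phi^n$ and $\phi^o$
$$\phi^o = \phi(t = t_{old}) \tag{6.10}$$
$$\phi^n = \phi(t = t_{new}) \tag{6.11}$$
Temporal derivative, first and second order approximation
$$\frac{\partial \phi}{\partial t} = \frac{\phi^n - \phi^o}{\Delta t}$$
$$\frac{\partial \phi}{\partial t} = \frac{\frac{3}{2} \phi^n - 2 \phi^o + \frac{1}{2} \phi^{oo}}{\Delta t}$$
Thus, with volume integral:
$$\int_V \frac{\partial \phi}{\partial t} \, dV = \frac{\phi^n - \phi^o}{\Delta t} V_P \tag{6.12}$$
Calculus: given $\phi^n$, $\phi^o$ and $\Delta t$ create a field of the time derivative of $\phi$
Method: matrix representation. Since $\frac{\partial \phi}{\partial t}$ in cell P depends on $\phi_P$, the matrix will only have a diagonal contribution and a source
Diagonal value: $a_P = \frac{V_P}{\Delta t}$
Source contribution: $r_P = \frac{V_P \phi^o}{\Delta t}$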
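The two forms of the first-order temporal derivative can be placed side by side. This is a sketch with hypothetical names (`ddtCalculus`, `ddtMethod`, `DdtCoeffs`): the Calculus form returns a field of the derivative, while the Method form returns the diagonal and source contributions listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Calculus: evaluate (phiNew - phiOld)/deltaT cell by cell
std::vector<double> ddtCalculus(
    const std::vector<double>& phiNew,
    const std::vector<double>& phiOld,
    double deltaT)
{
    assert(phiNew.size() == phiOld.size());
    assert(deltaT > 0);
    std::vector<double> ddt(phiNew.size());
    for (std::size_t i = 0; i < ddt.size(); ++i)
    {
        ddt[i] = (phiNew[i] - phiOld[i])/deltaT;
    }
    return ddt;
}

// Method: matrix contributions of the volume-integrated term
struct DdtCoeffs
{
    std::vector<double> diag;    // a_P = V_P/deltaT
    std::vector<double> source;  // r_P = V_P*phiOld/deltaT
};

DdtCoeffs ddtMethod(
    const std::vector<double>& volume,
    const std::vector<double>& phiOld,
    double deltaT)
{
    assert(volume.size() == phiOld.size());
    assert(deltaT > 0);
    DdtCoeffs c{std::vector<double>(volume.size()),
                std::vector<double>(volume.size())};
    for (std::size_t i = 0; i < volume.size(); ++i)
    {
        c.diag[i] = volume[i]/deltaT;
        c.source[i] = volume[i]*phiOld[i]/deltaT;
    }
    return c;
}
```

With no other terms in the equation, solving `diag*phi = source` simply returns the old value, as expected for a zero rate of change.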


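6.5.2 Second Derivative in Time

Second derivative in time
This term will appear when we try to do stress analysis
Very similar to the above: second temporal derivative is calculated using two old-time levels of $\phi$
$$\frac{\partial^2 \phi}{\partial t^2} = \frac{\phi^n - 2 \phi^o + \phi^{oo}}{\Delta t^2}, \tag{6.13}$$
where $\phi^n = \phi(t + \Delta t)$, $\phi^o = \phi(t)$ and $\phi^{oo} = \phi(t - \Delta t)$.
One can also construct a second-order accurate form of $\frac{\partial^2 \phi}{\partial t^2}$ using three old-time levels ($\phi^{ooo} = \phi(t - 2 \Delta t)$):
$$\frac{\partial^2 \phi}{\partial t^2} = \frac{2 \phi^n - 5 \phi^o + 4 \phi^{oo} - \phi^{ooo}}{\Delta t^2}. \tag{6.14}$$
Exercise: what needs to be done if the time step is not constant between the two old time levels?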

6.5.3 Evaluation of the Gradient

Gauss Theorem
Evaluation of the gradient is a direct application of the Gauss Theorem
$$\int_{V_P} \nabla \phi \, dV = \oint_{\partial V_P} d\mathbf{s} \, \phi \tag{6.15}$$
Discretised form splits into a sum of face integrals
$$\oint_S \mathbf{n} \phi \, dS = \sum_f \mathbf{s}_f \phi_f \tag{6.16}$$
It still remains to evaluate the face value of $\phi$. Consistently with second-order discretisation, we shall assume linear variation between P and N
$$\phi_f = f_x \phi_P + (1 - f_x) \phi_N \tag{6.17}$$
Gradient evaluation almost exclusively used as a calculus operation
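As a calculus operation, the Gauss gradient combines equations (6.16) and (6.17) with face addressing. The sketch below is illustrative only: the function name `gaussGrad` and the argument layout are assumptions (internal faces first, `owner` for all faces, `neighbour` for internal faces, boundary face values supplied separately), and the cell gradient is taken as the face sum divided by the cell volume.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

// Gauss gradient: (grad phi)_P = (1/V_P) * sum_f s_f * phi_f
std::vector<Vec3> gaussGrad(
    const std::vector<double>& phi,          // cell values
    const std::vector<double>& volume,       // cell volumes
    const std::vector<int>& owner,           // all faces
    const std::vector<int>& neighbour,       // internal faces only
    const std::vector<Vec3>& Sf,             // face area vectors, all faces
    const std::vector<double>& fx,           // interpolation weights, internal faces
    const std::vector<double>& phiBoundary)  // values on boundary faces
{
    const std::size_t nInternal = neighbour.size();
    assert(phi.size() == volume.size());
    assert(owner.size() == Sf.size());
    assert(fx.size() == nInternal);
    assert(phiBoundary.size() == owner.size() - nInternal);

    std::vector<Vec3> grad(phi.size(), Vec3{0, 0, 0});

    auto add = [](Vec3& g, const Vec3& s, double value)
    {
        g.x += s.x*value;
        g.y += s.y*value;
        g.z += s.z*value;
    };

    // Internal faces: linear interpolation, equation (6.17)
    for (std::size_t f = 0; f < nInternal; ++f)
    {
        const int P = owner[f];
        const int N = neighbour[f];
        const double phiF = fx[f]*phi[P] + (1 - fx[f])*phi[N];
        add(grad[P], Sf[f], phiF);
        add(grad[N], Sf[f], -phiF);
    }

    // Boundary faces: use the stored boundary value
    for (std::size_t f = nInternal; f < owner.size(); ++f)
    {
        add(grad[owner[f]], Sf[f], phiBoundary[f - nInternal]);
    }

    for (std::size_t c = 0; c < grad.size(); ++c)
    {
        grad[c].x /= volume[c];
        grad[c].y /= volume[c];
        grad[c].z /= volume[c];
    }
    return grad;
}
```

For a linear field on a uniform mesh the result is exact, which is a convenient first check of any implementation.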


Least Squares Fit
On highly distorted meshes, accuracy of Gauss gradients is compromised
Least squares fit uses a set of neighbouring points without reference to cell geometry to assemble the gradient
Assuming a linear variation of a general variable $\phi$, the error at N is:
$$e_N = \phi_N - \left( \phi_P + \mathbf{d}_N \cdot (\nabla \phi)_P \right) \tag{6.18}$$
Minimising the least square error:
$$e_P^2 = \sum_N (w_N e_N)^2 \tag{6.19}$$
with the weighting function
$$w_N = \frac{1}{|\mathbf{d}_N|} \tag{6.20}$$
leads to the following expression:
$$(\nabla \phi)_P = \sum_N w_N^2 \, \mathbf{G}^{-1} \cdot \mathbf{d}_N (\phi_N - \phi_P), \tag{6.21}$$
where $\mathbf{G}$ is a $3 \times 3$ symmetric matrix:
$$\mathbf{G} = \sum_N w_N^2 \, \mathbf{d}_N \mathbf{d}_N \tag{6.22}$$
This produces a second-order accurate gradient irrespective of the arrangement of the neighbouring points
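Equations (6.20) to (6.22) translate almost directly into code for a single cell. In this sketch the names `leastSquaresGrad` and `det3` are hypothetical, and the small $3 \times 3$ system is solved with Cramer's rule purely for brevity; a production code would more likely store the inverse of $\mathbf{G}$ per cell.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
Vec3 cross(const Vec3& a, const Vec3& b)
{
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
// Determinant of the matrix with columns c0, c1, c2
double det3(const Vec3& c0, const Vec3& c1, const Vec3& c2)
{
    return dot(c0, cross(c1, c2));
}
void addScaled(Vec3& a, double s, const Vec3& b)
{
    a.x += s*b.x; a.y += s*b.y; a.z += s*b.z;
}

// Least squares gradient in cell P from neighbour centres and values
Vec3 leastSquaresGrad(
    const Vec3& xP, double phiP,
    const std::vector<Vec3>& xN, const std::vector<double>& phiN)
{
    assert(xN.size() == phiN.size());

    // Columns of the symmetric matrix G and the right-hand side
    Vec3 g0{0, 0, 0}, g1{0, 0, 0}, g2{0, 0, 0}, rhs{0, 0, 0};
    for (std::size_t i = 0; i < xN.size(); ++i)
    {
        const Vec3 d{xN[i].x - xP.x, xN[i].y - xP.y, xN[i].z - xP.z};
        const double w2 = 1.0/dot(d, d);       // w_N^2 = 1/|d_N|^2
        addScaled(g0, w2*d.x, d);              // G += w^2 d d
        addScaled(g1, w2*d.y, d);
        addScaled(g2, w2*d.z, d);
        addScaled(rhs, w2*(phiN[i] - phiP), d);
    }

    const double det = det3(g0, g1, g2);
    assert(std::fabs(det) > 0);  // neighbours must span 3-D space

    // Cramer's rule for G * grad = rhs
    return {det3(rhs, g1, g2)/det, det3(g0, rhs, g2)/det, det3(g0, g1, rhs)/det};
}
```

Because only the vectors $\mathbf{d}_N$ enter, the routine never looks at faces or cell shapes, in line with the remark above.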

6.5.4 Convection Term

Convection term captures the transport by the convective velocity. In general terms, convection can be seen as a coordinate transformation: information (variable) is carried by the flow field from one region to another. A concept of upwind or downwind direction is needed to understand the convective process.
Convection Operator and Face Flux
Convection operator splits into a sum of face integrals
Two different ways of writing the same term: integral and differential form
$$\oint_S \phi \, (\mathbf{n} \cdot \mathbf{u}) \, dS = \int_V \nabla \cdot (\phi \mathbf{u}) \, dV \tag{6.23}$$


Integration follows the same path as before
$$\oint_S \phi \, (\mathbf{n} \cdot \mathbf{u}) \, dS = \sum_f \phi_f \, (\mathbf{s}_f \cdot \mathbf{u}_f) = \sum_f \phi_f F \tag{6.24}$$
where $\phi_f$ is the face value of $\phi$ and
$$F = \mathbf{s}_f \cdot \mathbf{u}_f \tag{6.25}$$
is the face flux
In general, face flux is a face field giving the measure of the flow through the face. In some algorithms, it may come from different expressions, depending on the overall algorithm
Primary unknowns are the cell centre values, not face values
In order to close the system, we need a way of evaluating $\phi_f$ from the cell values $\phi_P$ and $\phi_N$: face interpolation
Face Interpolation Schemes
Simplest face interpolation: central differencing. Second-order accurate, but causes oscillations
$$\phi_f = f_x \phi_P + (1 - f_x) \phi_N \tag{6.26}$$
where $f_x = \overline{fN} / \overline{PN}$
Upwind differencing: taking into account the transportive property of the term: information comes from upstream. No oscillations, but smears the solution
$$F \phi_f = \max(F, 0) \, \phi_P + \min(F, 0) \, \phi_N \tag{6.27}$$
that is, $\phi_f = \phi_P$ for $F \geq 0$ and $\phi_f = \phi_N$ for $F < 0$
There exists a large number of schemes, trying to achieve good accuracy without causing oscillations: e.g. TVD and NVD families: $\phi_f = f(\phi_P, \phi_N, F, \ldots)$
We shall re-visit the schemes with examples
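The two basic interpolation schemes can be written as one-line functions. The sketch below uses hypothetical function names; the blended form is included only as an assumed example of combining the two with a fixed blending factor, not as a specific published scheme.

```cpp
#include <cassert>

// Central differencing, equation (6.26): linear interpolation with weight fx
double centralDifferencing(double phiP, double phiN, double fx)
{
    assert(fx >= 0 && fx <= 1);
    return fx*phiP + (1 - fx)*phiN;
}

// Upwind differencing: take the value from the upstream cell,
// decided by the sign of the face flux F (positive from P to N)
double upwindDifferencing(double phiP, double phiN, double F)
{
    return (F >= 0) ? phiP : phiN;
}

// Assumed fixed-factor blend: blend = 1 gives central, blend = 0 gives upwind
double blendedDifferencing(
    double phiP, double phiN, double fx, double F, double blend)
{
    assert(blend >= 0 && blend <= 1);
    return blend*centralDifferencing(phiP, phiN, fx)
         + (1 - blend)*upwindDifferencing(phiP, phiN, F);
}
```

The upwind value is always one of the two cell values, which is why it cannot create new extrema; the central value lies between them only as long as the weights stay within [0, 1].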


Matrix Coefficients
In the convection term, $\phi_f$ depends on the values of $\phi$ in two computational points: P and N.
Therefore, the solution in P will depend on the solution in N and vice versa, which means we have got an off-diagonal coefficient in the matrix.
In the case of central differencing on a uniform mesh, a contribution for a face f is
Diagonal value: $a_P = \frac{1}{2} F$
Off-diagonal value: $a_N = \frac{1}{2} F$
Source contribution: in our case, nothing. However, some other schemes may have additional (gradient-based) correction terms
Note that, in general, the P-to-N coefficient will be different from the N-to-P coefficient: the matrix is asymmetric
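A face-by-face assembly of the convection coefficients might look as follows. This is a sketch under assumptions: the names `LduCoeffs`, `assembleConvection` and `upwindWeights` are hypothetical, only internal faces are handled, and `weight` is the interpolation weight applied to the owner value (0.5 for central differencing on a uniform mesh, 1 or 0 for upwind depending on the flux direction).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct LduCoeffs
{
    std::vector<double> diag;   // one per cell
    std::vector<double> upper;  // per face: row owner, column neighbour
    std::vector<double> lower;  // per face: row neighbour, column owner
};

// Upwind expressed as a weight on the owner value
std::vector<double> upwindWeights(const std::vector<double>& flux)
{
    std::vector<double> w(flux.size());
    for (std::size_t f = 0; f < flux.size(); ++f)
    {
        w[f] = (flux[f] >= 0) ? 1.0 : 0.0;
    }
    return w;
}

// Coefficients of sum_f F*phi_f with phi_f = w*phi_P + (1 - w)*phi_N
LduCoeffs assembleConvection(
    std::size_t nCells,
    const std::vector<int>& owner,
    const std::vector<int>& neighbour,
    const std::vector<double>& flux,
    const std::vector<double>& weight)
{
    const std::size_t nFaces = neighbour.size();
    assert(owner.size() == nFaces);
    assert(flux.size() == nFaces);
    assert(weight.size() == nFaces);

    LduCoeffs A{std::vector<double>(nCells, 0.0),
                std::vector<double>(nFaces, 0.0),
                std::vector<double>(nFaces, 0.0)};

    for (std::size_t f = 0; f < nFaces; ++f)
    {
        const double F = flux[f];
        const double w = weight[f];

        // Row of the owner cell: flux leaves through this face
        A.diag[owner[f]] += F*w;
        A.upper[f] += F*(1 - w);

        // Row of the neighbour cell: same face, flux has opposite sign
        A.diag[neighbour[f]] -= F*(1 - w);
        A.lower[f] -= F*w;
    }
    return A;
}
```

For central differencing with `w = 0.5`, the owner row receives $\frac{1}{2} F$ on the diagonal and off-diagonal, while the neighbour row receives $-\frac{1}{2} F$, so `upper` and `lower` differ and the matrix is asymmetric.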

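6.5.5 Diffusion Term

Diffusion term captures the gradient transport
Diffusion Operator
Integration same as before
$$\oint_S \gamma \, (\mathbf{n} \cdot \nabla \phi) \, dS = \sum_f \int_{S_f} \gamma \, (\mathbf{n} \cdot \nabla \phi) \, dS = \sum_f \gamma_f \, \mathbf{s}_f \cdot (\nabla \phi)_f$$
$\gamma_f$ evaluated from cell values using central differencing
Evaluation of the face-normal gradient. If $\mathbf{s}_f$ and $\mathbf{d}_f = \overline{PN}$ are aligned, use difference across the face
$$\mathbf{s}_f \cdot (\nabla \phi)_f = |\mathbf{s}_f| \frac{\phi_N - \phi_P}{|\mathbf{d}_f|} \tag{6.28}$$
This is the component of the gradient in the direction of the $\mathbf{d}_f$ vector
For non-orthogonal meshes, a correction term may be necessary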


Matrix Coefficients
For an orthogonal mesh, a contribution for a face f is
Diagonal value: $a_P = -\gamma_f \frac{|\mathbf{s}_f|}{|\mathbf{d}_f|}$
Off-diagonal value: $a_N = \gamma_f \frac{|\mathbf{s}_f|}{|\mathbf{d}_f|}$
Source contribution: for orthogonal meshes, nothing. Non-orthogonal correction will produce a source
The P-to-N and N-to-P coefficients are identical: symmetric matrix
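The orthogonal diffusion coefficients can be assembled with the same face loop. As before, this is a sketch with hypothetical names (`LduCoeffs`, `assembleDiffusion`), restricted to internal faces, and it assembles the coefficients of the operator $\sum_f \gamma_f \mathbf{s}_f \cdot (\nabla \phi)_f$ itself, using equation (6.28).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct LduCoeffs
{
    std::vector<double> diag;   // one per cell
    std::vector<double> upper;  // per face: row owner, column neighbour
    std::vector<double> lower;  // per face: row neighbour, column owner
};

// Coefficients of sum_f gamma_f*|s_f|*(phi_N - phi_P)/|d_f| on an orthogonal mesh
LduCoeffs assembleDiffusion(
    std::size_t nCells,
    const std::vector<int>& owner,
    const std::vector<int>& neighbour,
    const std::vector<double>& gammaF,  // face diffusivity
    const std::vector<double>& magSf,   // face area magnitude
    const std::vector<double>& magDf)   // distance between P and N
{
    const std::size_t nFaces = neighbour.size();
    assert(owner.size() == nFaces);
    assert(gammaF.size() == nFaces);
    assert(magSf.size() == nFaces);
    assert(magDf.size() == nFaces);

    LduCoeffs A{std::vector<double>(nCells, 0.0),
                std::vector<double>(nFaces, 0.0),
                std::vector<double>(nFaces, 0.0)};

    for (std::size_t f = 0; f < nFaces; ++f)
    {
        assert(magDf[f] > 0);
        const double c = gammaF[f]*magSf[f]/magDf[f];

        // The same coefficient appears in both rows: symmetric matrix
        A.diag[owner[f]] -= c;
        A.upper[f] += c;
        A.diag[neighbour[f]] -= c;
        A.lower[f] += c;
    }
    return A;
}
```

Since `upper` and `lower` are always equal here, an implementation could store only one of them for this operator.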
Non-Orthogonal Correction
We wish to keep the part with coefficient creation as above even for non-orthogonal meshes . . . but this would not be correct
Solution: add a correction
Decompose the $\mathbf{s}$ vector into a component parallel with $\mathbf{d}$ and the rest, $\mathbf{k}$.
For the parallel component, same as above
Correction = $\mathbf{k} \cdot (\nabla \phi)_f$. The missing gradient will be calculated at cell centres and interpolated to the face in the same manner as the face values above
[Figure: non-orthogonal face f of cell P]
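One possible decomposition is sketched below. The text does not fix how the parallel component is chosen, so the projection of $\mathbf{s}$ onto $\mathbf{d}$ used here is an assumption (other choices of the parallel part exist), and the names `splitFaceArea` and `faceNormalGradient` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
double mag(const Vec3& a) { return std::sqrt(dot(a, a)); }

struct FaceAreaSplit
{
    Vec3 parallel;    // component of s parallel with d
    Vec3 correction;  // the rest, k = s - parallel
};

// Assumed choice: parallel part is the projection of s onto d
FaceAreaSplit splitFaceArea(const Vec3& s, const Vec3& d)
{
    assert(dot(d, d) > 0);
    const double scale = dot(d, s)/dot(d, d);
    const Vec3 parallel{scale*d.x, scale*d.y, scale*d.z};
    const Vec3 k{s.x - parallel.x, s.y - parallel.y, s.z - parallel.z};
    return {parallel, k};
}

// s . (grad phi)_f = |parallel|*(phiN - phiP)/|d| + k . (grad phi)_f
// The first part goes into matrix coefficients, the second into the source.
double faceNormalGradient(
    const Vec3& s, const Vec3& d,
    double phiP, double phiN,
    const Vec3& gradF)  // interpolated face gradient
{
    const FaceAreaSplit split = splitFaceArea(s, d);
    const double orthogonalPart = mag(split.parallel)*(phiN - phiP)/mag(d);
    const double correctionPart = dot(split.correction, gradF);
    return orthogonalPart + correctionPart;
}
```

For a linear field the sum of the two parts reproduces $\mathbf{s} \cdot \nabla \phi$ exactly; the sketch assumes $\mathbf{d} \cdot \mathbf{s} > 0$, i.e. the face area vector points from P towards N.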

6.5.6 Source and Sink Terms

Source and sink terms are integrated over the volume
$$\int_V q_v \, dV = q_v V_P \tag{6.29}$$

In general, qv may be a function of space and time, the solution itself, other
variables and can be quite complex. In complex physics cases, the source
term can carry the main interaction in the system. Example: complex
chemistry mechanisms. We shall for the moment consider only a simple
case.


Typically, linearisation with respect to $\phi$ is performed to promote stability and boundedness
$$q_v(\phi) = q_u + q_d \phi \tag{6.30}$$
where $q_d = \frac{\partial q_v(\phi)}{\partial \phi}$ and cases where $q_d < 0$ (sink) are treated separately

Matrix Coefficients
Source and sink terms do not depend on the neighbourhood
Diagonal value created for $q_d < 0$: boosting diagonal dominance
Explicit source contribution: $q_u$
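The separate treatment of sinks can be sketched as below. The names `SourceCoeffs` and `linearisedSource` are hypothetical, and the handling of $q_d \geq 0$ (evaluated explicitly with the currently available $\phi$) is an assumption consistent with the boundedness argument, not something stated in the text. The coefficients are written for an equation of the form $a_P \phi_P + \ldots = r$.

```cpp
#include <cassert>

struct SourceCoeffs
{
    double diag;    // contribution to a_P
    double source;  // contribution to the right-hand side r
};

// Linearised source q_v = q_u + q_d*phi integrated over a cell of volume V
SourceCoeffs linearisedSource(double qu, double qd, double phiCurrent, double V)
{
    assert(V > 0);
    if (qd < 0)
    {
        // Sink: moved to the left-hand side, where -q_d*V is positive
        // and increases diagonal dominance
        return {-qd*V, qu*V};
    }
    // Assumed treatment of q_d >= 0: fully explicit, using the available phi
    return {0.0, (qu + qd*phiCurrent)*V};
}
```

Treating a positive $q_d$ implicitly would reduce the diagonal instead, which is why only sinks are moved into the matrix.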

6.6 Numerical Boundary Conditions

Implementation of Numerical Boundary Conditions
Boundary conditions will contribute to the discretisation through the prescribed boundary behaviour
Boundary condition is specified for the whole equation
. . . but we will study them term by term to make the problem simpler
Dirichlet Condition: Fixed Boundary Value
Boundary condition specifies $\phi_f = \phi_b$
Convection term: fixed contribution $F \phi_b$. Source contribution only
Diffusion term: need to evaluate the near-boundary gradient
$$\mathbf{n} \cdot (\nabla \phi)_b = \frac{\phi_b - \phi_P}{|\mathbf{d}_b|} \tag{6.31}$$
This produces a source and a diagonal contribution
What about source, sink, rate of change?


Neumann and Gradient Condition
Boundary condition specifies the near-wall gradient $\mathbf{n} \cdot (\nabla \phi)_b = g_b$
Convection term: evaluate the boundary value of $\phi$ from the internal value and the known gradient
$$\phi_b = \phi_P + \mathbf{d}_b \cdot (\nabla \phi)_b = \phi_P + |\mathbf{d}_b| g_b \tag{6.32}$$
Use the evaluated boundary value as the face value. This creates a source and a diagonal contribution
Diffusion term: boundary-normal gradient $g_b$ can be used directly. Source contribution only
Mixed Condition
Combination of the above
Very easy: $\alpha$ times Dirichlet plus $(1 - \alpha)$ times Neumann.
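The boundary contributions listed above can be collected in a few small functions. This is a sketch with hypothetical names: each function returns the boundary face term in the form `diag*phiP + explicitPart`, so that the diagonal part can go into the matrix and the explicit part into the source with whatever sign convention the equation assembly uses.

```cpp
#include <cassert>

// Boundary face term written as diag*phiP + explicitPart
struct FaceTerm
{
    double diag;
    double explicitPart;
};

// Dirichlet, convection: F*phi_b, source contribution only
FaceTerm dirichletConvection(double F, double phiB)
{
    return {0.0, F*phiB};
}

// Dirichlet, diffusion: gamma*|s|*(phi_b - phi_P)/|d_b|, equation (6.31)
FaceTerm dirichletDiffusion(double gammaB, double magSb, double magDb, double phiB)
{
    assert(magDb > 0);
    const double c = gammaB*magSb/magDb;
    return {-c, c*phiB};
}

// Neumann, convection: F*(phi_P + |d_b|*g_b), equation (6.32)
FaceTerm neumannConvection(double F, double magDb, double gB)
{
    return {F, F*magDb*gB};
}

// Neumann, diffusion: gamma*|s|*g_b, source contribution only
FaceTerm neumannDiffusion(double gammaB, double magSb, double gB)
{
    return {0.0, gammaB*magSb*gB};
}

// Mixed: alpha times Dirichlet plus (1 - alpha) times Neumann
FaceTerm mixed(double alpha, const FaceTerm& dirichlet, const FaceTerm& neumann)
{
    assert(alpha >= 0 && alpha <= 1);
    return {alpha*dirichlet.diag + (1 - alpha)*neumann.diag,
            alpha*dirichlet.explicitPart + (1 - alpha)*neumann.explicitPart};
}
```

Setting `alpha` to 1 or 0 recovers the pure Dirichlet or Neumann contribution, so a single mixed implementation can serve all three cases.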
Symmetry Plane
Above boundary conditions were the same for scalars, vectors and tensors. On a symmetry plane, there will be a different condition on a scalar (zero gradient) and a vector (zero normal component and zero-gradient condition on the tangential component)
For scalars, the surface-normal gradient is zero
For vectors, draw a ghost cell on the opposite side of the boundary with the value the same as in P but with mirror transformation and do the discretisation as usual
Note: symmetry plane boundary condition for a vector couples the components
Cyclic, Periodic and Other Coupled Conditions
Cyclic and periodic boundary conditions couple near-boundary cells to cells on another boundary
A coordinate transformation is applied between the two sides: N to N and vice versa; the rest of the discretisation is performed as if this is an internal face of the mesh


6.7 Time-Marching Approach

Time Advancement
Having completed the discretisation of all operators we can now evolve the solution in time
There are two basic types of time advancement: implicit and explicit schemes. Properties of the algorithm critically depend on this choice, but both are useful under given circumstances
There is a number of methods, with slightly different properties, e.g. fractional step methods
Temporal accuracy depends on the choice of scheme and time step size
Steady-state simulations
If equations are linear, this can be solved in one go!
For non-linear equations or special discretisation practices, relaxation methods are used, which show characteristics of time integration (we are free to re-define the meaning of time)
Explicit Schemes
The algorithm uses the calculus approach, sometimes said to operate on residuals
In other words, the expressions are evaluated using the currently available $\phi$ and the new $\phi$ is obtained from the time term
Courant number limit is the major limitation of explicit methods: information can only propagate at the order of cell size in one time-step; otherwise the algorithm is unstable
Quick and efficient, no additional storage
Very bad for elliptic behaviour
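The Courant number limit is easy to see on a 1-D example. The sketch below is an assumed test problem, not taken from the notes: explicit first-order upwind convection on a uniform mesh with positive velocity `u`, with the hypothetical function name `explicitUpwindStep`. All terms are evaluated from the old values and the new value follows from the time term.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Courant number for a uniform 1-D mesh
double courantNumber(double u, double deltaT, double deltaX)
{
    return u*deltaT/deltaX;
}

// One explicit upwind step of d(phi)/dt + u*d(phi)/dx = 0, assuming u > 0
std::vector<double> explicitUpwindStep(
    const std::vector<double>& phiOld,
    double inletValue,
    double u, double deltaX, double deltaT)
{
    assert(u > 0 && deltaX > 0 && deltaT > 0);
    const double Co = courantNumber(u, deltaT, deltaX);

    std::vector<double> phiNew(phiOld.size());
    for (std::size_t i = 0; i < phiOld.size(); ++i)
    {
        // Upstream value: inlet for the first cell, left neighbour otherwise
        const double upstream = (i == 0) ? inletValue : phiOld[i - 1];
        phiNew[i] = phiOld[i] - Co*(phiOld[i] - upstream);
    }
    return phiNew;
}
```

For a Courant number of at most 1 the new value is a weighted average of old values and stays bounded; for a larger Courant number the weights become negative and a single step already produces values outside the original range.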


Implicit Schemes
The algorithm is based on the method: each term is expressed in matrix
form and the resulting linear system is solved
A new solution takes into account the new values in the complete domain:
ideal for elliptic problems
Implicitness removes the Courant number limitation: we can take larger time-steps
Substantial additional storage: matrix coefficients!

6.8 Equation Discretisation

The equation we are trying to solve is simply a collection of terms: therefore, assemble the contribution from each term
Initial condition. Specifies the initial distribution of $\phi$
. . . and we are ready to look at examples!

6.9 Convection Differencing Schemes

Testing differencing schemes on standard profiles


Simple second-order discretisation: upwind differencing, central differencing, blended differencing, NVD schemes
First-order scheme: Upwind differencing. Take into account the transport
direction
Exercise: how does all this relate to the discretisation of the Euler equation
described in the previous lectures?

6.10 Examples

Forms of convection discretisation and kinds of error they introduce


Positive and negative diffusion terms
Temporal discretisation: first and second-order, implicit or explicit discretisation

Chapter 7
Algebraic Linear System and Linear Solver Technology

7.1 Structure and Formulation of the Linear System

Matrix Assembly
Assembling the terms from the discretisation method
Time derivative: depends on old value
Convection: $\mathbf{u}$ provided; $\phi_f$ depends on $\phi_P$ and $\phi_N$
Diffusion: $\mathbf{s}_f \cdot (\nabla \phi)_f$ depends on $\phi_P$ and $\phi_N$
Thus, the value of the solution in a point depends on the values around it: this is always the case. For each computational point, we will create an equation
$$a_P \phi_P + \sum_N a_N \phi_N = r \tag{7.1}$$
where N denotes the neighbourhood of a computational point
Every time $\phi_P$ depends on itself, add contribution into $a_P$
Every time $\phi_P$ depends on $\phi_N$, add contribution into $a_N$
Other contributions into $r$
Examples of matrix structure

Structured mesh, Finite Volume


Unstructured mesh, Finite Volume
2-D linear quad elements, Finite Element
2-D linear triangular elements, Finite Element


Implicit and Explicit Methods
Explicit method: $\phi_P^n$ depends on the old neighbour values $\phi_N^o$
Visit each cell, and using available $\phi^o$ calculate
$$\phi_P^n = \frac{r - \sum_N a_N \phi_N^o}{a_P} \tag{7.2}$$
No additional information needed
Fast and efficient; however, poses the Courant number limitation: the information about boundary conditions is propagated very slowly and poses a limitation on the time-step size
Implicit method: $\phi_P^n$ depends on the new neighbour values $\phi_N^n$
$$\phi_P^n = \frac{r - \sum_N a_N \phi_N^n}{a_P} \tag{7.3}$$
Each cell value of $\phi$ for the new level depends on others: all equations need to be solved simultaneously
Linear System: Nomenclature
Equations form a linear system or a matrix
$$[A][\phi] = [r] \tag{7.4}$$
where $[A]$ contains matrix coefficients, $[\phi]$ is the value of $\phi_P$ in all cells and $[r]$ is the right-hand-side
$[A]$ is potentially very big: N cells $\times$ N cells
This is a square matrix: the number of equations equals the number of
unknowns
. . . but very few coefficients are non-zero. The matrix connectivity is always
local, potentially leading to storage savings if a good format can be found
What about non-linearity?

7.2 Matrix Storage Formats

Storing Matrix Coefficients
Dense matrix format. All matrix coefficients are stored, typically in a two-dimensional array


Diagonal coefficients: $a_{ii}$, off-diagonal coefficients: $a_{ij}$
Convenient for small matrices and direct solver use
Matrix coefficients represent a large chunk of memory: efficient operations imply memory management optimisation
It is impossible to say if the matrix is symmetric or not without floating point comparisons
Sparse matrix format. Only non-zero coefficients will be stored
Considerable savings in memory
Need a mechanism to indicate the position of non-zero coefficients
This is a static format, which imposes limitations on the operations: if a coefficient is originally zero, it is very expensive to set its value, because the format must be recalculated. This is usually termed a zero fill-in condition
Searching for coefficients is out of the question: need to formulate
sparse matrix algorithms
Sparse Matrix Storage
Compressed row format. Operate on a row-by-row basis. Diagonal
coefficients may or may not be stored separately
Coefficients stored in a single 1-D array. Coefficients are ordered in a
row-by-row structure
Addressing in two arrays: row start and column array
The column array records the column index for each coefficient. Size of column array equal to the number of stored coefficients (the off-diagonal coefficients, if the diagonal is stored separately)
The row array records the start and end of each row in the column array. Thus, row i has got coefficients from row[i] to row[i + 1] - 1. Size of row array equal to number of rows + 1
Coding [b] = [A] [x] with compressed row addressing
vectorProduct(b, x)
{
    // Assumes the diagonal is stored in coeffs together with the off-diagonals
    for (int n = 0; n < count; n++)
    {
        // Reset the result for row n, then accumulate its coefficients
        b[n] = 0;
        for (int ip = row[n]; ip < row[n+1]; ip++)
        {
            b[n] += coeffs[ip]*x[col[ip]];
        }
    }
}
Good for cases where coefficients are present in each row
Symmetric matrix cannot be recognised easily
Arrow format. Arbitrary sparse format. Diagonal coefficients typically
stored separately
Coefficients stored in 2-3 arrays: diagonal, upper triangle, lower triangle (if needed)
Diagonal addressing implied
Off-diagonal addressing stored in 2 arrays: owner or row index array
and neighbour or column index array. Size of addressing arrays equal
to the number of off-diagonal coefficients
The matrix structure (fill-in) is assumed to be symmetric: presence of
aij implies the presence of aji
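If the matrix coefficients are symmetric, only the upper triangle is stored: a symmetric matrix is easily recognised and only half of the coefficients are stored
Coding [b] = [A] [x] with arrow addressing
vectorProduct(b, x)
{
    // Diagonal contribution (diagCoeffs is an assumed name for the diagonal array)
    for (int i = 0; i < diagCoeffs.size(); i++)
    {
        b[i] = diagCoeffs[i]*x[i];
    }

    // Off-diagonal contributions, accumulated coefficient by coefficient
    int c0, c1;
    for (int n = 0; n < coeffs.size(); n++)
    {
        c0 = owner(n);
        c1 = neighbour(n);
        b[c0] += upperCoeffs[n]*x[c1];
        b[c1] += lowerCoeffs[n]*x[c0];
    }
}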
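A self-contained version of the arrow-format product is sketched below, for readers who want something that compiles. The struct `ArrowMatrix` and the function `multiply` are hypothetical names; the layout follows the description above, with a separately stored diagonal and one upper and one lower coefficient per owner-neighbour pair.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Arrow (owner-neighbour) sparse matrix storage
struct ArrowMatrix
{
    std::vector<double> diag;    // a_ii, one per row
    std::vector<double> upper;   // a_ij for row owner[n], column neighbour[n]
    std::vector<double> lower;   // a_ji for row neighbour[n], column owner[n]
    std::vector<int> owner;      // row index of each upper coefficient
    std::vector<int> neighbour;  // column index of each upper coefficient
};

// b = A*x
std::vector<double> multiply(const ArrowMatrix& A, const std::vector<double>& x)
{
    assert(A.diag.size() == x.size());
    assert(A.upper.size() == A.owner.size());
    assert(A.lower.size() == A.owner.size());
    assert(A.neighbour.size() == A.owner.size());

    // Diagonal addressing is implied
    std::vector<double> b(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        b[i] = A.diag[i]*x[i];
    }

    // Each off-diagonal pair contributes to two rows
    for (std::size_t n = 0; n < A.owner.size(); ++n)
    {
        const int c0 = A.owner[n];
        const int c1 = A.neighbour[n];
        b[c0] += A.upper[n]*x[c1];
        b[c1] += A.lower[n]*x[c0];
    }
    return b;
}
```

For a symmetric matrix, `lower` could alias `upper`, which is the storage saving mentioned above.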
Matrix Format and Discretisation Method
Relationship between the FV mesh and a matrix:
A cell value depends on other cell values only if the two cells share
a face. Therefore, a correspondence exists between the off-diagonal
matrix coefficients and the mesh structure
In practice, the matrix is assembled by looping through the mesh


Finite Element matrix assembly


Connectivity depends on the shape function and point-to-cell connectivity in the mesh
In assembly, a local matrix is assembled and then inserted into the
global matrix
Clever FEM implementations talk about the kinds of assembly without
the need for searching: a critical part of the algorithm

7.3 Linear Solver Technology

The Role of a Linear Solver
Good (implicit) numerical simulation software will spend 50-90 % of CPU time inverting matrices: performance of linear solvers is absolutely critical for the performance of the solver
Like in the case of mesh generation, we will couple the characteristics of a
discretisation method and the solution algorithm with the linear solver
Only a combination of a discretisation method and a linear solver will result
in a useful solver. Typically, properties of discretisation will be set up in a
way that allows the choice of an efficient solver
Solution Approach
Direct solver. The solver algorithm will perform a given number of operations, after which a solution will be obtained
Iterative solver. The algorithm will start from an initial solution and
perform a number of operations which will result in an improved solution.
Iterative solvers may be variants of the direct solution algorithm with special characteristics
Explicit method. New solution depends on currently available values of
the variables. The matrix itself is not required or assembled; in reality, the
algorithm reduces to point-Jacobi or Gauss-Seidel sweeps
Direct or Iterative Solver
Direct solvers: expensive in storage and CPU time but can handle any sort
of matrix
Iterative solvers: work by starting from an initial guess and improving the
solution. However, require matrices with special properties


For large problems, iterative solvers are the only option
Fortunately, the FVM matrices are ideally suited (read: carefully constructed) for use with iterative solvers
Full or Partial Convergence
When we are working on linear problems with linear discretisation in steady-state, the solution algorithm will only use a single solver call. This is very quick and very rare: linear systems are easy to simulate
Example: linear stress analysis. In some FEM implementations, the direct solver will be used exclusively for matrices under a given size
In cases of coupled or non-linear partial differential equations, the solution algorithm will iterate over the non-linearity. Therefore, the intermediate solution will only be used to update the non-linear parameters.
With this in mind, we can choose to use partial convergence, update the non-linearity and solve again: the capability of obtaining an intermediate solution at a fraction of the cost becomes beneficial
Moreover, in iterative procedures or time-marching simulations, it is quite easy to provide a good initial guess for the new solution: the solution from the previous iteration or time-step. This further improves the efficiency of the algorithm
Historically, in partial convergence cases, FEM solvers use tighter tolerances than FVM: 6 orders of magnitude for FEM vs. 1-2 orders of magnitude for the FVM

7.3.1 Direct Solver on Sparse Matrices

Properties of Direct Solvers


The most important property from the numerical point of view is that the
number of operations required for the solution is known and intermediate
solutions are of no interest
Matrix fill-in. When operating on a large sparse matrix like the one from
discretisation methods, the direct solver will create entries for coefficients
that were not previously present. As a consequence, formal matrix storage
requirement for a direct solver is a full matrix for a complete system: huge!
This is something that needs to be handled in a special way


Advantage of direct solvers is that they can handle any sort of well-posed
linear system
In reality, we additionally have to worry about pollution by the round-off
error. This is partially taken into account through the details of the solution
algorithm, but for really bad matrices this cannot be helped
Gaussian Elimination
Gaussian elimination is the easiest direct solver: standard mathematics.
Elimination is performed by combining row coefficients until the matrix becomes triangular. The elimination step is followed by backward substitution to obtain the solution.
Pivoting: in order to control the round-off error, equations are chosen for elimination based on the magnitude of the central (pivot) coefficient
Combination of matrix rows leads to fill-in
Gaussian elimination is one of the cases of L-U decomposition solvers and is rarely used in practice
The number of operations in direct solvers scales with the number of equations cubed: very expensive!
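For reference, a dense Gaussian elimination with partial pivoting fits in a few lines. This is a textbook sketch under a hypothetical name (`gaussSolve`), using dense storage for clarity; the three nested loops in the elimination step are the source of the cubic operation count.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Solve A*x = b by Gaussian elimination with partial (row) pivoting.
// A and b are taken by value because they are overwritten.
std::vector<double> gaussSolve(
    std::vector<std::vector<double>> A,
    std::vector<double> b)
{
    const std::size_t n = b.size();
    assert(A.size() == n);

    // Elimination: reduce A to upper triangular form
    for (std::size_t k = 0; k < n; ++k)
    {
        // Pivoting: pick the row with the largest coefficient in column k
        std::size_t pivot = k;
        for (std::size_t i = k + 1; i < n; ++i)
        {
            if (std::fabs(A[i][k]) > std::fabs(A[pivot][k])) pivot = i;
        }
        assert(A[pivot][k] != 0.0);  // singular matrix otherwise
        std::swap(A[k], A[pivot]);
        std::swap(b[k], b[pivot]);

        // Combine rows: this is where zero coefficients get filled in
        for (std::size_t i = k + 1; i < n; ++i)
        {
            const double m = A[i][k]/A[k][k];
            for (std::size_t j = k; j < n; ++j) A[i][j] -= m*A[k][j];
            b[i] -= m*b[k];
        }
    }

    // Backward substitution
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0; )
    {
        double sum = b[i];
        for (std::size_t j = i + 1; j < n; ++j) sum -= A[i][j]*x[j];
        x[i] = sum/A[i][i];
    }
    return x;
}
```

Without the row swap, the first test system below (zero in the top-left corner) would fail on the first division.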
Multi-Frontal Solver
When handling very sparse systems, the fill-in is very problematic: leads
to a large increase in storage size and accounts for the bulk of operations
Window approach: modern implementation of direct solvers
Looking at the structure of the sparse system, it can be established
that equation for P depends only on a small subset of other nodes:
in principle, it should be possible to eliminate the equation for P just
by looking at a small subset of the complete matrix
If all equations under elimination have overlapping regions of zero off-diagonal coefficients, there will be no fill-in in the shared regions of zeros!
Idea: Instead of operating on the complete matrix, create an active
window for elimination. The window will sweep over the matrix,
adding equations one by one and performing elimination immediately
The window matrix will be dense, but much smaller than the complete
matrix. The triangular matrix (needed for back-substitution) can be
stored in a sparse format


The window approach may reduce the cost of direct solvers by several orders of magnitude: acceptable for medium-sized systems. The number of
operations scales roughly with N M 2 , where N is the number of equations
and M is the maximum size of the solution window
Implementing Direct Solvers
The first step in the implementation is control of the window size: the
window changes its width dynamically and in the worst case may be the
size of the complete matrix
Maximum size of the window depends on the matrix connectivity and ordering of equation. Special optimisation software is used to control the
window size: matrix renumbering and ordering heuristics
Example: ordering of a Cartesian matrix for minimisation of the band
Most expensive operation in the multi-frontal solver is the calculation of
the Schur complement: the difference between the trivial and optimised
operation can be a factor of 10000! In practice, you will not attempt to write this yourself
(cache hit rate and processor-specific pre-fetch operations)
Basic Linear Algebra Subprograms (BLAS) library: special assembly code implementation for matrix manipulation. Code is optimised by hand and sometimes
written specially for a processor architecture. It is unlikely that hand-written code for the same operation achieves more than 10 % of the efficiency of
BLAS. A good implementation can now be measured by how much time the code
spends on operations outside of BLAS.

7.3.2 Simple Iterative Solvers

Iterative solvers
Performance of iterative solvers depends on the matrix characteristics.
The solver operates by incrementally improving the solution, which leads to
the concept of error propagation: if the error is augmented in the iterative
process, the solver diverges
The easiest way of analysing the error is in terms of eigen-spectrum of
the matrix
One categorisation of iterative solvers is based on their smoothing characteristics:


Smoothers, or smoothing algorithms, guarantee that the approximate
solution after each solver iteration will be closer to the exact solution
than all previous approximations. An example of a smoother would be
the Gauss-Seidel algorithm
For rougheners, this is not the case: in the iterative sequence, the
solution can temporarily move away from the exact solution, followed
by a series of convergence steps
Matrix Properties
A matrix is sparse if it contains only a few non-zero elements
A sparse matrix is banded if its non-zero coefficients are grouped in a
stripe around the diagonal
A sparse matrix has a multi-diagonal structure if its non-zero off-diagonal
coefficients form a regular diagonal pattern
A symmetric matrix is equal to its transpose
\[ [A] = [A]^T \tag{7.5} \]

A matrix is positive definite if for every $[\phi] \neq [0]$
\[ [\phi]^T [A] [\phi] > 0 \tag{7.6} \]

A matrix is diagonally dominant if in each row the sum of off-diagonal
coefficient magnitudes is equal to or smaller than the diagonal coefficient
\[ a_{ii} \geq \sum_{j=1}^{N} |a_{ij}| ; \quad j \neq i \tag{7.7} \]
and for at least one $i$
\[ a_{ii} > \sum_{j=1}^{N} |a_{ij}| ; \quad j \neq i \tag{7.8} \]
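The diagonal dominance condition of Eq. (7.7) and Eq. (7.8) can be checked row by row. The following is a minimal sketch, assuming dense row-major storage purely for brevity; the names `Matrix` and `isDiagonallyDominant` are hypothetical and a production code would work on a sparse format.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dense storage is an assumption made only to keep the sketch short
using Matrix = std::vector<std::vector<double>>;

// Returns true if a_ii >= sum_{j != i} |a_ij| holds in every row
// and the inequality is strict in at least one row
bool isDiagonallyDominant(const Matrix& A)
{
    bool strictInSomeRow = false;

    for (std::size_t i = 0; i < A.size(); ++i)
    {
        assert(A[i].size() == A.size());  // square matrix expected

        double offDiagSum = 0.0;
        for (std::size_t j = 0; j < A.size(); ++j)
        {
            if (j != i) offDiagSum += std::fabs(A[i][j]);
        }

        if (A[i][i] < offDiagSum) return false;
        if (A[i][i] > offDiagSum) strictInSomeRow = true;
    }

    return strictInSomeRow;
}
```

For a 1-D Laplacian with rows (-1, 2, -1), the interior rows satisfy the condition only with equality, while the two boundary rows satisfy it strictly, so the function returns true.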

Residual
Matrix form of the system we are trying to solve is
\[ [A][\phi] = [r] \tag{7.9} \]


The exact solution can be obtained by inverting the matrix $[A]$:
\[ [\phi] = [A]^{-1} [r] \tag{7.10} \]

This is how direct solvers operate: the number of operations required for the
inversion of $[A]$ is fixed and until the inverse is constructed we cannot get
$[\phi]$
Iterative solvers start from an approximate solution $[\phi]^0$ and generate a
set of solution estimates $[\phi]^k$, where $k$ is the iteration counter
Quality of the solution estimate is measured through a residual, or error
$[e]$:
\[ [e] = [r] - [A][\phi]^k \tag{7.11} \]
Residual is a vector showing how far the current estimate $[\phi]^k$ is from the
exact solution $[\phi]$. Note that for $[\phi]$, $[e]$ will be zero
$[e]$ defines a value for every equation (row) in $[A]$: we need a better way
to measure it. A residual norm $||r||$ can be assembled in many ways, but
usually
\[ ||r|| = \sum_{j=1}^{N} |r_j| \tag{7.12} \]
where $r_j$ is the entry of the residual $[e]$ for equation $j$

In CFD software, the residual norm is normalised further for easier comparison between the equations etc.
Convergence of the iterative solver is usually measured in terms of residual
reduction. When
\[ \frac{||r_k||}{||r_0||} < \epsilon \tag{7.13} \]
for a prescribed tolerance $\epsilon$, the system is considered to be solved.


Examples of Simple Solvers
The general idea of iterative solvers is to replace [A] with a matrix that is
easy to invert and approximates [A] and use this to obtain the new solution
Point-Jacobi solution
Gauss-Seidel solver
Tri-diagonal system and generalisation to 5- or 7-diagonal matrices


Propagation of information in simple iterative solvers. Point Jacobi
propagates the data one equation at a time: very slow. For Gauss-Seidel,
the information propagation depends on the matrix ordering and sweep
direction. In practice forward and reverse sweeps are alternated
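A Gauss-Seidel solver with the residual-reduction stopping test of Eq. (7.13) can be sketched as follows. This is a minimal illustration under stated assumptions: dense storage, forward sweeps only, and hypothetical names (`residualNorm`, `gaussSeidelSweep`, `solveGaussSeidel`) that do not come from the notes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // dense storage, for brevity only

// Sum of magnitudes of the residual [e] = [r] - [A][phi], Eq. (7.11)-(7.12)
double residualNorm(const Matrix& A, const Vector& phi, const Vector& r)
{
    double norm = 0.0;
    for (std::size_t i = 0; i < A.size(); ++i)
    {
        double e = r[i];
        for (std::size_t j = 0; j < A.size(); ++j) e -= A[i][j]*phi[j];
        norm += std::fabs(e);
    }
    return norm;
}

// One forward sweep: each equation is solved for its own unknown using
// the newest available values of all other unknowns
void gaussSeidelSweep(const Matrix& A, Vector& phi, const Vector& r)
{
    for (std::size_t i = 0; i < A.size(); ++i)
    {
        double sum = r[i];
        for (std::size_t j = 0; j < A.size(); ++j)
        {
            if (j != i) sum -= A[i][j]*phi[j];
        }
        phi[i] = sum/A[i][i];
    }
}

// Sweeps until ||r_k||/||r_0|| < tolerance.
// Returns the number of sweeps used, or -1 if maxSweeps was not enough
int solveGaussSeidel
(
    const Matrix& A, Vector& phi, const Vector& r,
    double tolerance, int maxSweeps
)
{
    assert(A.size() == phi.size() && A.size() == r.size());

    const double initialNorm = residualNorm(A, phi, r);
    if (initialNorm == 0.0) return 0;

    for (int k = 1; k <= maxSweeps; ++k)
    {
        gaussSeidelSweep(A, phi, r);
        if (residualNorm(A, phi, r)/initialNorm < tolerance) return k;
    }
    return -1;
}
```

Point Jacobi differs only in that the sweep reads from a copy of the previous iterate instead of the values already updated in the current sweep.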
Krylov space solvers
Looking at the direct solver, we can imagine that it operates in N-dimensional space, where N is the number of equations, and searches
for a point which minimises the residual
In Gaussian elimination, we will be visiting each direction of the N-dimensional space and eliminating it from further consideration
The idea of Krylov space solvers is that an approximate solution can be
found more efficiently if we look for search directions more intelligently.
A residual vector [e] at each point contains the direction we should
search in; additionally, we would like to always search in a direction
orthogonal to all previous search directions
On their own, Krylov space solvers are poor; however, when matrix
preconditioning is used, we can assemble efficient methods. This is
an example of an iterative roughener
In terms of performance, the number of operations in Krylov space
solvers scales with $N \log(N)$, where $N$ is the number of unknowns
For more details, see Shewchuk: An Introduction to the Conjugate Gradient
Method Without the Agonizing Pain
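As an illustration of the Krylov space idea, the sketch below implements the plain conjugate gradient method without preconditioning. It assumes a symmetric positive definite matrix in dense storage and measures convergence with the Euclidean norm of the residual rather than the sum of magnitudes; all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // dense storage, for brevity only

double dot(const Vector& a, const Vector& b)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i]*b[i];
    return sum;
}

Vector multiply(const Matrix& A, const Vector& x)
{
    Vector y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
    {
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j]*x[j];
    }
    return y;
}

// Unpreconditioned conjugate gradient for symmetric positive definite [A].
// Returns the number of iterations, or -1 if maxIter was not enough
int conjugateGradient
(
    const Matrix& A, Vector& phi, const Vector& r,
    double tolerance, int maxIter
)
{
    assert(A.size() == phi.size() && A.size() == r.size());
    const std::size_t n = A.size();

    Vector e = multiply(A, phi);
    for (std::size_t i = 0; i < n; ++i) e[i] = r[i] - e[i];  // residual
    Vector d = e;                       // first search direction
    double eDotE = dot(e, e);
    const double initialNorm = std::sqrt(eDotE);
    if (initialNorm == 0.0) return 0;

    for (int k = 1; k <= maxIter; ++k)
    {
        const Vector Ad = multiply(A, d);
        const double alpha = eDotE/dot(d, Ad);   // step length along d
        for (std::size_t i = 0; i < n; ++i)
        {
            phi[i] += alpha*d[i];
            e[i] -= alpha*Ad[i];
        }

        const double eDotENew = dot(e, e);
        if (std::sqrt(eDotENew)/initialNorm < tolerance) return k;

        // New direction: current residual made conjugate to the old one
        const double beta = eDotENew/eDotE;
        for (std::size_t i = 0; i < n; ++i) d[i] = e[i] + beta*d[i];
        eDotE = eDotENew;
    }
    return -1;
}
```

In exact arithmetic the method terminates in at most N iterations; in practice it is stopped much earlier and combined with a preconditioner.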

7.3.3 Algebraic Multigrid

Basic Idea of Multigrid


Operation of a multigrid solver relies on the fact that a high-frequency error
is easy to eliminate: consider the operation of the Gauss-Seidel algorithm
Once the high-frequency error is removed, iterative convergence slows down.
At the same time, the error that looks smooth on the current mesh will
behave as high-frequency on a coarser mesh
If the mesh is coarser, the error is both eliminated faster and in fewer
iterations.
Thus, in multigrid the solution is mapped through a series of coarse levels,
each of the levels being responsible for a band of error


Algebraic Multigrid (AMG)


When performing CFD operations, we can readily assemble a multigrid
algorithm by creating a series of coarse grids. This in itself is not trivial:
convexness of cells, issues with boundary conditions, etc.
In terms of matrices and linear solvers, the same principle should apply:
our matrices come from discretisation! However, it would be impractical to
build a series of coarse meshes just to solve a system of linear equations
At the same time, we can readily recognise that all the information about
the coarse mesh (and therefore the coarse matrix) already exists in the fine
mesh!
Example: assembling the convection, diffusion and source operator on the
imaginary coarse mesh directly from the data on a fine mesh
Algebraic multigrid generalises this idea: a coarse matrix is created directly
from the fine matrix
An alternative view of multigrid can be propagation of information from
one boundary to another. In elliptic systems, each point in the solution
depends on every other point. Thus, it is critical to transfer the boundary
condition influences to each point in the domain, which multigrid does efficiently
through the coarse levels
Algebraic Multigrid Operations
Matrix coarsening. This is roughly equivalent to creation of coarse mesh
cells. Two main approaches are:
Aggregative multigrid (AAMG). Equations are grouped into clusters in a manner similar to grouping fine cells to form a coarse cell. The
grouping pattern is based on the strength of off-diagonal coefficients
Selective multigrid (SAMG). In selective multigrid, the equations
are separated into two groups: the coarse and fine equations. Selection
rules specify that no two coarse points should be connected to each
other, creating a maximum possible set. Fine equations form a fine-to-coarse interpolation method (restriction matrix), [r], which is used
to form the coarse system.
Restriction of residual handles the transfer of information from fine to
coarse levels. A fine residual, containing the smooth error component, is
restricted and used as the r.h.s. (right-hand-side) of the coarse system.


Prolongation of correction. Once the coarse system is solved, coarse
correction is prolongated to the fine level and added to the solution. Interpolation introduces aliasing errors, which can be efficiently removed by
smoothing on the fine level.
Multigrid smoothers. The bulk of multigrid work is performed by transferring the error and correction through the multigrid levels. Smoothers
only act to remove high-frequency error: simple and quick. Smoothing can
be applied on each level:
Before the restriction of the residual, called pre-smoothing
After the coarse correction has been added, called post-smoothing
Algorithmically, post-smoothing is more efficient
Cycle types. Based on the above, AMG can be considered a two-level
solver. In practice, the coarse level solution is also assembled using multigrid, leading to multi-level systems.
The most important multigrid cycle types are
V-cycle: residual reduction is performed all the way to the coarsest
level, followed by prolongation and post-smoothing. Mathematically,
it is possible to show that the V-cycle is optimal and leads to the
solution algorithm where the number of operations scales linearly with
the number of unknowns
Flex cycle. Here, the creation of coarse levels is done on demand,
when the smoother stops converging efficiently
Other cycles, e.g. W-cycle or F-cycle, are variations on the V-cycle theme
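The multigrid operations listed above can be put together in a two-level cycle. The sketch below is a simplified illustration and not the algorithm of any particular code: it assumes dense storage, an even number of equations, and a fixed pairwise aggregation (equations 2I and 2I+1 form cluster I) in place of a strength-based grouping. The coarse system is solved approximately by repeated Gauss-Seidel sweeps. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // dense storage keeps the sketch short

// Residual vector [e] = [r] - [A][phi]
Vector residual(const Matrix& A, const Vector& phi, const Vector& r)
{
    Vector e(r);
    for (std::size_t i = 0; i < A.size(); ++i)
    {
        for (std::size_t j = 0; j < A.size(); ++j) e[i] -= A[i][j]*phi[j];
    }
    return e;
}

// Gauss-Seidel smoother: nSweeps forward sweeps
void smooth(const Matrix& A, Vector& phi, const Vector& r, int nSweeps)
{
    for (int s = 0; s < nSweeps; ++s)
    {
        for (std::size_t i = 0; i < A.size(); ++i)
        {
            double sum = r[i];
            for (std::size_t j = 0; j < A.size(); ++j)
            {
                if (j != i) sum -= A[i][j]*phi[j];
            }
            phi[i] = sum/A[i][i];
        }
    }
}

// Matrix coarsening by summing fine coefficients cluster by cluster
Matrix coarsen(const Matrix& A)
{
    assert(A.size() % 2 == 0);
    const std::size_t nCoarse = A.size()/2;
    Matrix Ac(nCoarse, Vector(nCoarse, 0.0));
    for (std::size_t i = 0; i < A.size(); ++i)
    {
        for (std::size_t j = 0; j < A.size(); ++j) Ac[i/2][j/2] += A[i][j];
    }
    return Ac;
}

void twoLevelCycle
(
    const Matrix& A, const Matrix& Ac, Vector& phi, const Vector& r
)
{
    smooth(A, phi, r, 1);                          // pre-smoothing

    const Vector e = residual(A, phi, r);
    Vector rCoarse(Ac.size(), 0.0);
    for (std::size_t i = 0; i < e.size(); ++i)
    {
        rCoarse[i/2] += e[i];                      // restriction of residual
    }

    Vector correction(Ac.size(), 0.0);
    smooth(Ac, correction, rCoarse, 50);           // approximate coarse solve

    for (std::size_t i = 0; i < phi.size(); ++i)
    {
        phi[i] += correction[i/2];                 // prolongation of correction
    }

    smooth(A, phi, r, 1);                          // post-smoothing
}
```

Replacing the approximate coarse solve by a recursive call to the same cycle on the coarse system turns this two-level sketch into a V-cycle.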

7.4 Parallelisation and Vectorisation

Solver Performance
Time spent in the solvers is a significant amount of the total simulation
time. Therefore, efficiency of solvers and choice of algorithm is critical for
the overall performance
We can make the simulation run faster either by devising a better solution
algorithm (hard!) or by performing operations and handling data faster
The subject here is rarely the solution algorithm itself: the design of solvers
is typically left to mathematicians. Instead, we are looking for operations
that can be efficiently executed on computers



When designing solvers to work on high-performance computers, the two main
devices we have at our disposal are:
Vector registers
Multiple CPUs
Other (and configurable) structures for efficient execution include pipelining
and short vector optimisation, but the principle is the same
Vector Operations
We can simplify numerous solver operations into vector-matrix multiply
c = a*x + b

This is what the computer does for us


An operation like the above uses computer resources in 3 ways
Configuring the registers
Fetching the data
Performing the operation
The idea of vector computers is to perform this operation simultaneously
on a large amount of data, defined by the vector length (e.g. 256 or 1024
operations together)
Efficient algorithm should therefore perform the same operation on a large
data-set, without if-statements, function calls, data inter-dependency etc.
For practical purposes, vector computers are (currently) dead: however,
lessons on vector programming are extremely useful on current-generation
chips
Parallelisation
The idea of parallelisation is to split the large loop of
for (int i = 0; i < N; i++)
{
    c[i] = a[i]*x[i] + b[i];
}


between a number of CPUs, with each CPU responsible for its own part
Problem decomposition can be done in several ways
Algorithmic decomposition, or decomposition over the numerical
procedure, with each CPU being responsible for its own part of the
algorithm
Time decomposition, or decomposition over time steps
Domain decomposition, where each CPU is responsible for its part
of the computational domain
Fine-grain decomposition decomposes the solver on a loop-by-loop basis:
typically done by the compiler
A critical part of the parallel solution approach is to ensure that every
CPU has approximately the same amount of work; otherwise, CPUs end
up waiting for each other
Iterative solvers parallelise well: operations have weak data dependency and
few synchronisation points. It is relatively easy to establish the necessary
communication pattern for data dependency between CPUs
In direct solvers, the problem is more serious: multiple solution windows
can propagate the solution front independently on each CPU, but problems
arise when two windows on two separate CPUs need to merge
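The requirement that every CPU receives approximately the same amount of work can be illustrated on the multiply-add loop shown earlier. The sketch below only computes which contiguous index range each CPU would own and runs the loop on one such range; the actual launch mechanism (threads, message passing) is left out on purpose. `Range`, `localRange` and `multiplyAddLocal` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index range [begin, end) owned by one CPU
struct Range
{
    std::size_t begin;
    std::size_t end;
};

// Splits N loop iterations between nProcs CPUs so that the
// range sizes differ by at most one iteration
Range localRange(std::size_t N, std::size_t nProcs, std::size_t rank)
{
    assert(nProcs > 0 && rank < nProcs);

    const std::size_t base = N/nProcs;
    const std::size_t extra = N % nProcs;  // first 'extra' ranks get one more

    const std::size_t begin = rank*base + std::min(rank, extra);
    const std::size_t size = base + (rank < extra ? 1 : 0);

    return {begin, begin + size};
}

// The part of the loop c = a*x + b executed by a single CPU
void multiplyAddLocal
(
    const std::vector<double>& a,
    const std::vector<double>& x,
    const std::vector<double>& b,
    std::vector<double>& c,
    Range range
)
{
    for (std::size_t i = range.begin; i < range.end; ++i)
    {
        c[i] = a[i]*x[i] + b[i];
    }
}
```

For N = 10 and 3 CPUs the ranges are [0, 4), [4, 7) and [7, 10). Since no iteration reads data written by another, the three parts can run without synchronisation.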


Chapter 8
Solution Methods for Coupled Equation Sets

8.1 Examining the Coupling in Equation Sets

Nature of Coupling
The nature of coupling is not usually examined in general terms: all our
equations look very similar
Additionally, the nature and strength of coupling depends not only on the
equation but also on the state of the system and material properties. Example: change of viscosity in the fluid flow equations. Typically, such changes
are described in terms of dimensionless groups, e.g. Reynolds number Re
In principle, difficult systems of equations encompass a large range of space
and time-scales. In fact, the equations are not the culprit: we are trying to
assemble the solution on an inappropriate scale
Inappropriate scale is usually chosen for efficiency: the actual scale of the
physical phenomenon may be very fast and lead to extremely long simulation times
Example: chemical reactions in fully premixed flames

8.2 Examples of Systems of Simultaneous Equations

In the next paragraphs, we shall review several mathematical models from the
point of view of equation interaction.


Porous Media: Darcy's Equation
Darcy's Law:
\[ \mathbf{u} = -\gamma \nabla p \tag{8.1} \]
Darcy's law, combined with the mass conservation equation for the incompressible liquid, creates the Laplace equation which controls the system
\[ \nabla \cdot (\gamma \nabla p) = 0 \tag{8.2} \]
Velocity field is obtained from the pressure distribution in a post-processing
step.
Cases where $\gamma$ is a scalar field represent uniform flow resistance in all directions: isotropic porous medium
For directed Darcy's law, the flow resistance may depend on spatial
direction, e.g. flow straighteners. This produces an orthotropic resistance
tensor:
\[ \mathbf{u} = -\boldsymbol{\gamma} \cdot \nabla p \tag{8.3} \]
where
\[ \boldsymbol{\gamma} = \begin{bmatrix} \gamma_{xx} & 0 & 0 \\ 0 & \gamma_{yy} & 0 \\ 0 & 0 & \gamma_{zz} \end{bmatrix} \tag{8.4} \]
A general form, where $\boldsymbol{\gamma}$ is a full symmetric tensor, is also possible. In a
generalised form of Darcy's law, the resistance tensor can also be a function
of local velocity
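Linear Stress Analysis
Solution variable: displacement vector $\mathbf{d}$
\[ \frac{\partial^2 (\rho \mathbf{d})}{\partial t^2} - \nabla \cdot [\mu \nabla \mathbf{d} + \mu (\nabla \mathbf{d})^T + \lambda \mathbf{I} \, \mathrm{tr}(\nabla \mathbf{d})] = \rho \mathbf{f}. \tag{8.5} \]
Equation is assembled by substituting the linear stress-strain relationship
into the momentum (force balance) equation:
\[ \boldsymbol{\sigma} = 2 \mu \boldsymbol{\varepsilon} + \lambda \, \mathrm{tr}(\boldsymbol{\varepsilon}) \, \mathbf{I} \tag{8.6} \]
and
\[ \boldsymbol{\varepsilon} = \frac{1}{2} [\nabla \mathbf{d} + (\nabla \mathbf{d})^T] \tag{8.7} \]
Displacement is a vector variable and the equation is linear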


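Incompressible Navier-Stokes Equations
Solution variables: velocity $\mathbf{u}$ and pressure $p$
Momentum equation:
\[ \frac{\partial \mathbf{u}}{\partial t} + \nabla \cdot (\mathbf{u} \mathbf{u}) - \nabla \cdot (\nu \nabla \mathbf{u}) = -\nabla p \tag{8.8} \]
Continuity equation:
\[ \nabla \cdot \mathbf{u} = 0 \tag{8.9} \]
$\nu$ is the kinematic viscosity and $p$ is the kinematic pressure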


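Compressible Navier-Stokes Equations
Solution variables: density $\rho$, momentum $\rho \mathbf{u}$ and energy $\rho e$
Continuity equation:
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) = 0 \tag{8.10} \]
Momentum equation:
\[ \frac{\partial (\rho \mathbf{u})}{\partial t} + \nabla \cdot (\rho \mathbf{u} \mathbf{u}) - \nabla \cdot \left[ \mu \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right) \right] = \rho \mathbf{g} - \nabla \left( P + \frac{2}{3} \mu \nabla \cdot \mathbf{u} \right) \tag{8.11} \]
Energy equation:
\[ \begin{aligned} \frac{\partial (\rho e)}{\partial t} + \nabla \cdot (\rho e \mathbf{u}) - \nabla \cdot (\lambda \nabla T) = {} & \rho \mathbf{g} \cdot \mathbf{u} - \nabla \cdot (P \mathbf{u}) - \nabla \cdot \left( \frac{2}{3} \mu (\nabla \cdot \mathbf{u}) \mathbf{u} \right) \\ & + \nabla \cdot \left[ \mu \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right) \cdot \mathbf{u} \right] + \rho Q, \end{aligned} \tag{8.12} \]
Equation of state
\[ \rho = \rho(P, T) \tag{8.13} \]
The transport coefficients $\lambda$ and $\mu$ are also functions of the thermodynamic
state variables:
\[ \lambda = \lambda(P, T), \tag{8.14} \]
\[ \mu = \mu(P, T). \tag{8.15} \]
Pressure or density formulation?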


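$k - \epsilon$ Turbulence Model
Solution variables: turbulence kinetic energy $k$ and its dissipation $\epsilon$
$k$-equation:
\[ \frac{\partial k}{\partial t} + \nabla \cdot (\mathbf{u} k) - \nabla \cdot (\nu_t \nabla k) = G - \epsilon, \tag{8.16} \]
with
\[ \nu_t = C_\mu \frac{k^2}{\epsilon} \tag{8.17} \]
\[ G = \nu_t [\nabla \mathbf{u} + (\nabla \mathbf{u})^T] : \nabla \mathbf{u} \tag{8.18} \]
$\epsilon$-equation:
\[ \frac{\partial \epsilon}{\partial t} + \nabla \cdot (\mathbf{u} \epsilon) - \nabla \cdot (\nu_t \nabla \epsilon) = C_1 \frac{\epsilon}{k} G - C_2 \frac{\epsilon^2}{k}, \tag{8.19} \]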

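Chemical Reactions
Example set of chemical reactions
\[ 3 \, C_1 \rightarrow C_2 + 2 \, C_3 + 9 \, H \tag{8.20} \]
\[ C_2 \rightarrow AH + 2 \, H \tag{8.21} \]
\[ 15 \, C_3 \rightarrow 2 \, C_{12}A_7 + C_2 + 21 \, CH + 66 \, H \tag{8.22} \]
Solution variables: species concentration $C_1$, $C_2$ and $C_3$
Transport equations for species
\[ \frac{\partial C_1}{\partial t} + \nabla \cdot (C_1 \mathbf{u}) - \nabla \cdot (\gamma_c \nabla C_1) = 3 C_1 S_1 + \frac{2}{3} C_2 S_2 + \frac{44}{15} C_3 S_3 + \frac{4}{45} C_2 S_2 \tag{8.23} \]
\[ \frac{\partial C_2}{\partial t} + \nabla \cdot (C_2 \mathbf{u}) - \nabla \cdot (\gamma_c \nabla C_2) = 2 C_2 S_2 \tag{8.24} \]
\[ \frac{\partial C_3}{\partial t} + \nabla \cdot (C_3 \mathbf{u}) - \nabla \cdot (\gamma_c \nabla C_3) = \frac{22}{5} C_3 S_3 + \frac{2}{15} C_2 S_2 \tag{8.25} \]
Arrhenius law (reaction rate):
\[ S_i(T) = A \exp \left( -\frac{E_i}{R T} \right) \tag{8.26} \]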

8.3 Solution Strategy for Coupled Sets

We shall review the options of handling the coupled vector variables or coupled
equation sets in a numerical solution algorithm.
Coupled solution algorithms are designed to handle systems of equations in
the most efficient way possible
The option of solving all equations together always exists, but it is very
expensive and in most cases unnecessary
The objective is to treat important and nice terms implicitly and handle the coupling algorithmically whenever possible
Numerically well behaved terms help with the stability of discretisation
Time derivative: inertial behaviour
Diffusion: smoothing: no new minima or maxima are introduced
Convection: coordinate transformation
Linear and bounded sources and sinks: control of boundedness

8.3.1 Segregated Approach

Segregated Solution Technique


In the segregated approach, the set of equations will be solved one at a
time. The coupling terms will be evaluated from the currently available
solution and lagged
For vector equations, vector components will be solved individually. Component-to-component coupling terms are lagged (source/sink) by one iteration
In algorithmic terms, the segregated solver corresponds to successive substitution: there is no guarantee a converged solution can be reached
Equation segregation makes smaller matrices: one for each component.
Matrices are solved one at a time, re-using the storage arrays and are usually
identical for all components (apart from the source/sink terms)
Equation segregation is not always desirable: it may convert a linear component-coupled problem into a non-linear one and require iterations
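The behaviour of the segregated approach can be seen on the smallest possible coupled problem: two linear equations for two unknowns, solved one at a time with the coupling term lagged. The sketch below is a hypothetical illustration (the struct and function names are not from the notes); it shows both the need for iterations on a linear problem and the lack of a convergence guarantee.

```cpp
#include <cassert>
#include <cmath>

// a11*x + a12*y = b1
// a21*x + a22*y = b2
struct Coupled2x2
{
    double a11, a12, a21, a22, b1, b2;
};

// Successive substitution: each equation is solved for its own unknown,
// the coupling to the other unknown is taken from the available solution.
// Returns the number of iterations, or -1 if no convergence was reached
int solveSegregated
(
    const Coupled2x2& s, double& x, double& y,
    double tolerance, int maxIter
)
{
    for (int k = 1; k <= maxIter; ++k)
    {
        const double xOld = x;
        const double yOld = y;

        x = (s.b1 - s.a12*y)/s.a11;  // first equation, y lagged
        y = (s.b2 - s.a21*x)/s.a22;  // second equation, latest x

        if (std::fabs(x - xOld) + std::fabs(y - yOld) < tolerance) return k;
    }
    return -1;
}
```

With weak coupling (a11 = 4, a12 = 1, a21 = 1, a22 = 3) the iteration converges within a few sweeps. With strong coupling (a11 = 1, a12 = 2, a21 = 2, a22 = 1) the same procedure diverges, even though the coupled system has a unique solution.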


Under-Relaxation
In order to improve the convergence, we sometimes use under-relaxation.
Here, only a part of the correction is added, potentially slowing down convergence but increasing stability
Types of under-relaxation
Explicit under-relaxation: when a new solution $\phi^p$ is obtained, the
value for the next iteration will only use a part of the correction
\[ \phi^{new} = \phi^{old} + \alpha (\phi^p - \phi^{old}) \tag{8.27} \]
where $0 < \alpha < 1$
Implicit under-relaxation. When a linear equation for $\phi_P$ is formed,
the diagonal is boosted and an appropriate correction is added to the
r.h.s.:
\[ \frac{a_P}{\alpha} \phi_P + \sum_N a_N \phi_N = R + \frac{1 - \alpha}{\alpha} a_P \phi_P^{old} \tag{8.28} \]
When convergence is reached $\phi_P = \phi_P^{old}$ and the two terms cancel out
The form of under-relaxation is equivalent to time-stepping, but the
time step size is not equal for all cells in the mesh
Note that under-relaxation may sometimes be counter-intuitive or slow
down the solution process.
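Both types of under-relaxation can be written in a few lines. The sketch below is a minimal illustration for a single cell: the struct `Equation` holds only the diagonal coefficient and the r.h.s., since the neighbour coefficients are not touched by the relaxation. The names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Explicit under-relaxation: only a fraction alpha of the correction is used
double relaxExplicit(double phiOld, double phiP, double alpha)
{
    assert(alpha > 0.0 && alpha <= 1.0);
    return phiOld + alpha*(phiP - phiOld);
}

// Diagonal coefficient and r.h.s. of the linear equation for one cell
struct Equation
{
    double aP;
    double R;
};

// Implicit under-relaxation: the diagonal is boosted to aP/alpha and the
// matching term (1 - alpha)/alpha*aP*phiOld is added to the r.h.s.
Equation relaxImplicit(Equation eqn, double phiOld, double alpha)
{
    assert(alpha > 0.0 && alpha <= 1.0);
    eqn.R += (1.0 - alpha)/alpha*eqn.aP*phiOld;  // uses the original aP
    eqn.aP /= alpha;
    return eqn;
}
```

For an equation without neighbours, aP = 2 and R = 4, the unrelaxed solution is 2. Relaxing implicitly with alpha = 0.5 from an old value of 0 gives 1, which is half of the full correction; relaxing from the converged value 2 returns 2 unchanged, showing that the two added terms cancel.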

8.3.2 Fully Coupled Approach

Block Matrix
For cases of strong coupling between the components of a vector, the components can be solved as a block variable: $(u_x, u_y, u_z)$ will appear as
variables in the same linear system
In spite of the fact that the system is much larger, the coupling pattern still
exists: components of $\mathbf{u}$ in cell $P$ may be coupled to other components in
the same point or to vector components in the neighbouring cell
With this in mind, we can still keep the sparse addressing defined by the
mesh: if a variable is a vector, a tensorial diagonal coefficient couples the
vector components in the same cell. A tensorial off-diagonal coefficient
couples the components of $\mathbf{u}_P$ to all components of $\mathbf{u}_N$, which covers all
possibilities


For multi-variable block solution like the compressible Navier-Stokes
system above, the same trick is used: the cell variable consists of $(\rho, \rho \mathbf{u}, \rho E)$
and the coupling can be described by a $5 \times 5$ matrix coefficient
Important disadvantages of a block coupled system are
Large linear system: several variables are handled together
Different kinds of physics can be present, e.g. the transport-dominated
momentum equation and elliptic pressure equation. At matrix level, it
is impossible to separate them, which makes the system more difficult
to solve
Nature of Coupling
Block matrix represents complete coupling for a block variable
We can examine cases of partial coupling by looking at degenerate forms of
the coefficients. This will reveal special cases of coupling where alternatives
to a fully coupled solution approach may be considered

8.4 Matrix Structure for Coupled Algorithms

Matrix Connectivity and Mesh Structure


Irrespective of the level of coupling, the FVM dictates that a cell value will
depend only on the values in surrounding cells

We still have freedom to organise the matrix by ordering entries for various
components of $\phi$. Also, the matrix connectivity pattern may be changed
by reordering the computational points


Example: block-coupled vector equation $(u_x, u_y, u_z)$
Per-variable organisation: first $u_x$ for all cells, followed by $u_y$ and $u_z$.
Ordering of each sub-list matches the cell ordering.
\[ a_P = \begin{bmatrix} [u_x u_x] & [u_x u_y] & [u_x u_z] \\ [u_y u_x] & [u_y u_y] & [u_y u_z] \\ [u_z u_x] & [u_z u_y] & [u_z u_z] \end{bmatrix} \tag{8.29} \]
Diagonal blocks, e.g. $[u_x u_x]$, have the size equal to the number
of computational points and contain the coupling within the single
component. All matrix coefficients are scalars. Off-diagonal blocks
represent variable-to-variable coupling.
Per-cell organisation: $(u_x, u_y, u_z)$ for each cell. A single numbering
space for all cells, but each individual coefficient is more complex:
it contains the complete coupling
Both choices have advantages and choice depends on software infrastructure
and matrix assembly methods. In order to illustrate the nature of coupling,
we shall choose per-cell organisation
Coupling Coefficient
Consider a linear dependence between two vectors $\mathbf{m}$ and $\mathbf{n}$. We can write
a general form as
\[ \mathbf{m} = \mathbf{A} \mathbf{n} \tag{8.30} \]
We shall evaluate the shape of $\mathbf{A}$ for various levels of coupling. We shall
think of $\mathbf{A}$ as a matrix coefficient in the block matrix. The diagonal matrix
entry is termed $\mathbf{A}_P$ and the off-diagonal $\mathbf{A}_N$. Matrix connectivity is
dictated by the mesh structure
Component-wise coupling describes the case where mx depends only on nx ,
my on ny and mz on nz
1. Scalar component-wise coupling
2. Vector component-wise coupling
3. Full (block) coupling
Explicit methods do not feature here because it is not necessary to express
them in terms of matrix coefficients
For reference, the linear equation for each cell featuring in the matrix reads
\[ \mathbf{A}_P \mathbf{m}_P + \sum_N \mathbf{A}_N \mathbf{m}_N = \mathbf{R} \tag{8.31} \]


Scalar-Implicit Coupling
In scalar implicit coupling, components of $\mathbf{m}$ at $P$ do not depend on each
other. Thus, $\mathbf{A}_P$ and $\mathbf{A}_N$ are diagonal tensors:
\[ \mathbf{A} = \begin{bmatrix} a_{xx} & 0 & 0 \\ 0 & a_{yy} & 0 \\ 0 & 0 & a_{zz} \end{bmatrix} \tag{8.32} \]
In most terms
\[ a_{xx} = a_{yy} = a_{zz} = a \tag{8.33} \]
or
\[ \mathbf{A} = a \mathbf{I} \tag{8.34} \]
In this case, the block system represents 3 equations written together but
not interacting: the block notation for the system is misleading for the level
of coupling present in discretisation
This leads towards a segregated method: we have three independent equations written together. Lack of off-diagonal coefficients indicates the absence
of component-to-component coupling
Example of scalar coefficient terms: temporal derivative, diagonal and off-diagonal of convection and diffusion with scalar diffusivity
Block-Point Implicit Coupling
In block-point implicit coupling the components of a vector variable $\mathbf{m}$
depend on each other in the same computational point, but each individual component depends only on the neighbouring value of the same
component
Thus:
In point $P$, $m_x$ depends on self, $m_y$ and $m_z$. Thus, the diagonal
coefficient $\mathbf{A}_P$ would be a full $3 \times 3$ matrix
\[ \mathbf{A}_P = \begin{bmatrix} a_{xx} & a_{xy} & a_{xz} \\ a_{yx} & a_{yy} & a_{yz} \\ a_{zx} & a_{zy} & a_{zz} \end{bmatrix} \tag{8.35} \]


In the off-diagonal, $m_x$ for location $P$ will depend only on $m_x$ at $N$,
creating a diagonal-only coefficient.
\[ \mathbf{A}_N = \begin{bmatrix} a_{xx} & 0 & 0 \\ 0 & a_{yy} & 0 \\ 0 & 0 & a_{zz} \end{bmatrix} \tag{8.36} \]
As before, in most cases, the diagonal components are identical.
\[ \mathbf{A}_N = a \mathbf{I} \tag{8.37} \]
The first form is typical for anisotropic porous media.
In this situation, the transport part of the system (as depicted by
$\mathbf{A}_N$) exhibits segregated behaviour, combined with a point-coupled problem for each computational point
Scalar-Point Vector-Implicit Coupling
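In the third combination, local point components of $\mathbf{m}$ are decoupled, but
the coupling to the neighbouring locations is complete. Thus
\[ \mathbf{A}_P = \begin{bmatrix} a_{xx} & 0 & 0 \\ 0 & a_{yy} & 0 \\ 0 & 0 & a_{zz} \end{bmatrix} \tag{8.38} \]
and
\[ \mathbf{A}_N = \begin{bmatrix} a_{xx} & a_{xy} & a_{xz} \\ a_{yx} & a_{yy} & a_{yz} \\ a_{zx} & a_{zy} & a_{zz} \end{bmatrix} \tag{8.39} \]
Such cases are relatively rare and typically appear from tensorial diffusion
problems and in some cases of rotational coupling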
Full Block Coupling
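In full block coupling, each component of $\mathbf{m}$ depends on all other components both in the local and neighbouring computational points. Thus, both
the diagonal and off-diagonal coefficient take full tensor form:
\[ \mathbf{A}_P = \begin{bmatrix} a_{xx} & a_{xy} & a_{xz} \\ a_{yx} & a_{yy} & a_{yz} \\ a_{zx} & a_{zy} & a_{zz} \end{bmatrix} \tag{8.40} \]
\[ \mathbf{A}_N = \begin{bmatrix} a_{xx} & a_{xy} & a_{xz} \\ a_{yx} & a_{yy} & a_{yz} \\ a_{zx} & a_{zy} & a_{zz} \end{bmatrix} \tag{8.41} \]
(note that component values will be different between the two)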


This is the most complex form of coupling, where everything is related to
everything else [Lenin]
Composite Variables
In some equations, the system will be coupled not only across the components of vectors and tensors, but also across different variables. In such
cases, we may write a composite variable formulation, where all equations
are grouped together into a single equation
The fact that a composite variable is not a Cartesian tensor needs to be
kept in mind. Calculation of gradients, divergence etc. is no longer trivial:
the physical meaning of the field needs to be taken into account
Example: compressible Navier-Stokes equations
\[ U = \begin{bmatrix} \rho \\ \rho \mathbf{u} \\ \rho e \end{bmatrix} \tag{8.42} \]
Note that $U$ above holds 5 scalar values: 1 for the density, 3 momentum
components $(\rho u_x, \rho u_y, \rho u_z)$ and one for energy
This tactic makes sense only if the variables are strongly coupled to each
other. Thus, full block coupling typically appears for such systems
Non-Linear Coupling
Additional complications will arise for cases where the matrix coefficients
are also a function of the solution: non-linearity
Example: convection term in the momentum equation $\nabla \cdot (\mathbf{u} \mathbf{u})$. Here, components of $\mathbf{A}_P$ and $\mathbf{A}_N$ depend on the solution itself, thus creating a non-linear system
Standard methods, like the Newton linearisation, require the evaluation of
the Jacobian, which is complex and costly. In reality, simple linearisation
is used most often: evaluate $\mathbf{A}_P$ and $\mathbf{A}_N$ based on the current value of $\mathbf{u}$
and re-calculate $\mathbf{u}$.
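The simple linearisation described above can be demonstrated on a scalar model problem. The equation a(u) u = b with a(u) = 1 + |u| is a hypothetical stand-in chosen for illustration only and does not come from the notes: the coefficient is evaluated from the current value of u, the resulting linear equation is solved, and the procedure is repeated.

```cpp
#include <cassert>
#include <cmath>

// Solves a(u)*u = b with a(u) = 1 + |u| by lagging the coefficient.
// Returns the number of iterations, or -1 if no convergence was reached
int solvePicard(double b, double& u, double tolerance, int maxIter)
{
    for (int k = 1; k <= maxIter; ++k)
    {
        const double a = 1.0 + std::fabs(u);  // coefficient from current u
        const double uNew = b/a;              // linear solve with frozen a
        const double change = std::fabs(uNew - u);
        u = uNew;

        if (change < tolerance) return k;
    }
    return -1;
}
```

For b = 2 and a starting value of 0 the iterates are 2, 0.667, 1.2, 0.909, ... and approach the solution u = 1, with the error roughly halving in each iteration. Newton linearisation would converge faster near the solution but needs the derivative of a(u), which corresponds to the Jacobian in the matrix setting.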
Saddle Block Systems
A system of equations central to our interest (incompressible Navier-Stokes
equations) has a worrying property: wrong equations!


Unknowns: velocity vector $\mathbf{u}$ (3 vector components) and pressure $p$
(scalar)
Equations: momentum equation (3 vector components)
\[ \frac{\partial \mathbf{u}}{\partial t} + \nabla \cdot (\mathbf{u} \mathbf{u}) - \nabla \cdot (\nu \nabla \mathbf{u}) = -\nabla p \tag{8.43} \]
Continuity equation:
\[ \nabla \cdot \mathbf{u} = 0 \tag{8.44} \]
Continuity equation sets a condition on velocity divergence $\nabla \cdot \mathbf{u}$, which
is a scalar: this makes it a scalar equation
Formally, we have one vector equation for one vector unknown and one
scalar equation for one scalar unknown
. . . but the scalar equation is given in terms of $\mathbf{u}$ and not $p$!!!
This kind of system is termed the saddle-point system: equations that
govern $p$ do not depend on it. Formally, we can write the system as follows:
\[ \begin{bmatrix} [A_{\mathbf{u}}] & [\nabla (.)] \\ [\nabla \cdot (.)] & [0] \end{bmatrix} \begin{bmatrix} \mathbf{u} \\ p \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{8.45} \]
Note the absence of entries for $p$ in the diagonal block! Off-diagonal
blocks actually represent the discretised form of the gradient and divergence
operator, multiplied by $p$ and $\mathbf{u}$, respectively. The diagonal block $[A_{\mathbf{u}}]$
contains the discretised form of the momentum equation, excluding the
pressure gradient term
While there exists a large set of zero diagonal entries, this matrix can be
solved. However, naive solution method would require a direct linear equation solver, making it extremely expensive. We shall look for cheaper and
faster solution methods
In compressible flows, the density-pressure relationship replaces the zero
diagonal block. However, as we approach the incompressibility limit, the
system approaches the saddle point form

8.5 Coupling in Model Equation Sets

Porous Media: Darcy's Equation


Solution is governed by the Laplace equation: easy, simple and cheap to
solve


The nature of the equation dictates that every point in the domain influences
every other point: elliptic nature of the equation. This can be seen in the
operation of iterative solvers: a large number of sweeps is needed due to the fact that
the information is global
For directed resistance, $\gamma$ may be different in different directions, but the
above still holds
Linear Stress Analysis
The equation is linear and easy to solve. No convection term = symmetric
matrix
The significant new term in the system is $\nabla \cdot [\mu (\nabla \mathbf{d})^T]$. It can be shown that
it represents rotation, coupling the components of $\mathbf{d}$ to each other
In solid body rotation, the components of the vector change together: strong
inter-dependency of vector components
Note that a segregated solution approach is very detrimental in this case.
This would imply decoupling the vector components of $\mathbf{d}$ and lagging cross-component coupling. As a result, an initially linear problem is non-linearised, potentially massively increasing solution cost
Incompressible Navier-Stokes
Velocity coupled to itself: non-linear convection term
Pressure coupled to velocity in a linear way
Notes on the form of the pressure
Stress term is modelled using the velocity gradient $\nabla \mathbf{u}$
Pressure is the spherical part of the stress tensor
The continuity equation specifies the condition on the divergence of
velocity, which is the trace of the gradient tensor
Thus, the role of the pressure is to make sure the velocity is divergence
free
Simple solution methods will not work due to a zero diagonal block in
pressure equations: need specialised pressure-velocity coupling algorithms


Compressible Navier-Stokes
Complex coupling:
Density appears in the momentum equation and velocity in the continuity equation
Compressibility effect (speed of sound) changes the nature of the density-momentum coupling
Energy affects density through the equation of state, with feed-back
both directly through the density and the momentum
Close coupling between the equations recognised in the block form. Rewriting the same equations to emphasise strong coupling:
\[ \frac{\partial U}{\partial t} + \nabla \cdot F - \nabla \cdot V = 0 \tag{8.46} \]
where the solution variable $U$ is:
\[ U = \begin{bmatrix} \rho \\ \rho \mathbf{u} \\ \rho e \end{bmatrix} \tag{8.47} \]
the convective flux $F$ is:
\[ F = \begin{bmatrix} \rho \mathbf{u} \\ \rho \mathbf{u} \mathbf{u} + p \mathbf{I} \\ (\rho e + p) \mathbf{u} \end{bmatrix} \tag{8.48} \]
and the diffusive flux $V$ reads:
\[ V = \begin{bmatrix} 0 \\ \boldsymbol{\sigma} \\ \boldsymbol{\sigma} \cdot \mathbf{u} - \mathbf{q} \end{bmatrix} \tag{8.49} \]
The above emphasises the fact that the face flux of the system (mass, momentum, energy) needs to be evaluated together: it depends on $(\rho, \mathbf{u}, e)_{left}$
and $(\rho, \mathbf{u}, e)_{right}$
At the same time, the coupled system hides the issues with the coupling
at low speed. For example, a pressure difference of 3 to 5 Pa can drive a
significant amount of flow. The associated density difference (air at atmospheric conditions) is of the order of $5 \times 10^{-5}$ kg/m$^3$ at mean density of
1.176829 kg/m$^3$, which causes numerical problems
Note that in the limit of incompressibility, decoupling between density and
pressure complicates the numerical approach
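The low-speed figures quoted above can be reproduced with a short calculation. The following is a sketch under stated assumptions that are not given in the notes: air behaves as an ideal gas with $R = 287$ J/(kg K), the process is isothermal at $T = 300$ K, the mean pressure is $p = 101325$ Pa and the driving pressure difference is $\Delta p = 4$ Pa.

```latex
\[
\rho = \frac{p}{R T} = \frac{101325}{287 \times 300} \approx 1.1768 \ \mathrm{kg/m^3}
\]
\[
\Delta \rho \approx \frac{\Delta p}{R T} = \frac{4}{287 \times 300} \approx 4.6 \times 10^{-5} \ \mathrm{kg/m^3},
\qquad
\frac{\Delta \rho}{\rho} = \frac{\Delta p}{p} \approx 3.9 \times 10^{-5}
\]
```

The density variation that carries the pressure information sits in the fifth significant digit of the density field, which is why a density-based formulation loses accuracy as the flow approaches the incompressible limit.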


k Turbulence Model
Both equations source-dominated, with relatively short time-scales: turbulence transported from elsewhere quickly dissipates
Left on its own (no mean shear), the system quickly tends to the no
turbulence solution: k = 0, = 0
In most turbulence models, local balance of turbulence production and
destruction dominates over the transport: equations are said to be sourcedominated. This makes them easy to solve: local effect
Equation coupling is highly non-linear. Generation term
k2
G = C [u + (u)T ] : u

-equation sources and sinks:

2
S = C1 G C2 ,
k
k
Note various k 2 and 2 terms in the equations!

(8.50)

(8.51)

Non-linearity is further (massively) complicated by the introduction of the
momentum equation, influenced through effective viscosity $\nu_{eff} = \nu + \nu_t$
and

$$\nu_t = C_\mu \frac{k^2}{\epsilon} \qquad (8.52)$$

In segregated solution methods, two equations are solved consecutively
without major coupling problems. In reality, either $k$ or $\epsilon$ will over-shoot
and stabilise the system
In external aerodynamics (aerospace) flows with coupled solvers and large
time-steps, it sometimes pays to solve the equations in a coupled manner.
However, the nature of equations indicates the largest benefit from local
source coupling, followed by a transport step: see Multi-Step Approach
below
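
The non-linearity of the source terms in Eqs. (8.50)-(8.52) can be seen by evaluating them directly. The sketch below is illustrative only: the model constants are the commonly quoted standard k-epsilon values, assumed here because the notes do not list them, and the simple-shear velocity gradient is a made-up test case.

```python
import numpy as np

# Commonly quoted standard k-epsilon constants (assumed, not from the notes)
C_MU, C_1, C_2 = 0.09, 1.44, 1.92


def eddy_viscosity(k, eps):
    """nu_t = C_mu k^2 / eps, Eq. (8.52)."""
    return C_MU * k**2 / eps


def generation(k, eps, grad_u):
    """G = nu_t [grad u + (grad u)^T] : grad u, Eq. (8.50)."""
    sym = grad_u + grad_u.T
    # tensordot with axes=2 is the double contraction sum_ij A_ij B_ij
    return float(eddy_viscosity(k, eps) * np.tensordot(sym, grad_u, axes=2))


def epsilon_source(k, eps, grad_u):
    """S_eps = C_1 G eps / k - C_2 eps^2 / k, Eq. (8.51)."""
    return C_1 * generation(k, eps, grad_u) * eps / k - C_2 * eps**2 / k


# Hypothetical simple shear: a single non-zero velocity gradient of 10 1/s
grad_u = np.zeros((3, 3))
grad_u[1, 0] = 10.0

G = generation(1.0, 1.0, grad_u)
S_eps = epsilon_source(1.0, 1.0, grad_u)
print(G, S_eps)
```

Doubling k at fixed dissipation quadruples the eddy viscosity and therefore the generation term, which is the kind of sensitivity a segregated solver has to cope with.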
Chemical Reactions
Coupling dependent on the reaction rate. For systems with fast reactions,
interaction between local quantities may totally dominate
Stiffness and behaviour of the system critically depends on the choice of
reactions, species (or pseudo-species) and the time-step
In most cases, the system is source-dominated, but inter-equation coupling
issues may be extremely severe. Depending on the problem, use of non-linear stiff system solvers may be required

8.6 Solution Methods for Coupled Equation Sets

Special Coupling Algorithms

For significant equation sets like fluid flow or magneto-hydrodynamics,
we can also devise special solution algorithms based on the detailed understanding of the physics. These may be orders of magnitude faster or
more memory-efficient than the above approaches
Examples of such algorithms are multi-step algorithms for chemical reactions and pressure-velocity coupling algorithms like SIMPLE and PISO in
fluid flows
Multi-Step Approach
In chemical reactions, it regularly happens that the system of reaction rates
creates a strongly coupled and non-linear system that requires a non-linear
solver
At the same time, the transport part of the system is easy to solve. However,
a combination of non-linear source coupling and transport would result in
a very large and strongly non-linear system
Such systems are solved in 2 steps:
Reaction step. Solution of the local non-linear coupling with frozen
transport terms: one system per computational point. The system
captures all coupled species and resolves local effects
Transport step. Once the coupling is resolved, the reaction terms
are frozen and the transport is solved in the standard manner
If necessary, the steps can be repeated until convergence
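
The two-step procedure above can be sketched on a toy problem. Everything in the example is an assumption made for illustration: a single species with the hypothetical reaction dc/dt = -k_r c^2, implicit Euler with Newton iteration for the local reaction solve, and first-order explicit upwind convection on a periodic 1-D mesh for the transport step.

```python
import numpy as np


def reaction_step(c, dt, k_r, n_newton=20):
    """Local non-linear solve with frozen transport: one problem per cell.

    Implicit Euler for dc/dt = -k_r c^2, solved by Newton iteration.
    """
    c_new = c.copy()
    for _ in range(n_newton):
        residual = c_new + dt * k_r * c_new**2 - c
        jacobian = 1.0 + 2.0 * dt * k_r * c_new
        c_new -= residual / jacobian
    return c_new


def transport_step(c, dt, velocity, dx):
    """Explicit upwind convection (velocity > 0, periodic mesh),
    with the reaction terms frozen."""
    co = velocity * dt / dx
    return c - co * (c - np.roll(c, 1))


def advance(c, dt, n_steps, k_r, velocity, dx):
    """Alternate reaction and transport steps."""
    for _ in range(n_steps):
        c = reaction_step(c, dt, k_r)
        c = transport_step(c, dt, velocity, dx)
    return c


n = 50
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx
c0 = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)
c1 = advance(c0, dt=0.01, n_steps=100, k_r=50.0, velocity=1.0, dx=dx)
c_uniform = advance(np.ones(n), dt=0.01, n_steps=100, k_r=50.0, velocity=1.0, dx=dx)
print(c1.min(), c1.max(), c_uniform[0])
```

Each cell solves its own small non-linear problem in the reaction step, so the cost scales with the number of cells rather than with the size of a global non-linear system. For a stiff multi-species mechanism the scalar Newton solve would be replaced by a stiff system solver.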
Pressure-Velocity Coupling
Pressure-velocity coupling algorithms stem from the incompressible Navier-Stokes equations and separate into 2 parts:
Assembly of the pressure equation from the divergence condition
Coupling between the momentum and pressure equations
Variants of pressure-velocity coupling tend to agree on the formulation of
the pressure equation but differ in the way the coupling is established, as
will be presented in future chapters

Part III
Numerical Simulation of Fluid Flows

Chapter 9
Governing Equations of Fluid Flow
In this chapter, we will revisit the governing equations of fluid flow and various
levels of simplification in engineering practice. Some simplifications are voluntary
(e.g. steady-state) and some follow from the physical behaviour or flow characteristics (e.g. incompressible flow, turbulence).
All simplified forms and levels of approximation shown below are used in fluid
flow simulations. Simpler forms are not only quick and easy to compute, but can
be used as an initial guess for a more complete level of approximation.

9.1 Compressible Navier-Stokes Equations

Solution variables: density $\rho$, momentum $\rho \mathbf{u}$ and energy $\rho e$
Continuity equation:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) = 0 \qquad (9.1)$$

Rate of change and convection: mass transport. The two terms are
sometimes grouped into a substantial derivative
Mass sources and sinks would appear on the r.h.s.
Note the absence of a diffusion term: mass does not diffuse
Coupling with the momentum equation: rate of change of $\rho$ depends
on the divergence of $\rho \mathbf{u}$
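Momentum equation:

$$\frac{\partial (\rho \mathbf{u})}{\partial t} + \nabla \cdot (\rho \mathbf{u} \mathbf{u}) - \nabla \cdot \left[ \mu \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right) \right] = \rho \mathbf{g} - \nabla \left( P + \frac{2}{3} \mu \nabla \cdot \mathbf{u} \right) \qquad (9.2)$$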

Substantial derivative
Non-linear convection term: $\nabla \cdot (\rho \mathbf{u} \mathbf{u})$. This term provides the wealth
of interaction in fluid flows
Diffusion term contains viscous effects
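Energy equation:

$$\frac{\partial (\rho e)}{\partial t} + \nabla \cdot (\rho e \mathbf{u}) - \nabla \cdot (\lambda \nabla T) = \rho \mathbf{g} \cdot \mathbf{u} - \nabla \cdot (P \mathbf{u}) - \nabla \cdot \left( \frac{2}{3} \mu (\nabla \cdot \mathbf{u}) \mathbf{u} \right) + \nabla \cdot \left[ \mu \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right) \cdot \mathbf{u} \right] + \rho Q \qquad (9.3)$$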

Note that the diffusion term is given in terms of temperature T, not
energy: for non-constant material properties, this may be problematic
r.h.s. contains a number of terms related to the work from the stress
tensor
Weaker coupling to the rest of the system: e and T influence $\rho$ and $\mathbf{u}$
through the equation of state
Equation of state:

$$\rho = \rho(P, T) \qquad (9.4)$$

Relationship between density $\rho$ and pressure P
Transport coefficients $\lambda$ and $\mu$ are also functions of the thermodynamic
state variables:

$$\lambda = \lambda(P, T) \qquad (9.5)$$

$$\mu = \mu(P, T) \qquad (9.6)$$

Properties of real gases and liquids are rarely used in tabular form. Instead, measured data is curve-fitted by standard sources: JANAF,
NIST, etc.
Variation of material properties is usually a smooth function and does
not introduce significant non-linear problems. Issues sometimes occur
when the state changes significantly in a single time-step. Here, the
initial guess for the new state may be far away from the solution,
causing excessive number of search iterations

9.2 Flow Classification based on Flow Speed

Flow-related compressibility effects are measured by comparing the flow
speed with the speed of sound

Velocities to compare are the convective velocity and the speed with which
a weak pressure wave travels through the medium
When the convective speed reaches and exceeds the speed of sound, the
mode of propagation of information changes significantly: shocks
Speed Range       Mach Number
low subsonic      < 0.3
high subsonic     0.3 - 0.6
transonic         0.6 - 1.1
supersonic        1 - 5
hypersonic        > 5
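
The table can be turned into a small lookup, for example when tagging operating points before choosing a solver. This is a sketch only: the tabulated transonic and supersonic ranges overlap between Ma = 1 and Ma = 1.1, and the function below arbitrarily assigns that band to "transonic". The helper `mach_number` assumes a perfect gas with gamma = 1.4 and R = 287 J/(kg K), values that are not given in the notes.

```python
def classify_flow(mach):
    """Return the speed range for a Mach number, using the table limits.

    The band 1 < Ma <= 1.1 appears in two rows of the table; assigning it
    to 'transonic' is an arbitrary choice made for this sketch.
    """
    if mach < 0.0:
        raise ValueError("Mach number must be non-negative")
    if mach < 0.3:
        return "low subsonic"
    if mach < 0.6:
        return "high subsonic"
    if mach <= 1.1:
        return "transonic"
    if mach <= 5.0:
        return "supersonic"
    return "hypersonic"


def mach_number(speed, temperature, gamma=1.4, r_gas=287.0):
    """Ma = |u| / c with c = sqrt(gamma R T) (perfect gas assumed)."""
    return speed / (gamma * r_gas * temperature) ** 0.5


# Hypothetical cruise condition: 250 m/s at 220 K
print(classify_flow(mach_number(250.0, 220.0)))
```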

Low Subsonic Flow
Pressure changes driving the flow are sufficiently slow to cause minimal
changes in the density
As a consequence, flow may be considered constant density, allowing all
equations to be divided through by the density and setting

$$\frac{\partial \rho}{\partial t} = 0$$
In special cases, effects like buoyancy-driven flow can be modelled in the
same way: driving force from buoyancy is treated as a body force without
changing the density
High Subsonic Flow
Flow-induced density variation is significant, but without transonic flow
pockets. In other words, the convective effects in the pressure distribution
are significant but not dominating
Similar situation appears in flow where engineering machinery is designed to
increase the pressure (density) mechanically. Example: internal combustion
engine (compression-expansion)
This formulation is sometimes called the variable density formulation
Transonic Flow
Inlet/outlet conditions typically subsonic, but with pockets of supersonic
flow
In some parts of the flow, the convective effects are dominant
Because of the mix of elliptic and hyperbolic nature, transonic cases are
usually the most difficult to compute


Supersonic Flow
Boundary conditions are typically supersonic, with pockets of subsonic flow.
Subsonic regions are usually captured close to walls or moving obstacles
Hypersonic Flow
At very high speed, the simple formulation of the equation of state breaks down
and more complex laws are needed
Apart from increasingly complex equation of state, the flow is basically
supersonic, with the same limitations on the specification of boundary conditions
Forms of equation of state:
Perfect gas. Flow regime still Mach number independent, but there
are problems with adiabatic wall conditions
Two-temperature ideal gas. Rotational and vibrational motion
of the molecules needs to be separated and leads to two-temperature
models. Used in supersonic nozzle design
Dissociated gas. Multi-molecular gases begin to dissociate at the
bow shock of the body.
Ionised gas. The ionised electron population of the stagnated flow
becomes significant, and the electrons must be modelled separately:
electron temperature. Effect important at speeds of 10-12 km/s
In engineering machinery, this flow regime is achieved by dropping the speed
of sound (rarefied gas), or in space vehicle re-entry aerodynamics

9.3 Steady-State or Transient

In engineering machinery and especially in fluid flow simulations we are
regularly interested in the mean or time-averaged properties. Example:
mean lift and drag on an airfoil or the mean pressure drop in the pipe.
Physically, such simulations should involve calculating a time-dependent
flow and performing an appropriate averaging procedure, as is the case in
experimental studies
Operations on the mathematical equations governing the system allow a different
approach: assemble the equations for time-averaged (instead of instantaneous) properties and solve them: in principle, this should provide a mean
(time-averaged) solution without further manipulation

Unfortunately, in engineering practice, steady-state approximation is used
indiscriminately: having an aircraft flying at cruising speed and altitude,
with constant atmospheric conditions does not imply that the flow is steady
or even that lift and drag remain constant
In true steady-state simulations, the value of time derivative in all equations
reduces to zero. However, forcing this on cases where it will not physically
happen leads to numerical problems, including lack of convergence
Example: approximations and numerical difficulties of steady state: vortex
shedding behind a cylinder in laminar flow
For some transient cases, with a well ordered time response, additional
time-response simplifications are possible. Example: frequency-based decomposition in turbomachinery simulations, where frequency is determined
from the number of stator and rotor passages

9.4 Incompressible Formulation

Decoupling dependence of density on pressure, also resulting in the decoupling of the energy equation from the rest of the system
Equations can be solved in either the velocity-density or the velocity-pressure
formulation
Velocity-density formulation does not formally allow for Ma = 0 (or
$c = \infty$), but physically this is never the case. In practice, matrix preconditioning techniques are used to overcome zero diagonal coefficients
Velocity-pressure formulation does not suffer from low-Ma limit, but
performs considerably worse at high Ma number

$$\frac{\partial \mathbf{u}}{\partial t} + \nabla \cdot (\mathbf{u} \mathbf{u}) - \nabla \cdot (\nu \nabla \mathbf{u}) = -\nabla p \qquad (9.7)$$

$$\nabla \cdot \mathbf{u} = 0 \qquad (9.8)$$

9.5 Inviscid Formulation

Relative influence of convective and viscous effects is measured by the
Reynolds number (Re).
Inviscid formulation implies infinite Re number. In reality, viscous effects
are only important in the vicinity of walls. Also, this simplification would
have important effects on turbulence dynamics, described below

A popular simplified form of equations used in the past is a combination of
an inviscid flow solver in the far field coupled with a boundary layer solver
in the near-wall region

9.6 Potential Flow Formulation

Fast turnaround: panel method simulations, sometimes coupled with a
boundary layer solver
Still useful in engineering practice: initialisation of the flow field, speeding
up convergence

9.7 Turbulent Flow Approximations

Turbulent Flow
Navier-Stokes equations represent fluid flow in all necessary detail. However, the span of scales in the flow is considerable
Nature of turbulent flow is such that it is possible to separate the mean
signal from turbulence interaction
Example: turbulent flow around Airbus A380
Largest scale of interest is based on the scale of engineering machinery:
overall length (79.4 m), wing span (79.8 m). In practice, wake behind
the aircraft is also of interest
In turbulent flows, energy is introduced into large scales and through
the process of vortex stretching transferred into smaller scales. Most
dissipation of turbulence energy into heat happens at smallest scales
The size of smallest scale of interest is estimated from the size of a
vortex which would dissipate the energy it contains in one revolution.
The scale depends on Re number, but an estimate would be obtained
from the Kolmogorov micro-scale:

$$\eta = \left( \frac{\nu^3}{\epsilon} \right)^{\frac{1}{4}} \qquad (9.9)$$

where $\eta$ is the scale, $\nu$ is the kinematic viscosity and $\epsilon$ is the dissipation
rate (equal to the production rate). For our case, this will be well below
a millimetre; additionally, include the requirement for time-accurate
simulation and averaging
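
A rough number for Eq. (9.9) makes the resolution requirement concrete. The inputs below are assumptions chosen for illustration, not aircraft data: kinematic viscosity of air of 1.5e-5 m^2/s, a turbulent velocity scale of 10 m/s and a large-eddy length scale of 10 m, with the dissipation rate estimated from the usual scaling epsilon ~ u'^3 / L.

```python
def kolmogorov_scale(nu, epsilon):
    """Kolmogorov micro-scale, Eq. (9.9): eta = (nu^3 / epsilon)^(1/4)."""
    return (nu**3 / epsilon) ** 0.25


# Illustrative inputs only (assumptions, not data from the notes)
nu_air = 1.5e-5          # kinematic viscosity, m^2/s
velocity_scale = 10.0    # turbulent velocity fluctuation, m/s
length_scale = 10.0      # size of the largest eddies, m

epsilon = velocity_scale**3 / length_scale   # scaling estimate, m^2/s^3
eta = kolmogorov_scale(nu_air, epsilon)
cells_per_direction = length_scale / eta

print(f"eta = {eta:.2e} m")
print(f"cells per direction to span one large eddy = {cells_per_direction:.1e}")
```

With these inputs the smallest scale is a small fraction of a millimetre, so even a single large eddy needs on the order of 10^5 cells per direction before the rest of the domain is considered.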

In order to resolve the flow to all of its details, the full range of scales needs
to be simulated. The range of scales in turbulent flow at high Re is well
beyond the capabilities of modern computers, which leads to turbulence
modelling
Level of Approximation
Direct Numerical Simulation (DNS). Full range of scales is simulated:
transient simulation with averaging. 3-D and time-dependent simulations,
with the need for averaging
Reynolds Averaged Navier-Stokes Equations (RANS). Velocity and
pressure (density) are decomposed into the mean and oscillating component
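$$\mathbf{u} = \overline{\mathbf{u}} + \mathbf{u}' \qquad (9.10)$$

$$p = \overline{p} + p' \qquad (9.11)$$

Substituting the above into the Navier-Stokes equations and eliminating
second-order terms yields the equations in terms of mean properties: $\overline{\mathbf{u}}$
and $\overline{p}$, with a closure problem.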
Large Eddy Simulation (LES). LES recognises the fact that turbulence on larger scales depends on the geometry and flow details, with smaller
scales acting mainly as the energy sink. By nature, smaller scales are more
isotropic and homogeneous and thus easier to model. Therefore, we shall aim
to decompose the flow into larger scales, which are resolved, and model the
effect of smaller scales. Simulation is 3-D and time-resolved and requires
averaging.
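
The origin of the closure problem in the Reynolds decomposition, Eqs. (9.10)-(9.11), can be checked on a sampled signal: the fluctuation averages to zero, but the average of a product does not equal the product of averages. The signal below is synthetic (a mean value plus a sine wave plus random noise) and serves only as a stand-in for a measured velocity trace.

```python
import numpy as np

# Synthetic 'velocity' signal: made-up data for illustration
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 10_000)
u = 5.0 + 0.5 * np.sin(40.0 * np.pi * t) + 0.2 * rng.standard_normal(t.size)

u_mean = u.mean()                 # mean component
u_prime = u - u_mean              # oscillating component
mean_of_square = (u * u).mean()   # average of the non-linear term
fluctuation_term = (u_prime * u_prime).mean()

print("mean of fluctuation:", u_prime.mean())
print("mean(u u)          :", mean_of_square)
print("mean(u)^2          :", u_mean**2)
print("mean(u' u')        :", fluctuation_term)
```

The difference between the average of u u and the square of the mean is exactly the average of u' u'. This is the term that survives averaging of the convection term and has to be modelled.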

9.7.1 Direct Numerical Simulation

Main source of comparison data for simple and canonical flows (e.g. homogeneous isotropic turbulence, incompressible and compressible turbulent
boundary layer, simple geometries)
DNS has completely replaced experimental methods at this level because it
provides complete information and numerics has proven sufficiently accurate
Current push towards compressible flows and simple chemical reactions,
e.g. interaction between turbulent mixing and flame wrinkling in premixed
combustion
Typical level of discretisation accuracy: 6th order in space and 10th order
in time. Critical for accurate high-order correlation data
Extremely expensive simulations: pushing the limits of computing power

9.7.2 Reynolds Averaging Approach

Reynolds Averaging
Reynolds averaging removes a significant component of unsteady behaviour:
all transient effects that can be described as turbulence are removed by
the manipulation of equations
Note that $\overline{\mathbf{u}}$ and $\overline{p}$ are still time-dependent (separation of scales): time
dependent RANS
It is now possible to solve directly for the properties of engineering interest:
mean flow field, mean drag etc. For cases which are 2-D in the mean, it
makes sense to perform 2-D simulations irrespective of the nature of turbulence
A turbulence model is required for closure: describe the effect of sub-grid
scales on the resolved flow based on resolved flow characteristics
This is a substantial reduction in simulation cost and has allowed the industrial adoption of CFD. RANS models are the mainstay of industrial CFD
and likely to remain so until the next change in computing power of approximately 2 orders of magnitude
Turbulence models are just models (!) and their physical justification is
often more limited than for the fundamental equations
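$$\frac{\partial \overline{\mathbf{u}}}{\partial t} + \nabla \cdot (\overline{\mathbf{u}} \, \overline{\mathbf{u}}) - \nabla \cdot (\nu \nabla \overline{\mathbf{u}}) = -\nabla \overline{p} + \nabla \cdot R \qquad (9.12)$$

$$\nabla \cdot \overline{\mathbf{u}} = 0 \qquad (9.13)$$

Here, R is the Reynolds stress tensor:

$$R = -\overline{\mathbf{u}' \mathbf{u}'} \qquad (9.14)$$

Reynolds Stress Closure Models
Eddy viscosity models. Models are based on reasoning similar to Prandtl's
theory:

$$R = \nu_t \left[ \nabla \overline{\mathbf{u}} + (\nabla \overline{\mathbf{u}})^T \right] \qquad (9.15)$$

where $\nu_t$ is the eddy viscosity. In short, the formula specifies that the
Reynolds stress tensor is aligned with the velocity gradient. Eddy viscosity
is assembled through dimensional analysis, based on a characteristic length- and time-scale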

Second and higher order closure. Instead of assembling R based on the
velocity gradient, a transport equation for the Reynolds stress is assembled
by manipulating the momentum equation. However, this leads to a higher-order closure problem (new terms in the Reynolds stress transport equation)
with additional uncertainty
Near-wall treatment. Regions of sharp velocity gradients near the wall
are the most demanding: high mesh resolution, controlling cell aspect ratio
and time-step. Two modelling approaches:
Integration to the wall, also known as low-Re turbulence models.
Near-wall region is resolved in full detail, with the associated space
resolution requirements.
Wall functions, where the region of high gradients is bridged with
a special model which compensates for unresolved gradients. Model
assumes equilibrium behaviour near the wall (attached fully developed
flow) and significantly influences the result

9.7.3 Large Eddy Simulation

The first step in Large Eddy Simulation (LES) modelling approach is the separation of the instantaneous value of a variable into the resolved and unresolved
(modelled) component.
Mathematical Machinery
Scale separation operation is achieved through filtering. Imagine a separation of space into small pockets of space and performing local averaging.
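Averaging operation is mathematically defined as:

$$\overline{\mathbf{u}} = \int G(\mathbf{x}, \mathbf{x}') \, \mathbf{u}(\mathbf{x}') \, d\mathbf{x}' \qquad (9.16)$$

where $G(\mathbf{x}, \mathbf{x}')$ is the localised filter function. This can be interpreted as
a local spatial average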
Effect of filtering the Navier-Stokes equations is very similar to the Reynolds
averaging, but the meaning of the filtered values is considerably different
Simulation remains 3-D and unsteady, with the need for averaging. However, demands for spatial and temporal resolution are considerably reduced,
due to the fact that smallest scales are to be modelled
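$$\frac{\partial \overline{\mathbf{u}}}{\partial t} + \nabla \cdot (\overline{\mathbf{u}} \, \overline{\mathbf{u}}) - \nabla \cdot (\nu \nabla \overline{\mathbf{u}}) = -\nabla \overline{p} - \nabla \cdot \tau \qquad (9.17)$$

$$\nabla \cdot \overline{\mathbf{u}} = 0 \qquad (9.18)$$

Here, $\tau$ is the sub-grid stress tensor, arising from the fact that $\overline{\mathbf{u} \mathbf{u}} \neq \overline{\mathbf{u}} \, \overline{\mathbf{u}}$:

$$\tau = \overline{\mathbf{u} \mathbf{u}} - \overline{\mathbf{u}} \, \overline{\mathbf{u}} \qquad (9.19)$$

$$= \overline{(\overline{\mathbf{u}} + \mathbf{u}') (\overline{\mathbf{u}} + \mathbf{u}')} - \overline{\mathbf{u}} \, \overline{\mathbf{u}} \qquad (9.20)$$

$$= \left( \overline{\overline{\mathbf{u}} \, \overline{\mathbf{u}}} - \overline{\mathbf{u}} \, \overline{\mathbf{u}} \right) + \left( \overline{\overline{\mathbf{u}} \mathbf{u}'} + \overline{\mathbf{u}' \overline{\mathbf{u}}} \right) + \overline{\mathbf{u}' \mathbf{u}'} \qquad (9.21)$$

(Leonard stress, grid-to-subgrid energy transfer, sub-grid Reynolds stress)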
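
The filtering operation of Eq. (9.16) and the sub-grid stress of Eq. (9.19) can be illustrated in one dimension. The sketch assumes a top-hat (box) filter on a periodic mesh and a made-up signal with one large-scale and one small-scale component; the filter width of 9 points is arbitrary.

```python
import numpy as np


def box_filter(field, width):
    """Top-hat filter on a periodic 1-D field: local average over
    `width` points (width must be odd and at least 3)."""
    if width < 3 or width % 2 == 0:
        raise ValueError("width must be an odd number >= 3")
    half = width // 2
    kernel = np.ones(width) / width
    padded = np.concatenate([field[-half:], field, field[:half]])
    return np.convolve(padded, kernel, mode="valid")


n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
u = np.sin(x) + 0.3 * np.sin(20.0 * x)   # large scale + small scale

u_bar = box_filter(u, 9)                      # resolved (filtered) field
tau = box_filter(u * u, 9) - u_bar * u_bar    # 1-D analogue of Eq. (9.19)

print("max |u - u_bar|:", np.abs(u - u_bar).max())
print("max tau        :", tau.max())
```

The filtered field retains the large-scale component almost unchanged, while the small-scale content shows up as a non-zero tau. For a non-negative filter kernel tau cannot be negative, since it is a local variance.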


Sub-Grid Scale (SGS) Model
The idea of LES is to separate the scales of turbulence such that only small
scales are modelled, whereas energetic and geometrical scales are resolved
by simulation. Small scale turbulence is closer to isotropic and homogeneous,
making it easier to model
A number of modelling paradigms exist, based on different ways of extracting the information about sub-grid scales (SGS). Since the main role of SGS
models is to remove the energy of the resolved scales, overall result is only
weakly influenced by the SGS, provided the correct rate of energy removal
is accounted for
In practice, most SGS models are based on eddy viscosity, sometimes with
additional transport or back-scatter effects
Numerical Model and Simulation Framework
Numerical errors introduced by discretisation are typically diffusive in nature. In other words, the discretisation error will act as additional diffusivity in the system
At the same time, it is the role of the SGS model to control the energy
dissipation at the correct physical rate: this would imply the importance
of reducing numerical errors to a minimum
Older school of LES required the same accuracy of spatial and temporal
discretisation as in DNS. Recent studies show this is excessive: higher moments are typically not of interest.
On balance, good second-order discretisation and unstructured mesh handling for complex geometries provides a good balance of accuracy, speed
and resolution requirements

Chapter 10
Pressure-Velocity Coupling
In this chapter, we shall examine the nature of pressure-velocity coupling and
review numerical algorithms to handle fluid flow equations in the most efficient
manner. The algorithms can be divided into pressure- and density-based algorithms, with segregated and coupled solution methods.

10.1 Nature of Pressure-Velocity Coupling

Discretisation Procedure for Fluid Flow Equations


In previous chapters, we have presented a discretisation procedure for transport equations for scalars and vectors. Additionally, we have presented a
method for handling coupled equation sets and linear equation solver technology
In density-based algorithms, the methodology is satisfactory: solving a single transport equation for a block variable, where the flux of mass, momentum and energy depends on the complete set of state variables
However, the machinery does not seem to be complete for a pressure-based
system. We shall examine this further starting from the incompressible
Navier-Stokes equations, extend it to compressible flow and compare with
the density-based solvers
Momentum Equation
Momentum equation is in the standard form and the discretisation of individual terms is clear. This is the incompressible form, assuming $\rho$ = const.
and $\nabla \cdot \mathbf{u} = 0$ (demonstrate):

$$\frac{\partial \mathbf{u}}{\partial t} + \nabla \cdot (\mathbf{u} \mathbf{u}) - \nabla \cdot (\nu \nabla \mathbf{u}) = -\nabla p \qquad (10.1)$$

The non-linearity of the convection term, $\nabla \cdot (\mathbf{u} \mathbf{u})$, can be easily handled by
an iterative algorithm, until a converged solution is reached
The limiting factor is the pressure gradient: $\nabla p$ appears as the source term
and for known p there would be no issues.
Continuity Equation
Continuity equation states that mass will neither be created nor destroyed.
In incompressible flows, by definition $\rho$ = const., resulting in the incompressible form of the continuity equation:

$$\nabla \cdot \mathbf{u} = 0 \qquad (10.2)$$

Note: this is a scalar field equation in spite of the fact that u is a vector
field!
Pressure Momentum Interaction
Counting the equations and unknowns, the system seems well posed: 1
vector and 1 scalar field governed by 1 vector and 1 scalar equation
Linear coupling exists between the momentum equation and continuity.
Note that u is a vector variable governed by the vector equation. Continuity
equation imposes an additional criterion on velocity divergence ($\nabla \cdot \mathbf{u}$). This
is an example of a scalar constraint on a vector variable, as $\nabla \cdot \mathbf{u}$ is a scalar
Non-linear u-u interaction in the convection is unlikely to cause trouble:
use an iterative solution technique. In practice

$$\nabla \cdot (\mathbf{u} \mathbf{u}) \approx \nabla \cdot (\mathbf{u}^o \mathbf{u}^n) \qquad (10.3)$$

where $\mathbf{u}^o$ is the currently available solution or an initial guess and $\mathbf{u}^n$ is the
new solution. The algorithm cycles until $\mathbf{u}^o = \mathbf{u}^n$
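
The lagged linearisation of Eq. (10.3) can be demonstrated on a scalar analogue. The equation u u + u = 2 below is a made-up stand-in for the convection term, not a discretised momentum equation: one factor of the non-linear term is frozen at the old value, the resulting linear equation is solved for the new value, and the cycle repeats until the two agree.

```python
def picard_solve(u_old=0.0, tolerance=1e-12, max_iterations=200):
    """Solve u*u + u = 2 by lagging one factor of the non-linear term.

    The linearised equation u_old * u_new + u_new = 2 is linear in u_new.
    Returns the converged value and the number of iterations used.
    """
    for iteration in range(1, max_iterations + 1):
        u_new = 2.0 / (u_old + 1.0)
        if abs(u_new - u_old) < tolerance:
            return u_new, iteration
        u_old = u_new
    raise RuntimeError("Picard iteration did not converge")


u, n_iter = picard_solve()
print(u, n_iter)
```

The iteration converges to u = 1 from the initial guess of zero. Convergence of such a fixed-point scheme is not guaranteed in general; in flow solvers it is usually helped by under-relaxation or by the time-step.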
Continuity Equation and the Role of Pressure
There is no obvious way of assembling the pressure equation, which is at
the root of the problem. Available equation expresses the divergence-free
condition on the velocity field.
$$\nabla \cdot \mathbf{u} = 0 \qquad (10.4)$$

Examining the role of the pressure, it turns out that the spherical part
of the stress tensor, extracted in the pressure term, directly relates to the
above condition on the velocity. Viscous stress is modelled on the basis of
the velocity gradient:

$$\sigma = -p I + \mu \left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right] \qquad (10.5)$$

postulating the equivalence between the mechanical and thermodynamic
pressure. Therefore, the pressure term is related to $\mathrm{tr}(\nabla \mathbf{u}) = \nabla \cdot \mathbf{u}$,
which appears in the continuity equation. In other words, pressure distribution should be such that the pressure gradient in the momentum equation
enforces the divergence-free condition on the velocity field.

If the pressure distribution is known, the problem of pressure-velocity coupling is resolved. However, it is clear that pressure and velocity will be
closely coupled to each other.

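10.2 Density-Based Block Solver

Density-Based Algorithm
In previous lectures, we have shown a block coupled form of the density-based flow solver. Noting that all governing equations fit into the standard
form and all variables are fully coupled, the compressible Navier-Stokes
system can be written as:

$$\frac{\partial U}{\partial t} + \nabla \cdot F - \nabla \cdot V = 0 \qquad (10.6)$$

where the solution variable U is:

$$U = \begin{bmatrix} \rho \\ \rho \mathbf{u} \\ \rho e \end{bmatrix} \qquad (10.7)$$

In the above, pressure appears in the convective flux F:

$$F = \begin{bmatrix} \rho \mathbf{u} \\ \rho \mathbf{u} \mathbf{u} + p I \\ (\rho e + p) \mathbf{u} \end{bmatrix} \qquad (10.8)$$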

Standard (Roe flux) compressible Navier-Stokes solver will evaluate F for
each cell face directly from the state (U) left and right from the face, using
approximate Riemann solver techniques

Looking at the second row of the flux expression we can recognise the
convective contribution and the pressure driving force (note $\nabla \cdot (pI) = \nabla p$).
In high-speed flows, the first component is considerably larger than the
second
In the low-speed limit, a pressure difference of 3-5 Pa can drive considerable
flow; however, in this case, the pressure gradient will dominate. As shown
before, this implies a density change of approximately $5 \times 10^{-5}$ kg/m$^3$ for
the mean density of 1 kg/m$^3$. Equivalent calculation for a liquid (water)
would produce an even more extreme result (due to the higher speed of sound)
Equation governing pressure effects in this case is the continuity, through
density transport and the equation of state. Therefore, for accurate pressure
data we need to capture density changes of the order of $1 \times 10^{-5}$, with
reference level of 1, together with the velocity changes of the order of 1 and
energy level of $2 \times 10^{5}$ ($e = C_v T$). Note that all properties are closely
coupled, which means that matrix coefficients vary to extreme levels
The speed of sound in general is given as

$$c = \sqrt{\frac{\partial p}{\partial \rho}} \qquad (10.9)$$

Infinite speed of sound (incompressible fluid) implies decoupling between
density and pressure
As a consequence of decoupling, density-based solver cannot handle
the incompressible limit. In practice, very low Ma number flow can be
achieved, either through matrix preconditioning or by introducing artificial
compressibility
Explicit and Implicit Compressible Flow Solver
Relationship that prescribes F as a function of $U_P$ and $U_N$ is complex and
non-linear: calculating characteristic wave speed and propagation. It is
therefore natural to evaluate the flux F and advance the simulation explicitly:

$$U^n = U^o - \Delta t \, (\nabla \cdot F - \nabla \cdot V) = U^o - \Delta t \, R \qquad (10.10)$$

Here, R is the convection-diffusion residual (a higher-order time-integration technique may also be used)
This leads to a fundamentally explicit time-integration method, with
the associated Courant number (Co) limit: time-step is limited by the size
of the smallest cell
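
The severity of the explicit time-step limit is easy to quantify. The sketch below assumes the common estimate dt = Co_max * min(dx / (|u| + c)) for a compressible solver; the wave speed |u| + c, the limit Co_max = 0.5 and the cell sizes are assumptions made for illustration and are not taken from the notes.

```python
import numpy as np


def explicit_time_step(dx, velocity, sound_speed, co_max=0.5):
    """Estimate the largest stable explicit time-step:
    dt = Co_max * min(dx / (|u| + c)) over all cells."""
    wave_speed = np.abs(velocity) + sound_speed
    return co_max * np.min(dx / wave_speed)


# Hypothetical mesh: far-field, mid-field and near-wall cell sizes in metres
dx = np.array([1.0e-2, 1.0e-3, 1.0e-5])
velocity = np.array([100.0, 100.0, 100.0])   # m/s
dt = explicit_time_step(dx, velocity, sound_speed=340.0)

flow_through_time = 1.0 / 100.0   # 1 m chord at 100 m/s
print(f"dt = {dt:.2e} s")
print(f"steps per flow-through time = {flow_through_time / dt:.1e}")
```

A single near-wall cell sets the time-step for the whole mesh, which is why explicit time-marching to steady state needs the acceleration techniques discussed next.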

Time-step limitation is in reality so severe that it renders the code useless:
for steady-state simulations, we need to achieve acceleration of a factor of
100 - 10 000
Solution acceleration techniques require faster information transfer in order
to approach steady-state more rapidly. We will examine two:
Implicit solver
Geometric multigrid
Solution Acceleration Techniques
Implicit solver
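Implicit compressible solver is based on the same flux evaluation technique as the explicit solver, but generalising the form of the flux expression to create matrix coefficients

$$F = F(U_P, U_N) = \frac{\partial F}{\partial U_P} U_P + \frac{\partial F}{\partial U_N} U_N + D \qquad (10.11)$$

$$= A_P U_P + A_N U_N + D \qquad (10.12)$$

Here, matrix coefficient is a full $5 \times 5$ matrix, calculated as a Jacobian
and D is the explicit correction. Linearisation may be done in several
ways, with different level of approximation

$$A = \begin{bmatrix} \frac{\partial (\rho \mathbf{u})}{\partial \rho} & \cdots & \cdots \\ \vdots & \frac{\partial (\rho \mathbf{u} \mathbf{u} + p I)}{\partial (\rho \mathbf{u})} & \vdots \\ \cdots & \cdots & \frac{\partial ((\rho e + p) \mathbf{u})}{\partial (\rho e)} \end{bmatrix} \qquad (10.13)$$

With the help of flux Jacobians, we have created an implicit system
of equations, which relaxes the Co number criterion, but not to the
desired level. However, this is a very useful first step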
Multigrid acceleration
Geometric multigrid is based on a curious fact: as the mesh gets coarse,
the Co number limit becomes less strict, allowing the simulation to
advance in larger time-steps and a steady-state solution is reached in
fewer time-steps
The problem we have solved on a coarse grid is physically identical
to its fine-grid equivalent. It should therefore be possible to solve
the coarse-grid problem and use the solution as the initial guess for its
fine-grid equivalent

Full Approximation Storage (FAS) Multigrid performs this process on several levels simultaneously, using a hierarchy of coarse grids.
This allows us to use a very large Co number (100 - 1 000 or higher)
without falling foul of the Co criterion: significant part of information
transfer occurs on coarse grids without violating the stability criterion
Additional complication in multigrid simulation is the requirement for
a hierarchy of coarse grids for the geometry of interest. Additional
problems are related to the geometric representation and specification of
boundary conditions on coarse grids
In practice, coarse grids are assembled by agglomerating fine grid cells
into clusters
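
The flux Jacobian used by the implicit solver in Eqs. (10.11)-(10.13) can be approximated numerically, which is a convenient check on a hand-derived linearisation. The sketch is restricted to the 1-D inviscid flux of a perfect gas with gamma = 1.4, so the matrix is 3 x 3 rather than 5 x 5; both the restriction and the central-difference step size are assumptions made for illustration.

```python
import numpy as np

GAMMA = 1.4  # perfect gas assumed


def euler_flux(U):
    """1-D inviscid flux for U = [rho, rho*u, rho*e], e being the total
    specific energy: F = [rho*u, rho*u*u + p, (rho*e + p)*u]."""
    rho, mom, rho_e = U
    u = mom / rho
    p = (GAMMA - 1.0) * (rho_e - 0.5 * rho * u**2)
    return np.array([mom, mom * u + p, (rho_e + p) * u])


def flux_jacobian(U, rel_step=1e-6):
    """Central-difference approximation of A = dF/dU."""
    n = U.size
    A = np.zeros((n, n))
    for j in range(n):
        h = rel_step * max(abs(U[j]), 1.0)
        dU = np.zeros(n)
        dU[j] = h
        A[:, j] = (euler_flux(U + dU) - euler_flux(U - dU)) / (2.0 * h)
    return A


# Hypothetical state: rho = 1.2 kg/m^3, u = 50 m/s, e = 2.5e5 J/kg
U = np.array([1.2, 1.2 * 50.0, 1.2 * 2.5e5])
A = flux_jacobian(U)
print(A)
```

For a perfect gas the inviscid flux is homogeneous of degree one in U, so the product A U reproduces F(U). This provides a simple consistency check on the Jacobian.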

10.3 Pressure-Based Block Solver

Rationale
We have shown there exists a fundamental limitation of density-based solvers
close to the incompressibility limit. At the same time, based on the flow
classification based on Ma number, for Ma < 0.3 the compressibility effects
are negligible. This covers a large proportion of flow regimes
Idea: assemble the solution algorithm capable of handling the low Mach
number limit and extend it to compressible flow. Formally, such a method
should be able to simulate the flow at all speeds
A critical part here is handling the incompressibility limit: this is what we
will examine below
Block Pressure-Momentum Solution
Looking at basic discretisation techniques, we can handle the momentum
equation without any problems, apart from the pressure gradient term. If
pressure were known, its gradient could be easily evaluated; however, we
need to create an implicit form of the operator
The same applies for the velocity divergence term with an additional complication: $\nabla \cdot \mathbf{u}$ needs to be expressed in terms of pressure as a working
variable
This technique leads to the saddle-point system mentioned above

10.3.1 Gradient and Divergence Operator

Returning to the discretisation of the gradient and divergence terms given above, we
shall now repeat the procedure, attempting to assemble an implicit form

Gradient Operator
We shall only show the discretisation for the Gauss gradient; least square
and other techniques can be assembled in an equivalent manner
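Discretised form of the Gauss theorem splits into a sum of face integrals

$$\int_{V_P} \nabla \phi \, dV = \oint_{\partial V_P} \mathbf{n} \phi \, dS = \sum_f \mathbf{s}_f \phi_f \qquad (10.14)$$

It still remains to evaluate the face value of $\phi$. Consistently with second-order discretisation, we shall assume linear variation between P and N

$$\phi_f = f_x \phi_P + (1 - f_x) \phi_N \qquad (10.15)$$

Combining the above, the gradient can be assembled as follows

$$(\nabla \phi)_P = \mathbf{a}_P \phi_P + \sum_N \mathbf{a}_N \phi_N \qquad (10.16)$$

where

$$\mathbf{a}_N = \frac{(1 - f_x) \, \mathbf{s}_f}{V_P} \qquad (10.17)$$

and

$$\mathbf{a}_P = \frac{\sum_f f_x \, \mathbf{s}_f}{V_P} \qquad (10.18)$$

Note that both $\mathbf{a}_P$ and $\mathbf{a}_N$ are vectors: multiplying a scalar field produces
a gradient (vector field)
For a uniform mesh ($f_x$ = const.), $\mathbf{a}_P = 0$! This is because for a closed cell
$\sum_f \mathbf{s}_f = 0$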
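
The coefficients of Eqs. (10.17)-(10.18) can be evaluated for the simplest possible case. The sketch assumes an interior cell of a uniform 1-D mesh, where each cell has two faces with unit face area vectors of opposite sign, the cell volume equals dx and $f_x$ = 0.5. In 1-D the vector coefficients reduce to signed scalars.

```python
import numpy as np


def gauss_gradient_coeffs(dx, fx=0.5):
    """Gradient coefficients for an interior cell of a uniform 1-D mesh.

    Face area vectors: s_f = -1 (west face) and +1 (east face); V_P = dx.
    """
    face_vectors = np.array([-1.0, 1.0])
    a_N = (1.0 - fx) * face_vectors / dx     # one coefficient per neighbour
    a_P = np.sum(fx * face_vectors) / dx     # vanishes: the cell is closed
    return a_P, a_N


def gauss_gradient(phi, dx):
    """Apply the coefficients to interior cells (boundary cells left as NaN)."""
    a_P, a_N = gauss_gradient_coeffs(dx)
    grad = np.full(phi.size, np.nan)
    grad[1:-1] = a_P * phi[1:-1] + a_N[0] * phi[:-2] + a_N[1] * phi[2:]
    return grad


dx = 0.1
x = (np.arange(10) + 0.5) * dx
phi = 3.0 * x + 1.0              # linear field with gradient 3
a_P, a_N = gauss_gradient_coeffs(dx)
grad = gauss_gradient(phi, dx)
print(a_P, a_N, grad[1:-1])
```

The diagonal coefficient is exactly zero and the two neighbour coefficients reproduce the central difference. The zero diagonal reappears when the gradient and divergence operators are placed into the block pressure-velocity matrix.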

Divergence Operator

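The divergence operator is assembled in an equivalent manner. A divergence of a vector field u is evaluated as follows:

$$\int_{V_P} \nabla \cdot \mathbf{u} \, dV = \oint_{\partial V_P} \mathbf{n} \cdot \mathbf{u} \, dS = \sum_f \mathbf{s}_f \cdot \mathbf{u}_f \qquad (10.19)$$

Equivalent to the gradient operator discretisation, it follows:

$$(\nabla \cdot \mathbf{u})_P = \mathbf{a}_P \cdot \mathbf{u}_P + \sum_N \mathbf{a}_N \cdot \mathbf{u}_N \qquad (10.20)$$

where

$$\mathbf{a}_N = \frac{(1 - f_x) \, \mathbf{s}_f}{V_P} \qquad (10.21)$$

and

$$\mathbf{a}_P = \frac{\sum_f f_x \, \mathbf{s}_f}{V_P} \qquad (10.22)$$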

Note that the coefficients are equivalent to the gradient operator, but here
we have the inner product of two vectors, producing a scalar

10.3.2 Block Solution Techniques for a Pressure-Based Solver

Pressure-Based Block Solver


Discretisation of the gradient and divergence operator above allows us to
assemble the block pressure-velocity system as promised
The system can be readily solved using the direct solver (note the zeros on
the diagonal of the pressure matrix). However, this is massively expensive
and we need to find a better way to handle the system
Solver Technology
Zero diagonal entries exclude a majority of iterative solvers: any Gauss-Seidel technique is excluded
There exists a set of iterative techniques for saddle systems which may be of
use. Typically, they combine a Krylov-space solver (operating on residual
vectors) with special preconditioners for saddle systems
We shall examine one such technique below, as a part of derivation of the
pressure equation

10.4 Segregated Pressure-Based Solver

10.4

147

Segregated Pressure-Based Solver

Segregated Solution Procedure


Currently, a pressure-based block solver does not look very attractive: a large matrix, combining different variables and equations of different nature, with uncertain performance of linear equation solvers
A step forward could be achieved by deriving a proper equation governing pressure and assembling a coupling algorithm. In this way, momentum and pressure could be solved separately (1/4 of the storage requirement of the block- or density-based solver) and handled by an external coupling algorithm
In any case, the first step would be a derivation of the pressure equation, which will be examined below

10.4.1 Derivation of the Pressure Equation

Pressure Equation as a Schur Complement


Consider a general block matrix system $M$, consisting of 4 block matrices, $A$, $B$, $C$ and $D$, which are respectively $p \times p$, $p \times q$, $q \times p$ and $q \times q$ matrices, and $A$ is invertible:
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \tag{10.23}$$
This structure will arise naturally when trying to solve a block system of equations
$$A x + B y = a \tag{10.24}$$
$$C x + D y = b \tag{10.25}$$
The Schur complement arises when trying to eliminate $x$ from the system using partial Gaussian elimination by multiplying the first row with $A^{-1}$:
$$A^{-1} A x + A^{-1} B y = A^{-1} a \tag{10.26}$$
and
$$x = A^{-1} a - A^{-1} B y. \tag{10.27}$$
Substituting the above into the second row:
$$(D - C A^{-1} B)\, y = b - C A^{-1} a \tag{10.28}$$
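The elimination above translates directly into a few lines of linear algebra. The sketch below is illustrative only: the function name `solve_by_schur` and the small matrices are arbitrary assumptions, and the result is compared against a direct solve of the full block system.

```python
import numpy as np

def solve_by_schur(A, B, C, D, a, b):
    """Solve [[A, B], [C, D]] [x; y] = [a; b] by eliminating x (A invertible)."""
    Ainv_a = np.linalg.solve(A, a)      # A^{-1} a
    Ainv_B = np.linalg.solve(A, B)      # A^{-1} B
    S = D - C @ Ainv_B                  # Schur complement D - C A^{-1} B
    y = np.linalg.solve(S, b - C @ Ainv_a)
    x = Ainv_a - Ainv_B @ y             # back-substitution for x
    return x, y

# Arbitrary illustrative data with p = 2, q = 1
A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[1.0], [2.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[2.0]])
a = np.array([1.0, 2.0])
b = np.array([3.0])

x, y = solve_by_schur(A, B, C, D, a, b)

# Reference: solve the assembled block matrix M directly
M = np.block([[A, B], [C, D]])
xy_direct = np.linalg.solve(M, np.concatenate([a, b]))
```

Note that the sketch never forms $A^{-1}$ explicitly; it solves with $A$ instead, which is the usual practice for anything but tiny matrices.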


Let us repeat the same set of operations on the block form of the pressure-velocity system, attempting to assemble a pressure equation. Note that the operators in the block system could be considered both as differential operators and in a discretised form
$$\begin{bmatrix} [A_u] & [\nabla(.)] \\ [\nabla\cdot(.)] & [0] \end{bmatrix} \begin{bmatrix} \mathbf{u} \\ p \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{10.29}$$
Formally, this leads to the following form of the pressure equation:
$$[\nabla\cdot(.)][A_u^{-1}][\nabla(.)][p] = 0 \tag{10.30}$$

Here, $[A_u^{-1}]$ represents the inverse of the momentum matrix in the discretised form, which acts as diffusivity in the Laplace equation for the pressure.
From the above, it is clear that the governing equation for the pressure is a Laplacian, with the momentum matrix acting as a diffusion coefficient. However, the form of the operator is very inconvenient:
While $[A_u]$ is a sparse matrix, its inverse is likely to be dense
The discretised forms of the divergence and gradient operators are sparse and well-behaved. However, a triple product with $[A_u^{-1}]$ would result in a dense matrix, making it expensive to solve
The above can be remedied by decomposing the momentum matrix before the triple product into the diagonal part and off-diagonal matrix:
$$[A_u] = [D_u] + [LU_u], \tag{10.31}$$
where $[D_u]$ only contains diagonal entries. $[D_u]$ is easy to invert and will preserve the sparseness pattern in the triple product. Revisiting Eqn. (10.29) before the formation of the Schur complement and moving the off-diagonal component of $[A_u]$ onto r.h.s. yields:

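$$\begin{bmatrix} [D_u] & [\nabla(.)] \\ [\nabla\cdot(.)] & [0] \end{bmatrix} \begin{bmatrix} \mathbf{u} \\ p \end{bmatrix} = \begin{bmatrix} -[LU_u][\mathbf{u}] \\ 0 \end{bmatrix} \tag{10.32}$$
A revised formulation of the pressure equation via the Schur complement yields:
$$[\nabla\cdot(.)][D_u^{-1}][\nabla(.)][p] = -[\nabla\cdot(.)][D_u^{-1}][LU_u][\mathbf{u}] \tag{10.33}$$
In both cases, matrix $[D_u^{-1}]$ is simple to assemble.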


It follows that the pressure equation is a Poisson equation with the diagonal
part of the discretised momentum acting as diffusivity and the divergence
of the velocity on the r.h.s.
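The step from Eqn. (10.32) to Eqn. (10.33) can be written out row by row. The following is a sketch using the symbols introduced above; the signs assume that the off-diagonal product $[LU_u][\mathbf{u}]$ has been moved from the l.h.s. to the r.h.s. of the momentum row.

```latex
\begin{align*}
[D_u][\mathbf{u}] + [\nabla(.)][p] &= -[LU_u][\mathbf{u}]
  && \text{(momentum row)} \\
[\mathbf{u}] &= -[D_u^{-1}]\left([LU_u][\mathbf{u}] + [\nabla(.)][p]\right)
  && \text{(invert the diagonal)} \\
[\nabla\cdot(.)][\mathbf{u}] &= 0
  && \text{(continuity row)} \\
\Rightarrow\quad [\nabla\cdot(.)][D_u^{-1}][\nabla(.)][p]
  &= -[\nabla\cdot(.)][D_u^{-1}][LU_u][\mathbf{u}]
  && \text{(substitute and rearrange)}
\end{align*}
```

Only the diagonal matrix is inverted, so the triple product on the l.h.s. keeps the sparseness pattern of a standard Laplacian.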


Derivation of the Pressure Equation


We shall now rewrite the above derivation formally without resorting to the assembly of the Schur complement in order to show the identical result
We shall start by discretising the momentum equation using the techniques described before. For the purposes of derivation, the pressure gradient term will remain in the differential form. For each CV, the discretised momentum equation yields:
$$a_P^u \mathbf{u}_P + \sum_N a_N^u \mathbf{u}_N = \mathbf{r} - \nabla p \tag{10.34}$$

For simplicity, we shall introduce the $\mathbf{H}(\mathbf{u})$ operator, containing the off-diagonal part of the momentum matrix and any associated r.h.s. contributions:
$$\mathbf{H}(\mathbf{u}) = \mathbf{r} - \sum_N a_N^u \mathbf{u}_N \tag{10.35}$$

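Using the above, it follows:
$$a_P^u \mathbf{u}_P = \mathbf{H}(\mathbf{u}) - \nabla p \tag{10.36}$$
and
$$\mathbf{u}_P = (a_P^u)^{-1}\left(\mathbf{H}(\mathbf{u}) - \nabla p\right) \tag{10.37}$$
Substituting the expression for $\mathbf{u}_P$ into the incompressible continuity equation $\nabla\cdot\mathbf{u} = 0$ yields
$$\nabla\cdot\left[(a_P^u)^{-1}\nabla p\right] = \nabla\cdot\left((a_P^u)^{-1}\mathbf{H}(\mathbf{u})\right) \tag{10.38}$$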

We have again arrived at the identical form of the pressure equation
Note the implied decomposition of the momentum matrix into the diagonal and off-diagonal contribution, where $a_P^u$ is a coefficient in the $[D_u]$ matrix and $\mathbf{H}(\mathbf{u})$ corresponds to the product $-[LU_u][\mathbf{u}]$ (plus any r.h.s. contributions), both appearing in the previous derivation
Assembling Conservative Fluxes
Pressure equation has been derived from the continuity condition and the
role of pressure is to guarantee a divergence-free velocity field


Looking at the discretised form of the continuity equation
$$\nabla\cdot\mathbf{u} = \sum_f \mathbf{s}_f\cdot\mathbf{u} = \sum_f F \tag{10.39}$$
where $F$ is the face flux
$$F = \mathbf{s}_f\cdot\mathbf{u} \tag{10.40}$$

Therefore, the conservative face flux should be created from the solution of the pressure equation. If we substitute the expression for $\mathbf{u}$ into the flux equation, it follows:
$$F = -(a_P^u)^{-1}\,\mathbf{s}_f\cdot\nabla p + (a_P^u)^{-1}\,\mathbf{s}_f\cdot\mathbf{H}(\mathbf{u}) \tag{10.41}$$
A part of the above, $(a_P^u)^{-1}\,\mathbf{s}_f\cdot\nabla p$, appears during the discretisation of the Laplacian, for each face. This is discretised as follows:
$$(a_P^u)^{-1}\,\mathbf{s}_f\cdot\nabla p = (a_P^u)^{-1}\frac{|\mathbf{s}_f|}{|\mathbf{d}|}(p_N - p_P) = a_N^p\,(p_N - p_P) \tag{10.42}$$
Here, $a_N^p = (a_P^u)^{-1}\frac{|\mathbf{s}_f|}{|\mathbf{d}|}$ is equal to the off-diagonal matrix coefficient in the pressure Laplacian
Note that in order for the face flux to be conservative, assembly of the flux must be completely consistent with the assembly of the pressure equation (e.g. non-orthogonal correction)

10.4.2 SIMPLE Algorithm and Related Methods

SIMPLE Algorithm
This is the earliest pressure-velocity coupling algorithm: Patankar and
Spalding, 1972 (Imperial College London)
SIMPLE: Semi-Implicit Method for Pressure-Linked Equations
Sequence of operations:
1. Guess the pressure field $p$
2. Solve the momentum equation using the guessed pressure. This step is called momentum predictor
$$a_P^u \mathbf{u}_P = \mathbf{H}(\mathbf{u}) - \nabla p \tag{10.43}$$
3. Calculate the new pressure based on the velocity field. This is called a pressure correction step
$$\nabla\cdot\left[(a_P^u)^{-1}\nabla p\right] = \nabla\cdot\left((a_P^u)^{-1}\mathbf{H}(\mathbf{u})\right) \tag{10.44}$$
4. Based on the pressure solution, assemble conservative face flux $F$
$$F = (a_P^u)^{-1}\,\mathbf{s}_f\cdot\mathbf{H}(\mathbf{u}) - a_N^p\,(p_N - p_P) \tag{10.45}$$
5. Repeat to convergence
Corrected velocity field may be obtained by substituting the new pressure field into the momentum equation:
$$\mathbf{u}_P = (a_P^u)^{-1}\left(\mathbf{H}(\mathbf{u}) - \nabla p\right) \tag{10.46}$$
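The sequence of operations can be mimicked on a small linear saddle-point system. This is only an algebraic analogue under stated assumptions, not a flow solver: the matrices `A`, `B` and `C` are hypothetical stand-ins for the momentum matrix, the gradient and the divergence operators, there is no non-linear convection term, and the pressure is relaxed explicitly with an assumed factor `alpha_p`.

```python
import numpy as np

def simple_linear(A, B, C, a, alpha_p=0.8, n_iter=200):
    """SIMPLE-like iteration for the saddle-point system
         A u + B p = a,   C u = 0
    (illustrative names; A, B, C stand in for momentum, gradient, divergence).
    """
    d = np.diag(A)                      # diagonal coefficients a_P
    LU = A - np.diag(d)                 # off-diagonal part [LU_u]
    S = C @ (B / d[:, None])            # "pressure Laplacian" C D^{-1} B
    p = np.zeros(B.shape[1])            # 1. guessed pressure
    for _ in range(n_iter):
        u = np.linalg.solve(A, a - B @ p)            # 2. momentum predictor
        H = a - LU @ u                               #    H(u) operator
        p_new = np.linalg.solve(S, C @ (H / d))      # 3. pressure equation
        u = (H - B @ p_new) / d                      #    corrected velocity
        p = p + alpha_p * (p_new - p)                #    explicit relaxation
    return u, p

# Small diagonally dominant test system (arbitrary illustrative data)
n = 6
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C = np.zeros((3, n))
for i in range(3):
    C[i, 2 * i], C[i, 2 * i + 1] = 1.0, -1.0
B = C.T
a = np.arange(1.0, n + 1.0)

u, p = simple_linear(A, B, C, a)

# Reference: direct solution of the full block system
M = np.block([[A, B], [C, np.zeros((3, 3))]])
ref = np.linalg.solve(M, np.concatenate([a, np.zeros(3)]))
```

The corrected velocity satisfies the discrete constraint `C @ u = 0` after every pass through the loop, while the pressure needs several iterations to settle; this mirrors the distinction between conservative fluxes and the pressure level in the full algorithm.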

Under-Relaxation
The algorithm in its base form produces a series of corrections on u and p.
Unfortunately, in the above form it will diverge!
Divergence is due to the fact that pressure correction contains both the
pressure as a physical variable and a component which forces the discrete
fluxes to become conservative
In order to achieve convergence, under-relaxation is used:
$$p^{**} = p^* + \alpha_P\,(p - p^*) \tag{10.47}$$
and
$$\mathbf{u}^{**} = \mathbf{u}^* + \alpha_U\,(\mathbf{u} - \mathbf{u}^*) \tag{10.48}$$
where $p$ and $\mathbf{u}$ are the solution of the pressure and momentum equations and $\mathbf{u}^*$ and $p^*$ represent a series of pressure and velocity approximations. Note that in practice momentum under-relaxation is implicit and pressure (elliptic equation) is under-relaxed explicitly
$$\frac{a_P^u}{\alpha_U}\,\mathbf{u}_P = \mathbf{H}(\mathbf{u}) - \nabla p + \frac{1 - \alpha_U}{\alpha_U}\,a_P^u\,\mathbf{u}_P^* \tag{10.49}$$
$\alpha_P$ and $\alpha_U$ are the pressure and velocity under-relaxation factors. Some guidelines for choosing under-relaxation are
$$0 < \alpha_P \le 1 \tag{10.50}$$
$$0 < \alpha_U \le 1 \tag{10.51}$$
$$\alpha_P + \alpha_U \approx 1 \tag{10.52}$$
or the standard set (guidance only!!!)
$$\alpha_P = 0.2 \tag{10.53}$$
$$\alpha_U = 0.8 \tag{10.54}$$
Under-relaxation dampens the oscillation in the pressure-velocity coupling and is very efficient in stabilising the algorithm
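The two flavours of under-relaxation can be compared on a generic linear system $A x = b$. The sketch below is an assumption-laden illustration: the helper names `relax_explicit` and `relax_implicit` are hypothetical, and the test matrix is arbitrary. It shows that implicit relaxation strengthens the diagonal without changing the converged solution.

```python
import numpy as np

def relax_explicit(old, new, alpha):
    """Explicit under-relaxation: move only a fraction alpha towards `new`."""
    return old + alpha * (new - old)

def relax_implicit(A, b, x_old, alpha):
    """Implicit under-relaxation of A x = b: the diagonal is divided by alpha
    and the difference is added to the r.h.s. using the previous solution."""
    d = np.diag(A)
    A_rel = A + np.diag((1.0 - alpha) / alpha * d)   # diagonal becomes d/alpha
    b_rel = b + (1.0 - alpha) / alpha * d * x_old
    return A_rel, b_rel

# Arbitrary illustrative system
A = 4.0 * np.eye(5) - np.eye(5, k=1) - np.eye(5, k=-1)
b = np.ones(5)

x = np.zeros(5)
for _ in range(60):
    A_rel, b_rel = relax_implicit(A, b, x, alpha=0.8)
    x = np.linalg.solve(A_rel, b_rel)

x_exact = np.linalg.solve(A, b)          # unrelaxed reference solution
p_relaxed = relax_explicit(1.0, 2.0, alpha=0.2)
```

Once `x` stops changing, the extra diagonal and r.h.s. contributions cancel, which is why the relaxed system converges to the solution of the original one.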

10.4.3 PISO Algorithm

Pressure Correction Equation


SIMPLE algorithm prescribes that the momentum predictor will be solved
using the available pressure field. The role of pressure in the momentum
equation is to ensure that the velocity field is divergence free
After the first momentum solution, the velocity field is not divergence-free:
we used a guessed pressure field
Therefore, the pressure field after the first pressure corrector will contain
two parts
Physical pressure, consistent with the global flow field
A pressure correction component, which enforces the continuity and
counter-balances the error in the initial pressure guess
Only the first component should be built into the physical pressure field
In SIMPLE, this is handled by severely under-relaxing the pressure
Under-Relaxation and PISO
Having 2 under-relaxation coefficients which balance each other is very inconvenient: tuning is difficult
The idea of PISO is as follows:
The pressure-velocity system contains 2 complex coupling terms
Non-linear convection term, containing $\mathbf{u}$-$\mathbf{u}$ coupling
Linear pressure-velocity coupling
At low Courant number, Co (small time-step), the pressure-velocity coupling is much stronger than the non-linear coupling
It is therefore possible to repeat a number of pressure correctors without updating the discretisation of the momentum equation (using the new fluxes)

In such a setup, the first pressure corrector will create a conservative velocity field, while the second and following will establish the pressure distribution
Since multiple pressure correctors are used with a single momentum equation, it is no longer necessary to under-relax the pressure. In steady-state
simulations, the system is stabilised by momentum under-relaxation
On the negative side, derivation of PISO is based on the assumption that
momentum discretisation may be safely frozen through a series of pressure
correctors, which is true only at small time-steps
PISO Algorithm
PISO is very useful in simulations where the time-step is controlled by external issues and temporal accuracy is important. In such cases, the assumption of slow variation of the non-linearity holds and the cost of repeated momentum assembly and solution can be safely avoided. Example: Large Eddy Simulation
Sequence of operations:
1. Use the available pressure field $p$ from the previous corrector or time-step. Conservative fluxes corresponding to $p$ are also available
2. Discretise the momentum equation with the available flux field
3. Solve the momentum equation using the guessed pressure. This step is called momentum predictor
$$a_P^u \mathbf{u}_P = \mathbf{H}(\mathbf{u}) - \nabla p \tag{10.55}$$
4. Calculate the new pressure based on the velocity field. This is called a pressure correction step
$$\nabla\cdot\left[(a_P^u)^{-1}\nabla p\right] = \nabla\cdot\left((a_P^u)^{-1}\mathbf{H}(\mathbf{u})\right) \tag{10.56}$$
5. Based on the pressure solution, assemble conservative face flux $F$
$$F = (a_P^u)^{-1}\,\mathbf{s}_f\cdot\mathbf{H}(\mathbf{u}) - a_N^p\,(p_N - p_P) \tag{10.57}$$
6. Explicitly update cell-centred velocity field with the assembled momentum coefficients
$$\mathbf{u}_P = (a_P^u)^{-1}\left(\mathbf{H}(\mathbf{u}) - \nabla p\right) \tag{10.58}$$
7. Return to step 4 if convergence is not reached
8. Proceed from step 1 for a new time-step
A functional equivalent of the PISO algorithm is also used as a preconditioner in Krylov-space saddle-point solvers
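The corrector loop (steps 4 to 7) can be illustrated on a linear saddle-point system with a frozen momentum matrix. As before, this is a hedged algebraic sketch rather than a flow solver: `A`, `B`, `C` and the function name `piso_correctors` are assumptions, and no time-stepping or non-linearity is included.

```python
import numpy as np

def piso_correctors(A, B, C, a, p, n_corr=2):
    """One momentum predictor followed by n_corr pressure correctors for
         A u + B p = a,   C u = 0   (illustrative linear analogue)."""
    d = np.diag(A)
    LU = A - np.diag(d)
    S = C @ (B / d[:, None])                  # C D^{-1} B
    u = np.linalg.solve(A, a - B @ p)         # momentum predictor
    for _ in range(n_corr):                   # A is NOT re-assembled here
        H = a - LU @ u                        # H(u) from the latest velocity
        p = np.linalg.solve(S, C @ (H / d))   # pressure equation
        u = (H - B @ p) / d                   # explicit velocity update
    return u, p

# Same kind of arbitrary test system as one might use for SIMPLE
n = 6
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C = np.zeros((3, n))
for i in range(3):
    C[i, 2 * i], C[i, 2 * i + 1] = 1.0, -1.0
B = C.T
a = np.arange(1.0, n + 1.0)
p0 = np.zeros(3)

u2, p2 = piso_correctors(A, B, C, a, p0, n_corr=2)     # typical usage
u60, p60 = piso_correctors(A, B, C, a, p0, n_corr=60)  # run to convergence

M = np.block([[A, B], [C, np.zeros((3, 3))]])
ref = np.linalg.solve(M, np.concatenate([a, np.zeros(3)]))
```

Already after the first corrector the velocity satisfies `C @ u = 0`; further correctors only improve the balance between pressure and the off-diagonal momentum terms, and no pressure under-relaxation is needed in this setting.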

10.4.4 Pressure Checkerboarding Problem

Checkerboarded Pressure Distribution

In early variants of pressure-velocity coupling algorithms an interesting error was noticed, completely invalidating the results: pressure checkerboarding. A pressure field with 1-cell oscillation seemed to satisfy the discretised equations just as well as a uniform field. An algorithm which cannot discriminate between a uniform and a checkerboarded pressure distribution is useless for practical purposes. We shall now examine the cause and possible solutions for the checkerboarding problem

Figure 10.1: Checkerboarded pressure distribution.

Figure 10.2: Checkerboarded pressure distribution.

Checkerboarding Error
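As shown above, the derived form of the pressure equation contains a Laplace operator
$$\nabla\cdot\left[(a_P^u)^{-1}\nabla p\right] = \nabla\cdot\left((a_P^u)^{-1}\mathbf{H}(\mathbf{u})\right)$$
We have also derived the matrix equivalent of the pressure equation using the Schur complement in the following form:
$$[\nabla\cdot(.)][D_u^{-1}][\nabla(.)][p] = -[\nabla\cdot(.)][D_u^{-1}][LU_u][\mathbf{u}]$$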


In both cases, $(a_P^u)^{-1}$ or $[D_u^{-1}]$ acts as a diffusion coefficient and can be safely neglected as a pre-factor in the analysis that follows
The matrix equivalent can, as a triple product, be read as follows:
Create the discretisation for the gradient term
Interpolate it to the face (and multiply by the diffusion coefficient)
Assemble the divergence term with the interpolated pressure gradient
An equivalent procedure can be seen when taking the (discrete) divergence of the discretised momentum equation:
$$\mathbf{u}_P = (a_P^u)^{-1}\mathbf{H}(\mathbf{u}) - (a_P^u)^{-1}\nabla p \quad \big/\ \nabla\cdot \tag{10.59}$$

Here, the last term may require the interpolation of the pressure gradient.
Computational Molecule
The cause of checkerboarding error becomes clear when we examine the
implied discretised form.
A cell-centred gradient is evaluated using the values in neighbouring cells. Note that for $(\nabla p)_P$ the cell centre $P$ does not appear in the discretisation

Figure 10.3: Cell-centred gradient (label: pressure gradient).


A divergence operator requires the gradient to be interpolated to the cell
face in order to assemble the divergence term. Symmetrically, on the opposite face, the interpolated gradient will use four computational points
around the face
Points around the cell $P$ will appear in computational molecules for both interpolated gradients appearing in the $\nabla\cdot$ operator for cell $P$. Since the face area vectors point in opposite directions for the two faces, the coefficients for the intermediate points will exactly cancel out!

Figure 10.4: Interpolated gradient (label: face interpolated pressure gradient).


As a result of coefficient cancellation in intermediate points, the computational molecule for the assembled Laplace operator does not feature the points immediately to the left and right of $P$, but it still forms a valid discretisation of the Laplacian!

Figure 10.5: Laplace operator with interpolated gradients.


Looking at the above it becomes clear why checkerboarding occurs: if we
evaluate the Laplacian using every other cell, a checkerboarded pressure
field appears as uniform and there is no correction to make
Figure 10.6: Comparison of computational molecules for the Laplace operator (panels: Laplace operator with interpolated gradients; standard Laplace operator).


Comparison of the two computational molecules clearly demonstrates the
problem and the way a standard discretisation of a Laplacian overcomes
the difficulty: a compact computational molecule of the standard discretisation leaves no room for checkerboarding errors

The solution to the problem is clearly related to the rearrangement of the computational molecule in the pressure Laplacian to compact support and will be examined below.
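The difference between the two computational molecules is easy to reproduce in one dimension. The sketch below assumes a uniform, periodic 1-D mesh with spacing `h` and unit diffusivity; the function names are hypothetical. It applies both Laplacians to a checkerboarded pressure field.

```python
import numpy as np

def laplacian_interpolated(p, h):
    """Laplacian built from interpolated cell-centred gradients (periodic 1-D)."""
    grad = (np.roll(p, -1) - np.roll(p, 1)) / (2.0 * h)   # cell-centred gradient
    grad_f = 0.5 * (grad + np.roll(grad, -1))             # gradient at face i+1/2
    return (grad_f - np.roll(grad_f, 1)) / h              # divergence over cell i

def laplacian_compact(p, h):
    """Standard compact-molecule Laplacian (periodic 1-D)."""
    return (np.roll(p, -1) - 2.0 * p + np.roll(p, 1)) / h**2

n, h = 8, 1.0
p_checker = np.array([(-1.0) ** i for i in range(n)])   # +1, -1, +1, ...

wide = laplacian_interpolated(p_checker, h)    # uses cells i-2, i, i+2 only
compact = laplacian_compact(p_checker, h)      # uses cells i-1, i, i+1
```

The wide molecule returns zero everywhere, so the oscillation is invisible to it, whereas the compact molecule responds with a value proportional to the local pressure and can therefore remove the oscillation.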

10.4.5 Staggered and Collocated Variable Arrangement

Staggered Variable Arrangement


The issue of checkerboarding arises from the fact that interpolated velocity
in the divergence operator contains a cell-centred pressure gradient. This
results in an expanded molecule for the discretised Laplacian
At the time, the FVM was strictly a (2-D) structured mesh technique and
the offered solution was to stagger the computational locations where p and
u are stored.

Figure 10.7: Staggered variable arrangement (labels: $p$, $u_x$, $u_y$).


Note that components of the velocity vectors are now stored in separate
locations and both are staggered: they formally represent face flux as well
as the velocity component
With the above, no interpolation is necessary and the pressure Laplacian
appears with compact support
Unfortunately, the staggered variable arrangement is useless on any but the simplest of meshes: for all other shapes the problem would be either under- or over-constrained. A more general solution is required
There exists a pressure-flux formulation, but this is beyond our scope at this time
Collocated Variable Arrangement
The second approach to resolving the checkerboarding problem is to recognise that the issue boils down to the calculation of the face-based pressure gradient. In the original form (above), the face pressure gradient is obtained by interpolation:
$$(\nabla p)_f = f_x (\nabla p)_P + (1 - f_x)(\nabla p)_N \tag{10.60}$$
The face gradient is then used in the dot-product with the face area vector, $\mathbf{s}\cdot(\nabla p)_f$
In the discretisation of the Laplace operator we have also come across the expression $\mathbf{s}\cdot(\nabla p)_f$, which was discretised as follows:
$$\mathbf{s}\cdot(\nabla p)_f = \frac{|\mathbf{s}|}{|\mathbf{d}|}(p_N - p_P) \tag{10.61}$$
This formula results in compact support of the Laplacian and resolves the problem
We can arrive at the collocated formulation in several ways:
Delayed discretisation of the pressure gradient. Recognising that the pressure equation contains a Laplace operator, we shall delay the discretisation of the $\nabla p$ term in the momentum equation. Once the pressure equation is assembled, the Laplace operator is discretised in the usual way
Rhie-Chow interpolation. In order to manufacture the coefficients for compact pressure support, we will create a special formula for velocity interpolation, which will separate the gradient term. Thus:
$$\mathbf{u}_f = f_x \mathbf{u}_P + (1 - f_x)\mathbf{u}_N + (a_P^u)^{-1}_f\,\hat{\mathbf{n}}\left[\hat{\mathbf{n}}\cdot(\nabla p)_f - \frac{p_N - p_P}{|\mathbf{d}|}\right] \tag{10.62}$$
Here, $(a_P^u)^{-1}_f$ is the face interpolate of the diagonal coefficient of the momentum equation, $\hat{\mathbf{n}}$ is a unit-normal vector in the direction of interest (parallel with the direction of interpolation, $\mathbf{d}$) and the expression in brackets represents two ways of evaluating the face-based pressure gradient
Interpolated cell-centred pressure gradient:
$$(\nabla p)_f = f_x (\nabla p)_P + (1 - f_x)(\nabla p)_N \tag{10.63}$$
Face-normal gradient
$$\hat{\mathbf{n}}\cdot(\nabla p)_f = \frac{p_N - p_P}{|\mathbf{d}|} \tag{10.64}$$
This term is introduced to remove the interpolated form of the gradient and replace it with a compact support, thus removing the cause of checkerboarding
Rhie-Chow interpolation (1983) marked a major step forward in CFD: truly complex geometries could now be handled, and it allowed for hybrid mesh types, embedded refinement and a number of other techniques
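A one-dimensional sketch shows how the correction term reacts to a checkerboarded pressure. Everything here is an illustrative assumption: a uniform periodic mesh with $f_x = 0.5$, a hypothetical function name, and a sign convention in which the interpolated gradient is replaced by the compact face-normal gradient.

```python
import numpy as np

def rhie_chow_face_velocity(u, p, a_P, h):
    """Face velocity at faces i+1/2 on a uniform periodic 1-D mesh (f_x = 0.5).

    The linearly interpolated velocity is corrected by the difference between
    the interpolated cell-centred pressure gradient and the compact gradient.
    """
    grad_p = (np.roll(p, -1) - np.roll(p, 1)) / (2.0 * h)   # cell-centred
    u_bar = 0.5 * (u + np.roll(u, -1))                      # linear interpolation
    inv_a_f = 0.5 * (1.0 / a_P + 1.0 / np.roll(a_P, -1))    # (a_P^u)^{-1} at face
    grad_interp = 0.5 * (grad_p + np.roll(grad_p, -1))      # interpolated gradient
    grad_compact = (np.roll(p, -1) - p) / h                 # (p_N - p_P)/|d|
    return u_bar + inv_a_f * (grad_interp - grad_compact)

n, h = 8, 1.0
u = np.ones(n)                       # uniform velocity
a_P = np.full(n, 2.0)                # arbitrary diagonal coefficients
p_checker = np.array([(-1.0) ** i for i in range(n)])
p_uniform = np.zeros(n)

u_f_checker = rhie_chow_face_velocity(u, p_checker, a_P, h)
u_f_uniform = rhie_chow_face_velocity(u, p_uniform, a_P, h)
```

For the uniform pressure the correction vanishes and the face velocity equals the interpolated one. For the checkerboarded pressure the interpolated gradient is zero but the compact gradient is not, so the face velocity (and hence the flux in the continuity equation) responds to the oscillation.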

10.4.6 Pressure Boundary Conditions and Global Continuity

Pressure and Velocity Boundary Condition


Momentum and pressure equations form a coupled set of equations. A consequence of this is a coupled behaviour of their boundary conditions: the prescribed conditions on $\mathbf{u}$ and $p$ need to act in unison. If this is not the case, the pressure-velocity system may be ill-posed and have no solution
The easiest way of examining the nature of boundary condition coupling is based on the semi-discretised form of the momentum equation:
$$\mathbf{u}_P = (a_P^u)^{-1}\left(\mathbf{H}(\mathbf{u}) - \nabla p\right) \tag{10.65}$$
1. On boundaries where $\mathbf{u}$ is prescribed, the value of pressure on the boundary is a part of the solution and cannot be enforced
2. If a boundary value of $p$ is given, the pressure gradient will balance the flow rate: thus, the flow rate is a part of the solution and cannot be enforced
There exists a profusion of pressure and velocity boundary conditions, e.g. fixed pressure inlet, pressure drop etc. which seem to invalidate the above. However, for a stable discretisation the actual implementation of the boundary condition will obey the above rules, with wrapping added for user convenience
Example: a fixed pressure inlet boundary condition will internally act as a
fixed velocity boundary condition. However, the value of fixed velocity will
be adjusted such that the pressure value (obtained as a part of the solution)
tends towards the one specified by the user


Enforcing Global Continuity


Note that the pressure equation is derived from a global continuity condition
$$\nabla\cdot\mathbf{u} = 0 \tag{10.66}$$

This condition should be satisfied for each cell and for the domain as a
whole
Looking at the formulation of the pressure-velocity system in incompressible
flows, we can establish that the absolute pressure level does not appear in
the equations: it is the pressure gradient that drives the flow
In some situations it is possible to have a set of boundary conditions where
the pressure level is unknown from its boundary conditions. In such cases,
two corrections are needed:
An indeterminate pressure level implies a zero eigenvalue in the pressure
matrix. In order to resolve such problems, the level of pressure will be
artificially fixed in one computational point
In order for the continuity equation to be satisfied for each cell, it also
needs to be satisfied for the complete domain. When a pressure level
is fixed by a boundary condition, global continuity will be enforced as
a part of the pressure solution. However, when this is not the case,
one needs to explicitly satisfy the condition after solving the pressure
equation.
Adjusting global continuity
1. Sum up the magnitude of all fluxes entering the domain
$$F_{in} = \sum |F|; \quad F < 0 \tag{10.67}$$
2. Separately, sum up all the fluxes leaving the domain
$$F_{out} = \sum |F|; \quad F > 0 \tag{10.68}$$
3. Adjust the out-going fluxes such that $F_{in} = F_{out}$
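The three steps can be sketched as a short function. This is a minimal illustration, assuming that the boundary face fluxes are available as a flat array and that a uniform scaling of the outgoing fluxes is an acceptable way to perform the adjustment; the function name is hypothetical.

```python
import numpy as np

def adjust_global_continuity(F):
    """Scale outgoing boundary fluxes so that total inflow equals outflow.

    F : boundary face fluxes, F < 0 entering and F > 0 leaving the domain.
    """
    F = np.asarray(F, dtype=float)
    F_in = np.abs(F[F < 0.0]).sum()      # step 1: total incoming flux
    F_out = np.abs(F[F > 0.0]).sum()     # step 2: total outgoing flux
    if F_out == 0.0:
        raise ValueError("no outgoing flux to adjust")
    adjusted = F.copy()
    adjusted[F > 0.0] *= F_in / F_out    # step 3: enforce F_in = F_out
    return adjusted

# Arbitrary example: 3.0 units enter, only 2.5 leave before adjustment
F_adj = adjust_global_continuity([-2.0, -1.0, 1.0, 1.5])
```

After the adjustment the boundary fluxes sum to zero, which is the compatibility condition the pressure equation needs when the pressure level is not fixed by a boundary condition.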

Chapter 11
Compressible Pressure-Based Solver

11.1 Handling Compressibility Effects in Pressure-Based Solvers

In this Chapter we shall repeat the derivation of the pressure-based solver for
compressible flows. The idea behind the derivation is that a pressure-based
algorithm and pressure-velocity coupling do not suffer from singularity in the
incompressible limit and may behave better across the range of speeds. Memory
usage for a segregated solver is also considerably lower than for the coupled one,
which may be useful in large-scale simulations.
The issue that remains to be resolved is the derivation of the pressure equation
and the momentum-pressure-energy coupling procedure
Compressibility Effects
Compressible form of the continuity equation introduces density into the system
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0 \tag{11.1}$$
In the analysis, we shall attempt to derive the equation set in general terms. For external aerodynamics, it is typical to use the ideal gas law as the constitutive relation connecting pressure $P$ and density $\rho$:
$$\rho = \frac{P}{RT} = \psi P \tag{11.2}$$
where $\psi$ is compressibility:
$$\psi = \frac{1}{RT} \tag{11.3}$$
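As a quick numerical illustration of the constitutive relation, the sketch below evaluates $\psi$ and $\rho$ for assumed values typical of dry air; the gas constant, temperature and pressure are illustrative assumptions, not values taken from these notes.

```python
def compressibility(R, T):
    """psi = 1 / (R T) for an ideal gas."""
    return 1.0 / (R * T)

def density(P, R, T):
    """rho = psi * P for an ideal gas."""
    return compressibility(R, T) * P

# Assumed illustrative values: dry air, R = 287 J/(kg K), T = 300 K, P = 101325 Pa
R_air, T, P = 287.0, 300.0, 101325.0
psi = compressibility(R_air, T)
rho = density(P, R_air, T)          # roughly 1.18 kg/m^3

# At fixed temperature the density is linear in pressure with slope psi
dP = 10.0
drho_dP = (density(P + dP, R_air, T) - density(P, R_air, T)) / dP
```

The linear dependence of $\rho$ on $P$ at fixed temperature is what allows the density equation to be recast as an equation for the pressure.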


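The principle is the same for more general expressions. In this case, the presence of density also couples in the energy equation because temperature $T$ appears in the constitutive relation
$$\frac{\partial(\rho e)}{\partial t} + \nabla\cdot(\rho e\mathbf{u}) - \nabla\cdot(\lambda\nabla T) = \rho\mathbf{g}\cdot\mathbf{u} - \nabla\cdot(P\mathbf{u}) - \nabla\cdot\left(\frac{2}{3}\mu(\nabla\cdot\mathbf{u})\mathbf{u}\right) + \nabla\cdot\left[\mu\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right)\cdot\mathbf{u}\right] + \rho Q, \tag{11.4}$$
Momentum equation is in a form very similar to before: note the presence of (non-constant) density in all terms. Also, unlike the incompressible form, we shall now deal with dynamic pressure and viscosity in place of their kinematic equivalents
$$\frac{\partial(\rho\mathbf{u})}{\partial t} + \nabla\cdot(\rho\mathbf{u}\mathbf{u}) - \nabla\cdot\left[\mu\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right)\right] = \rho\mathbf{g} - \nabla\left(P + \frac{2}{3}\mu\nabla\cdot\mathbf{u}\right) \tag{11.5}$$
In the incompressible form, the $\nabla\cdot\left[\mu(\nabla\mathbf{u})^T\right]$ term was dropped due to $\nabla\cdot\mathbf{u} = 0$:
$$\nabla\cdot\left[\mu(\nabla\mathbf{u})^T\right] = \nabla\mathbf{u}\cdot\nabla\mu + \mu\nabla(\nabla\cdot\mathbf{u}) \tag{11.6}$$
where the first term disappears for $\mu$ = const. and the second for $\nabla\cdot\mathbf{u} = 0$. In compressible flows, this is not the case and the term remains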

11.2 Derivation of the Pressure Equation in Compressible Flows

Compressible Pressure Equation


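The basic idea in the derivation is identical to the incompressible formulation: we shall use the semi-discretised form of the momentum equation
$$a_P^u \mathbf{u}_P = \mathbf{H}(\mathbf{u}) - \nabla P \tag{11.7}$$
and
$$\mathbf{u}_P = (a_P^u)^{-1}\left(\mathbf{H}(\mathbf{u}) - \nabla P\right) \tag{11.8}$$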

Substituting this into the continuity equation will not yield the pressure
equation directly: we need to handle the density-pressure relation


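The first step is the transformation of the rate-of-change term. Using the chain rule on $\rho = \rho(P, \ldots)$, it follows:
$$\frac{\partial\rho}{\partial t} = \frac{\partial\rho}{\partial P}\frac{\partial P}{\partial t} \tag{11.9}$$
From the ideal gas law, it follows
$$\frac{\partial\rho}{\partial P} = \psi \tag{11.10}$$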

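Looking at the divergence term, we will substitute the expression for $\mathbf{u}$ and try to present $\rho$ in terms of $P$ as appropriate
$$\nabla\cdot(\rho\mathbf{u}) = \nabla\cdot\left[\rho(a_P^u)^{-1}\mathbf{H}(\mathbf{u})\right] - \nabla\cdot\left[\rho(a_P^u)^{-1}\nabla P\right] \tag{11.11}$$
The first term is under divergence and we will attempt to convert it into a convection term. Using $\rho = \psi P$, it follows:
$$\nabla\cdot\left[\rho(a_P^u)^{-1}\mathbf{H}(\mathbf{u})\right] = \nabla\cdot\left[\psi P(a_P^u)^{-1}\mathbf{H}(\mathbf{u})\right] = \nabla\cdot(F_p P) \tag{11.12}$$
where $F_p$ is the flux featuring in the convective effects in the pressure:
$$F_p = \psi(a_P^u)^{-1}\mathbf{H}(\mathbf{u}) \tag{11.13}$$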

The second term produces a Laplace operator similar to the incompressible form and needs to be preserved. The working variable is pressure and we will leave the term in the current form. Note the additional $\rho$ pre-factor, which will remain untouched; otherwise the term would be a non-linear function of $P$
Combining the above, we reach the compressible form of the pressure equation:
$$\frac{\partial(\psi P)}{\partial t} + \nabla\cdot\left[\psi(a_P^u)^{-1}\mathbf{H}(\mathbf{u})P\right] - \nabla\cdot\left[\rho(a_P^u)^{-1}\nabla P\right] = 0 \tag{11.14}$$
A pleasant surprise is that the pressure equation is in standard form: it consists of a rate of change, convection and diffusion terms. This is good news: discretisation of a standard form can be handled in a stable, accurate and bounded manner. Note, however, that the flux $F_p$ is not a volume/mass flux as was the case before

11.3 Pressure-Velocity-Energy Coupling

Discretised Pressure-Velocity System


Let us review the set of equations for the compressible system
Discretisation of the momentum equation is performed in the standard way. Pressure gradient term is left in a differential form:
$$a_P^u \mathbf{u}_P = \mathbf{H}(\mathbf{u}) - \nabla P \tag{11.15}$$
Using the elements of the momentum equation, a sonic flux is assembled as:
$$F_p = \psi(a_P^u)^{-1}\mathbf{H}(\mathbf{u}) \tag{11.16}$$
Pressure equation is derived by substituting the expression for $\mathbf{u}$ and expressing density in terms of pressure
$$\frac{\partial(\psi P)}{\partial t} + \nabla\cdot(F_p P) - \nabla\cdot\left[\rho(a_P^u)^{-1}\nabla P\right] = 0 \tag{11.17}$$
The face flux expression is assembled in a similar way as before
$$F = \mathbf{s}_f\cdot\left[(a_P^u)^{-1}\mathbf{H}(\mathbf{u})\right]_f \psi_f P_f - \rho(a_P^u)^{-1}\,\mathbf{s}_f\cdot\nabla P \tag{11.18}$$
and is evaluated from the pressure solution
Density can be evaluated either from the constitutive relation:
$$\rho = \frac{P}{RT} = \psi P \tag{11.19}$$
or from the continuity equation. Note that at this stage the face flux (= velocity field) is known and the equation can be explicitly evaluated for $\rho$
Depending on the kind of physics and the level of coupling, the energy equation may or may not be added to the above. It is in standard form but contains source and sink terms which need to be considered with care
Coupling Algorithm
The pressure-velocity coupling issue in compressible flows is identical to its
incompressible equivalent: in order to solve the momentum equation, we
need to know the pressure, whose role is to impose the continuity constraint
on the velocity


In the limit of zero Ma number, the pressure equation reduces to its incompressible form
With this in mind, we can re-use the incompressible coupling algorithms:
SIMPLE and PISO
In cases of rapidly changing temperature distribution (because of the changes
in source/sink terms in the energy equation), changing temperature will
considerably change the compressibility $\psi$. For correct results, coupling
between pressure and temperature needs to be preserved and the energy
equation is added into the loop
Boundary Conditions
We have shown that for incompressible flows boundary conditions on pressure and velocity are not independent: the two equations are coupled and a badly posed set of boundary conditions may result in an ill-defined system
In compressible flows, we need to account for 3 variables ($\rho$, $\mathbf{u}$, $e$) handled together. The issue is the same: the number of prescribed values at the boundary depends on the number of characteristics pointing into the domain:
Supersonic inlet: 3 variables are specified
Subsonic inlet: 2 variables
Subsonic outlet: 1 variable
Supersonic outlet: no variables
Inappropriate specification of boundary conditions or location of boundaries may result in an ill-defined problem: numerical garbage
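The list above can be encoded as a small lookup, which may be handy as a sanity check when setting up a case. The function name and its arguments are hypothetical; the Mach number is assumed to be the local value normal to the boundary, and the sonic case is treated as subsonic purely for simplicity.

```python
def n_prescribed_variables(mach, boundary):
    """Number of variables to prescribe at a boundary, following the list above.

    mach     : local Mach number normal to the boundary (non-negative)
    boundary : "inlet" or "outlet"
    """
    if boundary not in ("inlet", "outlet"):
        raise ValueError("boundary must be 'inlet' or 'outlet'")
    supersonic = mach > 1.0
    if boundary == "inlet":
        return 3 if supersonic else 2    # characteristics entering the domain
    return 0 if supersonic else 1

counts = {
    "supersonic inlet": n_prescribed_variables(2.0, "inlet"),
    "subsonic inlet": n_prescribed_variables(0.3, "inlet"),
    "subsonic outlet": n_prescribed_variables(0.3, "outlet"),
    "supersonic outlet": n_prescribed_variables(2.0, "outlet"),
}
```

Which variables are prescribed (for example total pressure and temperature at a subsonic inlet) is a separate modelling choice that this count does not address.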

11.4 Additional Coupled Equations

Coupling to Other Equations


Compared with the importance and strength of pressure-velocity (or pressure-velocity-energy) coupling, other equations that appear in the system are
coupled more loosely
We shall consider two typical sets of equations: turbulence and chemical
reactions


Turbulence
Simple turbulence models are based on the Boussinesq approximation, where $\mu_t$ acts as turbulent viscosity. Coupling of turbulence to the momentum equation is relatively benign: the Laplace operator will handle it without trouble
In all cases, momentum to turbulence coupling will thus be handled in a segregated manner
In 2-equation models, the coupling between two equations may be strong (depending on the model formulation). Thus, turbulence equations may be solved together; keep in mind that only linear coupling may be made implicit
A special case is the Reynolds stress transport model (RSTM): the momentum equation is formally saddle-point with respect to $R$; $R$ is governed by its own equation. In most cases, it is sufficient to handle RSTM as an explicit extension of the reduced 2-equation model (note that $k = \frac{1}{2}\mathrm{tr}(R)$). From time to time, the model will blow up, but careful discretisation usually handles it sufficiently well
Chemistry and Species
Chemical species equations are coupled to pressure and temperature, but
more strongly coupled to each other. Coupling to the rest of the system is
through material properties (which depend on the chemical composition of
the fluid) and temperature.
Only in rare cases is it possible to solve chemistry in a segregated manner:
a coupled chemistry solver is preferred
The second option is a 2-step strategy. Local equilibrium solution is sought
for chemical reactions using an ordinary differential equation (ODE) solver,
which is followed by a segregated transport step
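The 2-step strategy can be sketched as a simple operator-splitting loop. The example below is an assumption-heavy toy: a single first-order reaction A -> B whose ODE is integrated exactly per cell (standing in for a general ODE solver), followed by first-order upwind advection on a periodic 1-D mesh, one species at a time. All names and parameter values are hypothetical.

```python
import numpy as np

def reaction_step(yA, yB, k, dt):
    """Local ODE step for A -> B with rate k, integrated exactly in each cell."""
    decay = np.exp(-k * dt)
    return yA * decay, yB + yA * (1.0 - decay)

def transport_step(y, courant):
    """Segregated transport: first-order upwind advection, periodic 1-D mesh."""
    return y - courant * (y - np.roll(y, 1))

yA = np.zeros(20)
yA[5:10] = 1.0                     # initial blob of species A
yB = np.zeros(20)
total0 = (yA + yB).sum()

for _ in range(30):
    yA, yB = reaction_step(yA, yB, k=0.5, dt=0.1)   # step 1: chemistry
    yA = transport_step(yA, courant=0.4)            # step 2: transport of A
    yB = transport_step(yB, courant=0.4)            #         transport of B
```

Each sub-step conserves the total amount of A plus B on this periodic mesh, so the split scheme does as well; the splitting error only affects how the species are distributed in space and time.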

11.5 Comparison of Pressure-Based and Density-Based Solvers

Density-Based Solver
Coupled equations are solved together: flux formulation enforces the coupling and entropy condition


The solver is explicit and non-linear in nature: propagating waves. Extension to an implicit solver is approximate and done through linearisation
Limitations on the Courant number are handled specially: multigrid is a favoured acceleration technique
Problems exist at the incompressibility limit: the formulation breaks down
Pressure-Based Solver
Equation set is decoupled and each equation is solved in turn: segregated solver approach
Equation coupling is handled by evaluating the coupling terms from the available solution and updating equations in an iteration loop
Density equation is reformulated as an equation for the pressure. In the incompressible limit, it reduces to the pressure-velocity system described above: incompressible flows are handled naturally
Equation segregation implies that matrices are created and inverted one at a time, re-using the storage released from the previous equation. This results in a considerably lower overall storage requirement
Flux calculation is performed one equation at a time, consistent with the
segregated approach. As a consequence, the entropy condition is regularly
violated (!)
Variable Density or Transonic Formulation
To follow the discussion, note that the cost of solving an elliptic equation (characterised by a symmetric matrix) is half of the equivalent cost for the asymmetric solver
For low Mach number or variable compressibility flows, it is known in advance that the pressure equation is dominated by the Laplace operator. The discretised version of it creates a symmetric matrix
In subsonic high-Ma or transonic flows, convection becomes more important. However, the changed nature of the equation (transport is local) makes it easier to solve
Variable compressibility formulation handles the convection explicitly: the matrix remains symmetric and the total cost is reduced with minimal impact on accuracy


Chapter 12
Turbulence Modelling for Aeronautical Applications

12.1 Nature and Importance of Turbulence

Why Model Turbulence?


The physics of turbulence is completely understood and described
in all its detail: turbulent fluid flow is strictly governed by the Navier-Stokes
equations
. . . but we do not like the answer very much!
Turbulence spans wide spatial and temporal scales
When described in terms of vortices (= eddies), non-linear interaction
is complex
Because of non-linear interactions and correlated nature, it cannot be
attacked statistically
It is not easy to assemble the results of full turbulent interaction and
describe them in a way relevant for engineering simulations: we are
more interested in mean properties of physical relevance
In spite of its complexity, there are a number of analytical, order-of-magnitude and quantitative results for simple turbulent flows. Some of them are extremely useful in model formulation
Mathematically, after more than 100 years of trying, we are nowhere near
describing turbulence the way we wish to


Handling Turbulent Flows


Turbulence is an irregular, disorderly, non-stationary, three-dimensional, highly
non-linear, irreversible, stochastic phenomenon
Characteristics of turbulent flows (Tennekes and Lumley: First Course in
Turbulence)
Randomness, meaning disorder and non-repeatability
Vorticality: high concentration and intensity of vorticity
Non-linearity and three-dimensionality
Continuity of Eddy Structure, reflected in a continuous spectrum
of fluctuations over a range of frequencies
Energy cascade, irreversibility and dissipativeness
Intermittency: turbulence may occupy only parts of the flow
domain
High diffusivity of momentum, energy, species etc.
Self-preservation and self-similarity: in simple flows, turbulence
structure depends only on local environment
Turbulence is characterised by higher diffusion rates: increase in drag, mixing, energy diffusion. In engineering machinery, this is sometimes welcome
and sometimes detrimental to the performance
Laminar-turbulent transition is a process where laminar flow naturally and
without external influence becomes turbulent. Example: instability of free
shear flows
Vortex Dynamics and Energy Cascade
A useful way of looking at turbulence is vortex dynamics.
Large-scale vortices are created by the flow. Through the process of
vortex stretching vortices are broken up into smaller vortices. This
moves the energy from large to smaller scales
Energy dissipation in the system scales with the velocity gradient,
which is largest in small vortices

[Figure: turbulence energy spectrum, energy versus wavenumber. Marked regions: energy scales (Taylor scale), inertial range, dissipation (Kolmogorov scale)]

The abscissa of the above is expressed in terms of wavenumber: how many
vortices fit into the space
Thus, we can recognise several parts of the energy cascade:
Large scale vortices, influenced by the shape of flow domain and global
flow field. Large scale turbulence is problematic: it is difficult to decide
which part of it is a coherent structure and which is actually turbulence
Energy-containing vortices, which contain the highest part of the turbulent kinetic energy. This scale is described by the Taylor scale
Inertial scale, where vortex stretching can be described by inertial
effects of vortex breakup
Small vortices, which contain low proportion of overall energy, but
contribute most of dissipation. This is also the smallest relevant scale
in turbulent flows, characterised by the Kolmogorov micro-scale
Note that all of turbulence kinetic energy eventually ends up dissipated as
heat, predominantly in small structures
Turbulence Modelling
The business of turbulence modelling can be described as:
We are trying to find approximate simplified solutions for the
Navier-Stokes equations in the manner that either describes turbulence in terms of mean properties or limits the spatial/temporal
resolution requirements associated with the full model


Turbulence modelling is therefore about manipulating equations and creating closed models in the form that allows us to simulate turbulence interaction under our own conditions. For example, a set of equations describing
mean properties would allow us to perform steady-state simulations when
only mean properties are of interest
We shall here examine three modelling frameworks
Direct Numerical Simulation (DNS)
Reynolds-Averaged Navier-Stokes Equations (RANS), including eddy viscosity models and higher moment closure. For compressible flows with significant compressibility effects, the averaging is actually of the Favre type
Large Eddy Simulation (LES)

12.2 Direct Numerical Simulation of Turbulence

Direct Numerical Simulation


DNS is, strictly speaking, not a turbulence model at all: we will simulate
all scales of interest in a well-resolved transient mode with sufficient spatial
and temporal resolution
In order to perform the simulation well, it is necessary to ensure sufficient
spatial and temporal resolution:
Spatial resolution: vortices smaller than the Kolmogorov scale will dissipate their energy before a full turn. Smaller flow features are of no
interest; the Kolmogorov scale is a function of the Re number
Temporal resolution is also related to Kolmogorov scale; but may be
adjusted for temporal accuracy
Computer resources are immense: we can really handle relatively modest
Re numbers and very simple geometry
. . . but this is the best way of gathering detailed information on turbulent
interaction: mean properties, first and second moments, two-point correlations etc. in full fields
In order to secure accurate higher moments, special numerics is used: e.g.
sixth order in time and tenth order in space will ensure that higher moments are not polluted numerically. An alternative is to use spectral models,
with Fourier modes or Chebyshev polynomials as a discretisation base


DNS simulations involve simple geometries and lots of averaging. Data is
assembled into large databases and typically used for validation or tuning
of proper turbulence models
DNS on engineering geometries is beyond reach: the benefit of more complete fluid flow data is not balanced by the massive cost involved in producing it
Current research frontier: compressible turbulence with basic chemical reactions, e.g. mixing of hydrogen and oxygen with combustion; buoyancy-driven flows
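The dependence of the Kolmogorov scale on the Re number can be turned into a rough cost estimate. The sketch below assumes the classical scaling L/eta ~ Re^(3/4) between the largest and smallest eddy sizes, which is not derived in these notes. The function name and the points_per_eta parameter are illustrative only.

```python
def dns_grid_points(reynolds, points_per_eta=1.0):
    """Rough number of mesh points for a 3-D DNS of a box of size L.

    Assumes L / eta ~ Re^(3/4) (classical Kolmogorov estimate) and
    points_per_eta mesh points per Kolmogorov length in each direction.
    """
    points_per_direction = points_per_eta * reynolds ** 0.75
    return points_per_direction ** 3


if __name__ == "__main__":
    for re in (1.0e3, 1.0e4, 1.0e5, 1.0e6):
        print(f"Re = {re:.0e}: about {dns_grid_points(re):.1e} points")
```

Under this assumption the point count grows as Re^(9/4), so a tenfold increase in Re costs roughly 180 times more mesh points before the time-step restriction is even considered.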

12.3 Reynolds-Averaged Turbulence Models

Reynolds Averaging
The rationale for Reynolds averaging is that we are not interested in the
part of flow solution that can be described as turbulent fluctuations:
instead, it is the mean (velocity, pressure, lift, drag) that is of interest.
Looking at turbulent flow, it may be steady in the mean in spite of turbulent
fluctuations. If this is so, and we manage to derive the equations for the
mean properties directly, we may reduce the cost by orders of magnitude:
It is no longer necessary to perform transient simulation and assemble
the averages: we are solving for average properties directly
Spatial resolution requirement is no longer governed by the Kolmogorov
micro-scale! We can tackle high Reynolds numbers and determine the
resolution based on required engineering accuracy
Reynolds Averaged Navier-Stokes Equations
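Repeating from above: decompose u and p into a mean and fluctuating
component:
$$\mathbf{u} = \overline{\mathbf{u}} + \mathbf{u}' \qquad (12.1)$$
$$p = \overline{p} + p' \qquad (12.2)$$
Substitute the above into original equations. Eliminate all terms containing
products of mean and fluctuating values
$$\frac{\partial \overline{\mathbf{u}}}{\partial t} + \nabla \cdot (\overline{\mathbf{u}}\,\overline{\mathbf{u}}) - \nabla \cdot (\nu \nabla \overline{\mathbf{u}}) = -\nabla \overline{p} + \nabla \cdot (\overline{\mathbf{u}'\mathbf{u}'}) \qquad (12.3)$$
$$\nabla \cdot \overline{\mathbf{u}} = 0 \qquad (12.4)$$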


One new term: the Reynolds stress tensor:
$$\mathbf{R} = \overline{\mathbf{u}'\mathbf{u}'} \qquad (12.5)$$

R is a second rank symmetric tensor. We have seen something similar
when the continuum mechanics equations were assembled, but with clear
separation of scales: molecular interaction is described as diffusion
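The decomposition into mean and fluctuation, and the resulting Reynolds stress, can be illustrated on sampled data. The following is a minimal sketch, assuming the mean is a plain arithmetic average over equally weighted samples at one point. The function names are hypothetical and only one component of the tensor is computed.

```python
def reynolds_decompose(samples):
    """Split a list of samples into its mean and the fluctuations about it."""
    mean = sum(samples) / len(samples)
    fluctuations = [value - mean for value in samples]
    return mean, fluctuations


def reynolds_stress_component(u_samples, v_samples):
    """Average of the product of fluctuations: one component of R."""
    _, u_prime = reynolds_decompose(u_samples)
    _, v_prime = reynolds_decompose(v_samples)
    products = [a * b for a, b in zip(u_prime, v_prime)]
    return sum(products) / len(products)


if __name__ == "__main__":
    u = [1.0, 3.0, 1.0, 3.0]
    v = [2.0, 0.0, 2.0, 0.0]
    mean_u, u_prime = reynolds_decompose(u)
    print("mean u:", mean_u, "mean of u':", sum(u_prime) / len(u_prime))
    print("R_uv:", reynolds_stress_component(u, v))
```

The fluctuations average to zero by construction, while the average of their product does not: this non-zero correlation is what survives in the averaged momentum equation.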
Modelling Paradigms
In order to close the system, we need to describe the unknown value, R as
a function of the solution. Two ways of doing this are:
1. Write an algebraic function, resulting in eddy viscosity models
$$\mathbf{R} = f(\overline{\mathbf{u}}, \overline{p}) \qquad (12.6)$$

2. Add more differential equations, i.e. a transport equation for R, producing Reynolds Transport Models. A note of warning: as we
keep introducing new equations, the above problem will recur. At the
end, option 1 will need to be used as some level of closure
Both options are in use today, but the first one massively outweighs the
second in practicality

12.3.1 Eddy Viscosity Models

Dimensional Analysis
Looking at R, the starting point is to find an appropriate symmetric second
rank tensor. Remember that the term appears in the equation under divergence
and acts as diffusion of momentum
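Based on this, the second rank tensor is the symmetric velocity gradient S:
$$\mathbf{R} = f(\mathbf{S}) \qquad (12.7)$$
where
$$\mathbf{S} = \frac{1}{2} \left[ \nabla \overline{\mathbf{u}} + (\nabla \overline{\mathbf{u}})^T \right] \qquad (12.8)$$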

Under divergence, this will produce a $\nabla \cdot (\nabla \overline{\mathbf{u}})$ kind of term, which makes
physical sense and is numerically well behaved
Using dimensional analysis, it turns out that we need a pre-factor with the dimensions of viscosity: as for laminar flows, this will be [m^2/s] and because of its
equivalence with laminar viscosity we may call it turbulent viscosity $\nu_t$


The problem reduces to finding $\nu_t$ as a function of the solution. Looking
at dimensions, we need a length and time-scale, either postulated or calculated. On second thought, it makes more sense to use velocity scale $U$
and length-scale $\Delta$
We can think of the velocity scale as the size of $\mathbf{u}'$ and length-scale as the
size of energy-containing vortices. Thus:
$$\mathbf{R} = \nu_t \frac{1}{2} \left[ \nabla \overline{\mathbf{u}} + (\nabla \overline{\mathbf{u}})^T \right] \qquad (12.9)$$
and
$$\nu_t = A \, U \, \Delta \qquad (12.10)$$
where A is a dimensionless constant allowing us to tune the model to the
actual physical behaviour
Velocity and Length Scale
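Velocity scale is relatively easy: it represents the strength of turbulent
fluctuations. Thus, $U \approx |\mathbf{u}'|$. Additionally, it is easy to derive the equation
for turbulence kinetic energy k:
$$k = \frac{3}{2} \mathbf{u}'^2 \qquad (12.11)$$
directly from the momentum equation in the following form:
$$\frac{\partial k}{\partial t} + \nabla \cdot (\overline{\mathbf{u}} k) - \nabla \cdot [(\nu_{eff}) \nabla k] = \nu_t \left[ \frac{1}{2} (\nabla \overline{\mathbf{u}} + \nabla \overline{\mathbf{u}}^T) \right]^2 - \epsilon \qquad (12.12)$$
Here $\epsilon$ is the turbulent dissipation, which contains the length scale:
$$\epsilon = C_\epsilon \frac{k^{3/2}}{\Delta} \qquad (12.13)$$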

Zero and One-Equation Models


Zero equation model: assume local equilibrium above: $k = \epsilon$, with
no transport. The problem reduces to the specification of length-scale.
Example: Smagorinsky model
$$\nu_t = (C_S \Delta)^2 |\mathbf{S}| \qquad (12.14)$$
where $C_S$ is the Smagorinsky constant. The model is actually in active
use (!) but not in this manner: see below
One equation model: solve the k equation and use an algebraic equation
for the length scale. Example: length-scale for airfoil simulations can be
determined from the distance to the wall
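As an illustration of the zero-equation idea, the sketch below evaluates the Smagorinsky form of the turbulent viscosity from a given velocity gradient. Several choices here are assumptions not fixed by the notes: the norm is taken as |S| = sqrt(2 S:S), the default value of the constant (0.17) is only a commonly quoted figure, and the length-scale is simply passed in by the caller. Function names are hypothetical.

```python
import math


def symmetric_gradient(grad_u):
    """S = 0.5 * (grad_u + grad_u^T) for a 3x3 velocity gradient."""
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)]
            for i in range(3)]


def smagorinsky_nu_t(grad_u, delta, c_s=0.17):
    """Turbulent viscosity nu_t = (C_S * delta)^2 * |S|.

    |S| is taken as sqrt(2 S:S); this definition is an assumption.
    """
    s = symmetric_gradient(grad_u)
    double_dot = sum(s[i][j] * s[i][j] for i in range(3) for j in range(3))
    s_magnitude = math.sqrt(2.0 * double_dot)
    return (c_s * delta) ** 2 * s_magnitude


if __name__ == "__main__":
    # Simple shear: only one off-diagonal gradient component is non-zero
    shear = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print("nu_t =", smagorinsky_nu_t(shear, delta=0.1))
```

No transport equation is solved: the viscosity follows the local strain rate instantly, which is exactly the local equilibrium assumption.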


Two-Equation Model
Two-equation models are the work-horse of engineering simulations today.
Using the k equation from above, the system is closed by forming an equation for turbulent dissipation and modelling its generation and destruction
terms
Other choices also exist. For example, the Wilcox model uses eddy turnover
time as the second variable, claiming better behaviour near the wall and
easier modelling
Two-equation models are popular because they account for transport of both
the velocity and length-scale and can be tuned to return several canonical
results
Standard $k - \epsilon$ Model
This is the most popular 2-equation model, now on its way out. There
exists a number of minor variants, but the basic idea is the same
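Turbulence kinetic energy equation
$$\frac{\partial k}{\partial t} + \nabla \cdot (\overline{\mathbf{u}} k) - \nabla \cdot [(\nu_{eff}) \nabla k] = G - \epsilon \qquad (12.15)$$
where
$$G = \nu_t \left[ \frac{1}{2} (\nabla \overline{\mathbf{u}} + \nabla \overline{\mathbf{u}}^T) \right]^2 \qquad (12.16)$$
Dissipation of turbulence kinetic energy equation
$$\frac{\partial \epsilon}{\partial t} + \nabla \cdot (\overline{\mathbf{u}} \epsilon) - \nabla \cdot [(\nu_{eff}) \nabla \epsilon] = C_1 \frac{\epsilon}{k} G - C_2 \frac{\epsilon^2}{k} \qquad (12.17)$$
Turbulent viscosity
$$\nu_t = C_\mu \frac{k^2}{\epsilon} \qquad (12.18)$$
Reynolds stress
$$\mathbf{R} = \nu_t \frac{1}{2} (\nabla \overline{\mathbf{u}} + \nabla \overline{\mathbf{u}}^T) \qquad (12.19)$$
Model constants are tuned to canonical flows. Which?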
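The algebraic relations of the model can be evaluated by hand or with a few lines of code. The sketch below uses Eqs. (12.11), (12.13) and (12.18). It assumes that the fluctuation size is estimated as a fraction (turbulence intensity) of the mean velocity, which is a common engineering practice but not stated in these notes, and it uses C_mu = 0.09 as a default, the commonly quoted value. All function and parameter names are hypothetical.

```python
def turbulence_kinetic_energy(u_mean, intensity):
    """k = 3/2 u'^2, with u' estimated as intensity * u_mean (assumption)."""
    u_prime = intensity * u_mean
    return 1.5 * u_prime ** 2


def dissipation(k, length_scale, c_eps):
    """epsilon = C_eps * k^(3/2) / length_scale, as in Eq. (12.13)."""
    return c_eps * k ** 1.5 / length_scale


def turbulent_viscosity(k, epsilon, c_mu=0.09):
    """nu_t = C_mu * k^2 / epsilon, as in Eq. (12.18)."""
    return c_mu * k ** 2 / epsilon


if __name__ == "__main__":
    k = turbulence_kinetic_energy(u_mean=10.0, intensity=0.05)
    eps = dissipation(k, length_scale=0.01, c_eps=1.0)
    print("k =", k, "epsilon =", eps, "nu_t =", turbulent_viscosity(k, eps))
```

Estimates of this kind are typically used to set inlet and initial values of k and epsilon, so that the model does not start from zero turbulence.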

12.3.2 Reynolds Transport Models

Background
Transport equation for Reynolds stress R = f (u, p) is derived in a manner
similar to the derivation of the Reynolds-averaged Navier-Stokes equation.
We encounter a number of terms which are physically difficult to understand
(a pre-requisite for the modelling)
Again the most difficult term is the destruction of R, which will be handled
by solving its own equation: it is unreasonable to expect a postulated or
equilibrium length-scale to be satisfactory
Analytical form of the (scalar) turbulence destruction equation is even more
complex: in full compressible form it contains over 70 terms
The closure problem can be further extended by writing out equations for
higher moments etc. but natural closure is never achieved: the number
of new terms expands much faster than the number of equations
Modelling Reynolds Stress Equation
Briefly looking at the modelling of the R and $\epsilon$ equations, physical understanding of various terms is relatively weak and uninteresting. As a result,
terms are grouped into three categories
Generation terms
Redistribution terms
Destruction terms
Each category is then modelled as a whole
The original closure dates from the 1970s and, in spite of considerable research efforts, it has always contained problems
Currently, Reynolds transport models are used only in situations where
it is a-priori known that eddy viscosity models fail. Example: cyclone
simulations
Standard Closure
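Reynolds stress transport equation
$$\frac{\partial \mathbf{R}}{\partial t} + \nabla \cdot (\overline{\mathbf{u}} \mathbf{R}) - \nabla \cdot [(\alpha_R \nu_t + \nu_l) \nabla \mathbf{R}] = \mathbf{P} - C_1 \frac{\epsilon}{k} \mathbf{R} + \frac{2}{3} (C_1 - 1) \mathbf{I} \epsilon - C_2 \, \mathrm{dev}(\mathbf{P}) + \mathbf{W} \qquad (12.20)$$
where
P is the production term
$$\mathbf{P} = \mathbf{R} \cdot \left[ \nabla \overline{\mathbf{u}} + (\nabla \overline{\mathbf{u}})^T \right] \qquad (12.21)$$
$\nu_t$ is the turbulent viscosity
$$\nu_t = C_\mu \frac{k^2}{\epsilon} \qquad (12.22)$$
and k is the turbulent kinetic energy
$$k = \frac{1}{2} \mathrm{tr}(\mathbf{R}) \qquad (12.23)$$
W is the wall reflection term(s) and G is the (scalar) generation term
$$G = \frac{1}{2} \mathrm{tr}(\mathbf{P}) \qquad (12.24)$$
Dissipation equation: $\epsilon$ is still a scalar
$$\frac{\partial \epsilon}{\partial t} + \nabla \cdot (\overline{\mathbf{u}} \epsilon) - \nabla \cdot [(\alpha_\epsilon \nu_t + \nu_l) \nabla \epsilon] = C_1 \frac{\epsilon}{k} G - C_2 \frac{\epsilon^2}{k} \qquad (12.25)$$
P is the production term
$$\mathbf{P} = \mathbf{R} \cdot \left[ \nabla \overline{\mathbf{u}} + (\nabla \overline{\mathbf{u}})^T \right] \qquad (12.26)$$
G is the (scalar) generation term
$$G = \frac{1}{2} \mathrm{tr}(\mathbf{P}) \qquad (12.27)$$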

Comparing Reynolds Closure with Eddy Viscosity Models


Eddy viscosity implies that the Reynolds stress tensor is aligned with the
velocity gradient
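$$\mathbf{R} = \nu_t \frac{1}{2} \left[ \nabla \overline{\mathbf{u}} + (\nabla \overline{\mathbf{u}})^T \right] \qquad (12.28)$$
This would represent local equilibrium: compare with equilibrium assumptions for k and $\epsilon$ above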
In cases where the two tensors are not aligned, Reynolds closure results are
considerably better
. . . but at a considerable cost increase: more turbulence equations, more
serious coupling with the momentum equation

12.3.3 Near-Wall Effects

Turbulence Near the Wall


The principal problem of turbulence next to the wall is the inverted energy
cascade: small vortices are rolled up and ejected from the wall. Here, small
vortices create big ones, which is not accounted for in the standard modelling
approach
Presence of the wall constrains the vortices, giving them orientation: effect
on turbulent length-scales
Most seriously of all, both velocity and turbulence properties contain very
steep gradients near the wall. Boundary layers at high Re are extremely
thin. Additionally, turbulent length-scale exhibits complex behaviour: in
order for the model to work well, all of this needs to be resolved in the
simulation
Resolved Boundary Layers
Low-Re Turbulence Models are based on the idea that all details of
turbulent flow (in the mean: this is still RANS!) will be resolved
In order to achieve this, damping functions are introduced in the near-wall
region and tuned to actual (measured, DNS) near-wall behaviours
Examples of such models are: Launder-Sharma, Lam-Bremhorst $k - \epsilon$
Near-wall resolution requirements and boundary conditions depend on the
actual model, but range from $y^+ = 0.01$ to $0.1$ for the first node, with grading
away from the wall. This is a massive resolution requirement!
If the resolution requirement is not satisfied, models will typically blow up.
On stabilisation, velocity profile and wall drag will be wrong
Wall Functions
In engineering simulations, we are typically not interested in the details of
the near-wall region. Instead, we need to know the drag
This allows us to bridge the troublesome region near the wall with a coarse
mesh and replace it with an equilibrium model for attached flows: wall
functions
Wall functions bridge the problematic near-wall region, accounting for drag
increase and turbulence. A typical resolution requirement is $y^+ = 30$ to $50$,
but coarser meshes can also be used


This is a simple equilibrium model for a fully developed attached boundary
layer. It will cause loss of accuracy in non-equilibrium boundary layers, but
it will still produce a result
Wall functions split the region of operation below and above $y^+ = 11.6$ and
revert to laminar flow below it. Here, increased mesh resolution may
result in less accurate drag prediction: this is not a well-behaved model
Advanced wall functions may include effects of adverse pressure gradient
and similar but are still a very crude model
Note that wall functions are used with high-Re bulk turbulence models,
reducing the need for high resolution next to the wall
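The switch at y+ = 11.6 can be understood as the intersection of a linear (laminar) profile with a logarithmic profile. The sketch below assumes the standard forms u+ = y+ and u+ = ln(E y+)/kappa, which are not written out in these notes. The constants kappa = 0.4 and E = 9.0 are illustrative values that place the intersection close to 11.6; other published constant pairs shift it slightly. Function names are hypothetical.

```python
import math


def log_law_u_plus(y_plus, kappa=0.4, e_const=9.0):
    """Logarithmic profile u+ = ln(E * y+) / kappa (assumed standard form)."""
    return math.log(e_const * y_plus) / kappa


def laminar_switch_y_plus(kappa=0.4, e_const=9.0, iterations=50):
    """Intersection of u+ = y+ with the log law, by fixed-point iteration."""
    y_plus = 11.0
    for _ in range(iterations):
        y_plus = math.log(e_const * y_plus) / kappa
    return y_plus


def wall_function_u_plus(y_plus, kappa=0.4, e_const=9.0):
    """Two-layer profile: laminar below the switch, log law above it."""
    if y_plus < laminar_switch_y_plus(kappa, e_const):
        return y_plus
    return log_law_u_plus(y_plus, kappa, e_const)


if __name__ == "__main__":
    print("switch at y+ =", round(laminar_switch_y_plus(), 2))
    for y_plus in (1.0, 5.0, 30.0, 100.0):
        print(y_plus, wall_function_u_plus(y_plus))
```

The abrupt change of profile at the switch is one reason why refining the mesh so that the first node moves from the logarithmic into the laminar branch can change the predicted drag.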
What Can a Low-Re Model Do For Me?
With decreasing Re number, turbulence energy spectrum loses its inertial
range and regularity: energy is not moved smoothly from larger scales to
smaller; importance of dissipation spreads to lower wavenumber
Low-Re models are aimed at capturing the details of the near-wall flow,
characterised by lower Re
However, near-wall turbulence is nothing like low-Re bulk flow: this is to do
with the presence and effect of the wall, not the loss of turbulence structure
A low-Re turbulence model is not appropriate for low-Re flows away from
the wall: the results will be wrong!

12.3.4 Transient RANS Simulations

Concept of Transient RANS


RANS equations are derived by separating the variable into the mean and
fluctuation around it. In simple situations, this implies a well-defined meaning: mean is (well,) mean implying time-independence and the fluctuation
carries the transient component
In many physical simulations, having a time-independent mean makes no
sense: consider a flow simulation in an internal combustion engine. Here, we
will change the mean into an ensemble average (over a number of identical
experiments) and allow the mean to be time-dependent
In other cases, the difference between the mean and fluctuation may become
even more complex: consider a vortex shedding behind a cylinder at high
Re, where large shed vortices break up into turbulence further downstream


Idea of RANS here is recovered through separation of scales, where large
scales are included in the time-dependence of the mean and turbulence is
modelled as before. It is postulated that there exists separation of scales
between the mean (= coherent structures) and turbulence
Using Transient RANS
Transient RANS is a great step forward in the fidelity of modelling. Consider a flow behind an automobile, with counter-rotating vortices in the
wake and various other unsteady effects. Treating it as steady implies
excessive damping, typically done through first-order numerics because the
simulation does not naturally converge to steady-state
Simulations can still be 2-D where appropriate and the answer is typically
analysed in terms of a mean and coherent structure behaviour
RANS equations are assembled as before, using a transient Navier-Stokes
simulation. Usually, no averaging is involved
Transitional Flows
Phenomena of transition are extremely difficult to model: as shown before,
a low-Re turbulence model would be a particularly bad choice
The flow consists of a mixture of laminar pockets and various levels of
turbulence, with laminar-to-turbulent transition within it
Apart from the fact that a low-Re flow is difficult to model in RANS,
an additional problem stems from the fact that $k = \epsilon = 0$ is a solution to
the model: thus if no initial or boundary turbulence is given, transition will
not take place
Introducing intermittency equation: to handle this, a RANS model is augmented by an equation marking the presence of turbulence
Transition models are hardly available: basically, a set of correlations is
packed as transport equations. Details of proper boundary conditions,
posedness of the model, user-controlled parameters and model limitations
are badly understood. A better approach is needed!

12.4 Large Eddy Simulation

Deriving LES Equations


Idea of LES comes from the fact that large-scale turbulence strongly depends on the mean, geometry and boundary conditions, making it case-dependent and difficult to model. Small-scale turbulence is close to homogeneous and isotropic, its main role is energy removal from the system,
it is almost universal (Re dependence) and generally not of interest
Mesh resolution requirements are imposed by the small scales, which are
not of interest anyway
In LES we shall therefore simulate the coherent structures and large-scale turbulence and model small-scale effects
For this purpose, we need to make the equations understand scale, using equation filtering: a variable is decomposed into large scales which
are solved for and modelled small scales. To help with the modelling, we
wish to capture a part of the inertial range and model the (universal) high
wavenumber part of the spectrum
Unlike transient RANS, a LES simulation still captures a part of turbulence
dynamics: a simulation must be 3-D and transient, with the results obtained
by averaging
Filtered Navier-Stokes Equations
Equation filtering is mathematically defined as:
$$\overline{\mathbf{u}} = \int G(\mathbf{x}, \mathbf{x}') \, \mathbf{u}(\mathbf{x}') \, d\mathbf{x}', \qquad (12.29)$$
where $G(\mathbf{x}, \mathbf{x}')$ is the localised filter function
Various forms of the filter functions can be used: local Gaussian distribution, top-hat etc. with minimal differences. The important principle is
localisation
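After filtering, the equation set looks very similar to RANS, but the meaning
is considerably different
$$\frac{\partial \overline{\mathbf{u}}}{\partial t} + \nabla \cdot (\overline{\mathbf{u}}\,\overline{\mathbf{u}}) - \nabla \cdot (\nu \nabla \overline{\mathbf{u}}) = -\nabla \overline{p} + \nabla \cdot \tau \qquad (12.30)$$
$$\nabla \cdot \overline{\mathbf{u}} = 0 \qquad (12.31)$$
with
$$\tau = (\overline{\overline{\mathbf{u}}\,\overline{\mathbf{u}}} - \overline{\mathbf{u}}\,\overline{\mathbf{u}}) + (\overline{\overline{\mathbf{u}}\,\mathbf{u}'} + \overline{\mathbf{u}'\,\overline{\mathbf{u}}}) + \overline{\mathbf{u}'\mathbf{u}'} = \mathbf{L} + \mathbf{C} + \mathbf{B} \qquad (12.32)$$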

The first term, L is called the Leonard stress. It represents the interaction
between two resolved scale eddies to produce small scale turbulence
The second term, C (cross term), contains the interaction between resolved
and small scale eddies. It can transfer energy in either direction but on
average follows the energy cascade
The third term represents interaction between two small eddies to create
a resolved eddy. B (backscatter) represents energy transfer from small to
large scales
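The filtering operation of Eq. (12.29) can be illustrated in one dimension with a discrete top-hat filter. This is a minimal sketch, assuming a uniform periodic mesh and an equally weighted stencil. The function names are hypothetical, and a real LES code with implicit filtering would not apply such an explicit operation to the solution.

```python
def top_hat_filter(u, half_width):
    """Discrete top-hat filter on a periodic 1-D field.

    Each filtered value is the plain average over 2*half_width + 1
    neighbouring points, i.e. a localised filter function G.
    """
    n = len(u)
    width = 2 * half_width + 1
    return [
        sum(u[(i + j) % n] for j in range(-half_width, half_width + 1)) / width
        for i in range(n)
    ]


def sub_grid_part(u, half_width):
    """Unresolved part u' = u - filtered(u)."""
    filtered = top_hat_filter(u, half_width)
    return [value - mean for value, mean in zip(u, filtered)]


if __name__ == "__main__":
    # Smooth large scale plus a point-to-point oscillation (smallest scale)
    field = [1.0 + (0.2 if i % 2 == 0 else -0.2) for i in range(8)]
    print("filtered:", top_hat_filter(field, 1))
    print("sub-grid:", sub_grid_part(field, 1))
```

The filter leaves scales much larger than its width almost untouched and strongly damps scales comparable to the width, which is the separation into resolved and modelled parts described above.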
Sub-Grid Scale (SGS) Modelling
The scene in LES has been set to ensure that simple turbulence models work
well: small-scale turbulence is close to homogeneous and isotropic
The length-scale is related to the separation between resolved and unresolved scales: therefore, it is related to the filter width
In LES, implicit filtering is used: separation between resolved and unresolved scales depends on mesh resolution. Filter size is therefore calculated
as a measure of mesh resolution and results are interpreted accordingly
Typical models in use are of Smagorinsky model type, with the fixed or
dynamic coefficients. In most models, all three terms are handled together
Advanced models introduce some transport effects by solving a sub-grid k-equation, use double filtering to find out more about sub-grid scale or create
a structural picture of sub-grid turbulence from resolved scales
Amazingly, most models work very well: it is only important to remove the
correct amount of energy from resolved scales
LES Inlet and Boundary Conditions
In the past, research on LES has centred on sub-grid scale modelling and
the problem can be considered to be resolved
Two problematic areas in LES are the inlet conditions and near-wall treatment
Modelling near-wall turbulence A basic assumption of LES is energy
transfer from large towards smaller scales, with the bulk of dissipation taking place in small vortices. Near the wall, the situation is reversed: small
vortices and streaks are rolled up on the wall and ejected into the bulk


Reversed direction of the energy cascade violates the modelling paradigm.
In principle, the near-wall region should be resolved in full detail, with
massive resolution requirements
A number of modelling approaches to overcome the problem exist:
structural SGS models (guessing the sub-grid scale flow structure),
dynamic SGS models, approaches inspired by the wall function treatment and Detached Eddy Simulation
Inlet boundary condition. On inlet boundaries, flow conditions are typically known in the mean, or (if we are lucky) with $\mathbf{u}'$ and turbulence length-scale. An important property of turbulence is the energy cascade: correlation between various scales and vortex structures. The inlet condition
should contain the real turbulence interaction and it is not immediately
clear how to do this
Energy-Conserving Numerics
For accurate LES simulation, it is critical to correctly predict the amount
of energy removal from resolved scale into sub-grid. This is the role of a
SGS model
In order for the SGS model to perform its job, it is critical that the rest
of implementation does not introduce dissipative errors: we need energy-conserving numerics
Errors introduced by spatial and temporal discretisation must not interfere
with the modelling
In short, good RANS numerics is not necessarily sufficient for LES simulations. In RANS, a desire for steady-state and performance of RANS models
masks poor numerics; in LES this is clearly not the case
Averaging and Post-Processing
Understanding LES results is different from looking at steady or transient
RANS: we have at our disposal a combination of instantaneous fields and averaged results
Resolved LES fields contain a combination of mean (in the RANS sense)
and large-scale turbulence. Therefore, it is extremely useful in studying the
details of flow structure
The length of simulation, number of averaging steps etc. is studied in terms
of converging averages: for statistically steady simulations, averages
must converge!


A good LES code will provide a set of on-the-fly averaging tools to assemble
data of interest during the run
Flow instability and actual vortex dynamics will be more visible in the
instantaneous field
Data post-processing
It is no longer trivial to look and understand the LES results, especially
in terms of vortex interaction: we typically use special derived fields,
e.g. enstrophy (magnitude of curl of velocity), invariants of the strain
tensor etc.
Looking at LES results takes some experience and patience: data sets
will be very large
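On-the-fly averaging and the check for converging averages mentioned above can be sketched as follows. This is a minimal example, assuming equally weighted samples of a single probe value. The convergence test (spread of the running mean over the last few steps) is only one possible criterion, and all names are hypothetical.

```python
def running_means(samples):
    """Running average updated one sample at a time, without storing sums."""
    means = []
    mean = 0.0
    for count, value in enumerate(samples, start=1):
        mean += (value - mean) / count
        means.append(mean)
    return means


def is_converged(means, window, tolerance):
    """True if the running mean varied by less than tolerance
    over the last `window` steps."""
    if len(means) < window:
        return False
    recent = means[-window:]
    return max(recent) - min(recent) < tolerance


if __name__ == "__main__":
    signal = [1.0, 3.0] * 500  # statistically steady, fluctuating signal
    means = running_means(signal)
    print("final mean:", means[-1])
    print("converged:", is_converged(means, window=10, tolerance=0.01))
```

The incremental update needs only the current mean and the sample count, which is why such averages can be assembled during the run without storing the full time history.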

12.5 Choosing a Turbulence Model

Background
There exists a wide range of turbulence models in various approaches to the
problem. The role of a good engineer is to choose the best for the problem at
hand
Important factors are the goal of simulation, available computer resources
and required accuracy
In what follows, we will give a short overview of traditional choices

12.5.1 Turbulence Models in Airfoil Simulations

Single and Multiple Airfoils


Simulations typically done in steady-state and 2-D
Objective of simulation is mainly lift/drag and stall characteristics
This automatically implies 2-D steady-state RANS. Moreover, region of
interest is close to the surface of the airfoil; the bulk flow is simple
Presence of the wall allows for simple prescription of length-scale


New Challenges
Laminar-to-turbulent transition occurs along the airfoil; in multiple airfoil
configuration, upstream components trigger transition downstream
In order to handle transition, new models are being developed (currently:
useless!)
Problematic region is also found around the trailing edge: flow detachment
LES is prohibitively expensive: from steady-state 2-D RANS to unsteady
3-D with averaging
Choice of Models
Zero-equation and one-equation turbulence models for aeronautics
Baldwin-Lomax model, Cebeci-Smith are the usual choices. Spalart-Allmaras
model represents the new generation and across all models the performance is very good
This is a very popular set of cases for low-Re RANS models
2-equation models are also used regularly. A very popular model is the
$k - \omega$ because of its performance close to the wall

12.5.2 Turbulence Models in Bluff-Body Aerodynamics

Background
Bluff body flows (e.g. complete aircraft, automobile, submarine) are considerably more complex, both in the structure of boundary layers and in
the wake
Abandoning local equilibrium: transport of turbulence and length-scale
A standard choice of model would be 2-equation RANS with wall functions.
Currently moving to transient RANS
Choice of Models
$k - \epsilon$ model and its variants; $k - \omega$ model represent the normal industrial
choice. There are still issues with mesh resolution for full car/aeroplane
aerodynamics: meshes for steady RANS with wall functions can be of the
order of 100 million cells and larger


Low-Re formulations for wall-bounded flows are not popular: excessive mesh
resolution for realistic geometric shapes
Study of instabilities and aero-acoustic effects is moving steadily to LES.
Typically, only a part of the geometry is modelled and coupled to the global
(RANS) model for boundary conditions. Examples: bomb bay in aeroplanes
or wing mirrors in automobiles

12.6 Future of Turbulence Modelling in Industrial Applications

Future Trends
Future trends are quite clear: moving from the RANS modelling to LES on
a case-by-case basis and depending on problems with current models and
available computer resources
RANS is recognised as insufficient in principle because of the decomposition
into mean and fluctuation. Also, models are too diffusive to capture detailed
flow dynamics. Research in RANS is scaled down to industrial support;
everything else is moving to LES
Transient RANS is a stop-gap solution until LES becomes available at reasonable cost
DNS remains out of reach for all engineering use, but provides a very good
base for model development and testing


Chapter 13
Large-Scale Computations

13.1 Background

In this chapter, a computing background for CFD simulations in engineering will
be examined. In 1965, Gordon Moore, Director of Fairchild Semiconductor's Research and Development Laboratories, wrote an article on the future development
of the semiconductor industry, which gave rise to the statement that computing power at fixed cost
doubles every 18 months.
Increasing computer power is the driving force behind the expansion of numerical simulation tools. Every new level of performance brings a possibility of
tackling new problems, using more advanced models or achieving higher simulation fidelity.
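The consequence of a fixed doubling time is easy to quantify. The short sketch below simply evaluates 2^(t/T) for the 18-month doubling period quoted above; the function name is hypothetical and the result is an idealised growth factor, not a measured hardware trend.

```python
def performance_growth(years, doubling_time_years=1.5):
    """Growth factor of computing power at fixed cost after `years`,
    assuming a constant doubling time (18 months by default)."""
    return 2.0 ** (years / doubling_time_years)


if __name__ == "__main__":
    for years in (3, 6, 15, 30):
        print(f"{years:2d} years: factor {performance_growth(years):.0f}")
```

Under this assumption, fifteen years correspond to ten doublings, i.e. a factor of about one thousand in available computing power at the same cost.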

13.1.1 Computer Power in Engineering Applications

Background
CFD simulations are among the largest users of CPU time in the world.
Even for a relative novice, it is easy to devise and set up a very large
simulation that would yield relevant results
Other computational fields with similar level of requirements include
Numerical weather forecasting. Currently at the level of first-order
models and correlations tuned to the mesh size. Large facilities and
efforts at the UK Met Office and in Japan
Computational chemistry: detailed atom-level study of chemical reactions from first principles
Global climate modelling. This includes ocean and atmosphere models, vapour in atmosphere and polar ice caps effects. Example: global
climate model facility (Earth Simulator)


Direct numerical simulation of turbulence, mainly as replacement for
experimental studies
In all cases, the point is how to achieve maximum with the available computing resources rather than how to perform the largest simulation. A small
simulation with equivalent speed, accuracy etc. is preferred
Simulation Time
Typical simulation time depends on available resources, object of simulation and required accuracy. Recently, the issue of optimal use of computer
resources comes into play: running a trivial simulation on a supercomputer
is not fair game
Example: parametric studies, optimisation and robust design in engineering. Here, the point is to achieve optimal performance of engineering equipment by in-depth analysis. Optimisation algorithms will perform hundreds
of related simulations with subtle changes in geometrical and flow setup
details in order to achieve multi-objective optimum. Each simulation on its
own can be manageable, but we need several hundreds!
In many cases, the limiting factor is not feasibility, but time to market:
a Formula 1 car must be ready for the next race (or next season)
Reducing simulation time:
4. Algorithmic improvements: faster, more accurate numerics, timestepping algorithms
3. Linear solver speed. Numerical solution of large systems of algebraic equations is still under development. Having in mind that a
good solver spends 50-80 % of solution time inverting matrices, this is
a very important research area. Interaction with computer hardware
(how does the solver fit onto a supercomputer to use it to the best of
its abilities) is critical
2. Physical modelling. A typical role of a model is to describe complex
physics of small scales in a manner which is easier to simulate. Better
models provide sufficient accuracy for available resource
1. User expertise. The best way of reducing simulation time is an
experienced user. Physical understanding of the problem, modelling,
properties of numerics and required accuracy allows the user to optimally allocate computer resources. Conversely, there is no better
way of producing useless results or wasting computer resources than
applying numerical tools without understanding.


Scope
Our objective is to examine the architecture requirements, performance and
limitations of large-scale CFD simulations today
There is no need to understand the details of high performance programming or parallel communications algorithms: we wish to know what parallelism means, how to use it and how it affects solver infrastructure
Crucially, we will examine the mode of operation of parallel computers, the
choice of algorithms and their tuning
The first step is classification of high-performance computer platforms in
use today

13.2 Classification of Computer Platforms

High Performance Computers


Basic classification of high performance architecture depends on how instructions and data are handled in the computer (Flynn, 1972). Thus:
SISD: single instruction, single data
SIMD: single instruction, multiple data
MISD: multiple instruction, single data
MIMD: multiple instruction, multiple data
The above covers all possibilities. SISD is no longer considered high performance. In short, SISD is a very basic processing unit (a toaster?)
We shall concentrate on SIMD, also called a vector computer and MIMD,
known as a parallel computer. MISD is sometimes termed pipelining
and is considered a hardware optimisation rather than a programming
technique
Vector Computers
The computationally intensive part of CFD algorithms involves performing
identical operations on large sets of data. Example: calculation of face
values from cell centres for gradient calculation in cell-centred FVM:
$$\phi_f = f_x \phi_P + (1 - f_x) \phi_N \qquad (13.1)$$
$\phi_P$ and $\phi_N$ belong to the same array over all cells. The result, $\phi_f$, belongs to
an array over all faces. Subscripts P, N and f will be cell and face indices:


// phiCell holds cell-centred values, phiFace the interpolated face values
const labelList& owner = mesh.owner();
const labelList& neighbour = mesh.neighbour();
const scalarField& fx = mesh.weights();
for (label i = 0; i < phiFace.size(); i++)
{
    phiFace[i] =
        fx[i]*phiCell[owner[i]]
      + (1 - fx[i])*phiCell[neighbour[i]];
}

Performing an operation like this consists of several parts:
(Splitting up the operation into bits managed by the floating point
unit)
Setting up the instruction registers, e.g. a = b + c * d
Fetching the data (memory, primary cache, secondary cache, registers)
Performing the operation
In vector computers, the idea is that performing the same operation over a
large set can be made faster: create special hardware with lots of identical
(floating point) units under unified control
1. Set up instruction registers. This is done only once for the complete
data set
2. Assume the data is located in a contiguous memory space. Fetching
the start of the list grabs the whole list
3. Perform the operation on a large data set simultaneously
More Vector Computers
The number of joint units is called the vector length. It specifies how many
operations can be performed together. Typical sizes would be 256 or 1024:
potentially very fast!
Some care is required in programming. Examples:
Do-if structure
for (label i = 0; i < phiFace.size(); i++)
{
    if (fx[i] < 0.33)
    {
        phiFace[i] =
            0.5*(phiCell[owner[i]] + phiCell[neighbour[i]]);
    }
    else
    {
        phiFace[i] =
            fx[i]*phiCell[owner[i]]
          + (1 - fx[i])*phiCell[neighbour[i]];
    }
}

This kills performance: a decision is required at each index. Reorganise
to execute the complete loop twice and then combine the results. Minimum
performance loss: 50%!
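The reorganisation can be sketched as follows. This is a minimal illustration only: plain std::vector stands in for the mesh and field classes used above, and the function name interpolateBranchFree and its threshold argument are assumptions made for this example. Both candidate values are evaluated for every face (here fused into a single pass rather than two separate loops) and combined with a 0/1 mask, so the loop body contains no if/else.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Branch-free face interpolation (hypothetical helper, not part of any library).
// Both variants are evaluated for every face and blended with a 0/1 mask.
std::vector<double> interpolateBranchFree
(
    const std::vector<double>& phiCell,
    const std::vector<std::size_t>& owner,
    const std::vector<std::size_t>& neighbour,
    const std::vector<double>& fx,
    const double threshold
)
{
    assert(owner.size() == fx.size() && neighbour.size() == fx.size());

    std::vector<double> phiFace(fx.size());

    for (std::size_t i = 0; i < phiFace.size(); i++)
    {
        const double phiP = phiCell[owner[i]];
        const double phiN = phiCell[neighbour[i]];

        // Variant 1: arithmetic mean; variant 2: weighted interpolation
        const double central = 0.5*(phiP + phiN);
        const double weighted = fx[i]*phiP + (1 - fx[i])*phiN;

        // mask is 1 where the original code took the "if" branch, else 0
        const double mask = static_cast<double>(fx[i] < threshold);

        phiFace[i] = mask*central + (1 - mask)*weighted;
    }

    return phiFace;
}
```

Every face now costs both evaluations, which is the performance loss quoted above, but each index executes the same instruction sequence.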
Data dependency
for (label i = 0; i < phiFace.size(); i++)
{
    phiCell[i] -= fx[i]*phiCell[owner[i]];
}

Values of phiCell depend on each other: if this happens within a
single vector length, we have a serious problem!
Today, vector computers are considered very 1970-s. The principle works,
but loss of performance due to poor programming or compiler problems is
massive
Compilers and hardware are custom-built: they cannot use off-the-shelf components, making the computers very expensive indeed
However, the lesson on vectorisation is critical for understanding high-performance computing. Modern CPU-s will automatically and internally
attempt to configure themselves as vector machines (with a vector length
of 10-20, for example). If the code is written vector-safe and the compiler
is good, there will be a substantial jump in performance
There is a chance that vector machines will make a come-back: the principle
of operation is sound but we need to make sure things are done more cleverly
and automatically
Parallel Computers
Recognising that vector computers perform their magic by doing many
operations simultaneously, we can attempt something similar: can a room
full of individual CPU-s be made to work together as a single large machine?


The idea of massive parallelism is that a large loop (e.g. the cell-face loop above)
could be executed much faster if it is split into bits and each part is given
to a separate CPU unit to execute. Since all operations are the same, there
is formally no problem in doing the decomposition
Taking a step back, we may generalise:
A complete simulation can be split into separate bits, where each
bit is given to a separate computer. Solution of separate problems
is then algorithmically coupled together to create a solution of the
complete problem.
Parallel Computer Architecture
Similar to high-performance architecture, parallel computers differ in how
each node (CPU) can see and access data (memory) on other nodes. The
basic types are:
Shared memory machines, where a single node can see the complete memory (also called addressing space) with no cost overhead
Distributed memory machines, where each node represents a self-contained unit, with local CPU, memory and disk storage. Communication with other nodes involves network access and is associated
with considerable overhead compared to local memory access
In reality, even shared memory machines have variable access speed and
special architecture: other approaches do not scale well to 1000s of nodes.
Example: CC-NUMA (Cache Coherent Non-Uniform Memory Access)
For distributed memory machines, a single node can be an off-the-shelf
PC or a server node. Individual components are very cheap, the approach
scales well and is limited by the speed of (network) communication. This
is the cheapest way of creating extreme computing power from standard
components at very low price
Truly massively parallel supercomputers are an architectural mixture of
local quasi-shared memory and fast-networked distributed memory nodes.
Writing software for such machines is a completely new challenge
Coarse- and Fine-Grain Parallelisation
We can approach the problem of parallelism at two levels

In coarse-grain parallelisation, the simulation is split into a number
of parts and their inter-dependence is handled algorithmically. The main
property of coarse-grain parallelisation is algorithmic impact. The solution algorithm itself needs to account for multiple domains and parallel support must be programmed explicitly
Fine-grain parallelisation operates on a loop-by-loop level. Here,
each loop is analysed in terms of data and dependency and where appropriate it may be split among various processors. Fine-grain action
can be performed by the compiler, especially if the communications
impact is limited (e.g. shared memory computers)
In CFD, this usually involves domain decomposition: the computational
domain (mesh) is split into several parts (one for each processor). This
corresponds to coarse-grain parallelisation
[Figure: decomposition of the global domain into Subdomains 1 to 4]

While fine-grain parallelisation sounds interesting, the current generation of
compilers is not sufficiently clever for complex parallelisation jobs. Examples include algorithmic changes in linear equation solvers to balance
local work with communications: this action cannot be performed by the
compiler
Build Your Own Supercomputer
In the age of commodity computing, price of individual components is
falling: processors, memory chips, motherboards, networking components
and hard disks are commodity components
A distributed memory computer can be built up to medium size (dozens
of compute nodes) without concern: the balance of computing power and
communication speed is acceptable. Such machines are usually called Beowulf clusters
As a result, parallel machines have become immensely popular and used
regularly even for medium-size simulations

13.3 Domain Decomposition Approach

In this section we will review the impact of parallel domain decomposition on various parts of the algorithm. A starting point is a computational mesh decomposed
into a number of sub-domains.

13.3.1 Components

Functionality
In order to perform a parallel FVM simulation, the following steps are
performed:
Computational domain is split up into meshes, each associated with a
single processor. This consists of 2 parts:
Allocation of cells to processors
Physical decomposition of the mesh in the native solver format
Optimisation of communications is important: it scales with the surface of inter-processor interfaces and the number of connections. Both
should be minimised. This step is termed domain decomposition
A mechanism for data transfer between processors needs to be devised.
Ideally, this should be done in a generic manner, to facilitate porting
of the solver between various parallel platforms: a standard interface
to a communications package
Solution algorithm needs to be analysed to establish the inter-dependence
and points of synchronisation
Additionally, we need a handling system for a distributed data set, simulation start-up and shut-down and data analysis tools
Keep in mind that during a single run we may wish to change the number
of available CPUs and may wish to resume or perform data analysis on a
single node
Parallel Communication Protocols
Today, Message Passing Interface (MPI) is a de-facto standard (http://www-unix.mcs.anl.gov/mpi/). A programmer does not write custom communications routines. The standard is open and has several public domain
implementations
On large or specialist machines, the hardware vendor will re-implement or tune
the message passing protocol to the machine, but the programming interface
is fixed


Modes of communication
Pairwise data exchange, where processors communicate to each other
in pairs
Global synchronisation points: e.g. global sum. Typically executed as
a tree-structured gather-scatter operation
Communication time is influenced by 2 components
Latency, or a time interval required to establish a communication
channel
Bandwidth, or the amount of data per second that can be transferred
by the system
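As a rough illustration, the two components are often combined into a simple linear cost model: time per message = latency + message size / bandwidth. The sketch below assumes this model; the function name messageTime and any numbers used with it are illustrative assumptions, not measurements of a real network.

```cpp
#include <cassert>

// Linear model of the time needed to send one message (an assumption
// made for illustration): a fixed start-up cost plus a transfer cost.
// latency in seconds, bandwidth in bytes per second, messageSize in bytes.
double messageTime
(
    const double latency,
    const double bandwidth,
    const double messageSize
)
{
    assert(bandwidth > 0);

    return latency + messageSize/bandwidth;
}
```

Under this model, sending 1000 messages of 8 bytes pays the latency 1000 times, while a single 8000-byte message pays it once: grouping data into fewer, larger messages is usually preferable.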
Mesh Partitioning Tools
The role of a mesh partitioner is to allocate each computational point (cell)
to a CPU. In doing so, we need to account for:
Load balance: all processing units should have approximately the same
amount of work between communication and synchronisation points
Minimum communication, relative to local work. Performing local
computations is orders of magnitude faster than communicating the
data
Achieving the above is not trivial, especially if the computing load varies
during the calculation
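The two criteria can be made concrete with a small sketch that measures the quality of a given cell-to-processor allocation. It is not a partitioner: it only counts cells per processor and the faces cut by the decomposition. The struct and function names are hypothetical, and the cell count is used as a stand-in for the amount of work, which is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Quality measures for an allocation of cells to processors (hypothetical).
struct PartitionQuality
{
    double loadImbalance;         // largest cell count / average cell count
    std::size_t nInterfaceFaces;  // faces whose two cells sit on different processors
};

// cellProc[c] is the processor of cell c; owner/neighbour list the two
// cells sharing each internal face of the undecomposed mesh.
PartitionQuality assessPartition
(
    const std::vector<std::size_t>& cellProc,
    const std::vector<std::size_t>& owner,
    const std::vector<std::size_t>& neighbour,
    const std::size_t nProcs
)
{
    assert(nProcs > 0 && !cellProc.empty());
    assert(owner.size() == neighbour.size());

    // Load balance: count cells per processor
    std::vector<std::size_t> nCells(nProcs, 0);
    for (const std::size_t p : cellProc)
    {
        assert(p < nProcs);
        nCells[p]++;
    }

    const double average = static_cast<double>(cellProc.size())/nProcs;
    const double largest =
        static_cast<double>(*std::max_element(nCells.begin(), nCells.end()));

    // Communication: count faces cut by the decomposition
    std::size_t nInterface = 0;
    for (std::size_t i = 0; i < owner.size(); i++)
    {
        if (cellProc[owner[i]] != cellProc[neighbour[i]])
        {
            nInterface++;
        }
    }

    return {largest/average, nInterface};
}
```

For a chain of four cells on two processors, the allocation 0,0,1,1 cuts one face while 0,1,0,1 cuts three, with the same load balance: the first needs less communication for the same local work.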
Handling Parallel Computations and Data Sets
The purpose of parallel machines is to massively scale up computational
facilities. As a result, the amount of data handled and preparation work is
not trivial
Parallel post-processing is a requirement. Regularly, the only machine
capable of handling simulation data is the one on which the computation has
been performed. For efficient data analysis, all post-processing operations
also need to be performed in parallel and presented to the user in a single
display or under a single heading: parallelisation is required beyond the
solver
On truly large cases, mesh generation is also an issue: it is impossible to
build a complete geometry as a single model. Parallel mesh generation
is still under development

13.3.2 Parallel Algorithms

Finally, let us consider parallelisation of three components of a CFD algorithm
for illustration purposes.
Mesh Support
For purposes of algorithmic analysis, we shall recognise that each cell belongs to one and only one processor
Mesh faces can be grouped as follows
Internal faces, within a single processor mesh
Boundary faces
Inter-processor boundary faces: faces that used to be internal but are now
separate and represented on 2 CPUs. No face may belong to more
than 2 sub-domains
Algorithmically, there is no change for internal and boundary faces. This
is the source of parallel speed-up. Our challenge is to repeat the operations
for faces on inter-processor boundaries
Gradient Calculation
Using Gauss' theorem, we need to evaluate face values of the variable. For
internal faces, this is done through interpolation:
$$\phi_f = f_x \phi_P + (1 - f_x) \phi_N \qquad (13.2)$$
Once calculated, the face value may be re-used until the cell-centred values change
In parallel, $\phi_P$ and $\phi_N$ live on different processors. Assuming $\phi_P$ is local,
$\phi_N$ can be fetched through communication: this is a once-per-solution cost
and is obtained by pairwise communication
Note that all processors perform identical duties: thus, for a processor
boundary between domain A and B, evaluation of face values can be done
in 3 steps:
1. Collect internal cell values from local domain and send to neighbouring
processor
2. Receive neighbour values from neighbouring processor
3. Evaluate local face value using interpolation
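The three steps above can be sketched without a real communication library. In the illustration below, the ProcessorPatch struct and both function names are hypothetical, and the exchange of buffers between the two sides is done by the caller in memory; in a real solver steps 1 and 2 would be a pairwise send and receive through the message passing interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One side of a processor boundary (hypothetical data layout).
struct ProcessorPatch
{
    std::vector<std::size_t> faceCells;  // local cell next to each interface face
    std::vector<double> localWeight;     // interpolation weight of the local cell
};

// Step 1: collect internal cell values next to the interface (send buffer)
std::vector<double> collectSendBuffer
(
    const ProcessorPatch& patch,
    const std::vector<double>& phiCell
)
{
    std::vector<double> buffer(patch.faceCells.size());
    for (std::size_t i = 0; i < buffer.size(); i++)
    {
        buffer[i] = phiCell[patch.faceCells[i]];
    }
    return buffer;
}

// Step 3: interpolate using local values and the values received
// from the neighbouring processor (step 2 is the exchange itself)
std::vector<double> evaluateInterfaceFaces
(
    const ProcessorPatch& patch,
    const std::vector<double>& phiCell,
    const std::vector<double>& received
)
{
    assert(received.size() == patch.faceCells.size());

    std::vector<double> phiFace(received.size());
    for (std::size_t i = 0; i < phiFace.size(); i++)
    {
        const double w = patch.localWeight[i];
        phiFace[i] = w*phiCell[patch.faceCells[i]] + (1 - w)*received[i];
    }
    return phiFace;
}
```

Domain A passes its send buffer to domain B as the received buffer and vice versa. If the local weight on side A is f_x, side B must use 1 - f_x so that both sides arrive at the same face value.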


Discretisation Routines: Matrix Assembly


Similar to gradient calculation above, assembly of matrix coefficients on
parallel boundaries can be done using simple pairwise communication
In order to assemble the coefficient, we need geometrical information and
some interpolated data: all readily available, maybe with some communication
Example: off-diagonal coefficient of a Laplace operator
$$a_N = |\mathbf{s}_f| \frac{\gamma_f}{|\mathbf{d}_f|} \qquad (13.3)$$
where $\gamma_f$ is the interpolated diffusion coefficient (see above). In actual
implementation, geometry is calculated locally and interpolation factors
are cached to minimise communication
Discretisation of a convection term is similarly simple
Note: it is critical that both sides of a parallel interface calculate the identical coefficient. If consistency is not ensured, simulation will fail
Sources, sinks and temporal schemes all remain unchanged: each cell belongs to only one processor
Linear Equation Solvers
Major impact of parallelism in linear equation solvers is in choice of algorithm. For example, direct solver technology does not parallelise well, and
is typically not used in parallel. Only algorithms that can operate on a
fixed local matrix slice created by local discretisation will give acceptable
performance
In terms of code organisation, each sub-domain creates its own numbering
space: locally, equation numbering always starts with zero and one cannot
rely on global numbering: it breaks parallel efficiency
With this in mind, coefficients related to parallel interfaces need to be kept
separate and multiplied through in a separate matrix update
Impact of parallel boundaries will be seen in:
Every matrix-vector multiplication operation
Every Gauss-Seidel or similar smoothing sweep
. . . but nowhere else!
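A matrix-vector multiplication on one sub-domain can be sketched as follows. The SubDomainMatrix layout and the function name multiply are assumptions made for this illustration; a symmetric matrix stored by face is assumed for brevity. The local part is the same as in a serial code, and the parallel interface coefficients are applied in a separate update using neighbour values that have already been received.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Matrix slice owned by one sub-domain (hypothetical layout).
// Cells are numbered locally, starting from zero.
struct SubDomainMatrix
{
    std::vector<double> diag;                 // one coefficient per local cell
    std::vector<std::size_t> owner;           // internal faces: first cell
    std::vector<std::size_t> neighbour;       // internal faces: second cell
    std::vector<double> offDiag;              // internal faces: symmetric coefficient
    std::vector<std::size_t> interfaceCells;  // local cell at each interface face
    std::vector<double> interfaceCoeffs;      // coefficient at each interface face
};

// y = A x on one sub-domain. xNeighbour holds the values received from
// the neighbouring processor, one per interface face.
std::vector<double> multiply
(
    const SubDomainMatrix& A,
    const std::vector<double>& x,
    const std::vector<double>& xNeighbour
)
{
    assert(x.size() == A.diag.size());
    assert(xNeighbour.size() == A.interfaceCells.size());

    std::vector<double> y(x.size());

    // Local part: identical to the serial code
    for (std::size_t c = 0; c < y.size(); c++)
    {
        y[c] = A.diag[c]*x[c];
    }
    for (std::size_t f = 0; f < A.offDiag.size(); f++)
    {
        y[A.owner[f]] += A.offDiag[f]*x[A.neighbour[f]];
        y[A.neighbour[f]] += A.offDiag[f]*x[A.owner[f]];
    }

    // Separate update with the parallel interface coefficients
    for (std::size_t i = 0; i < xNeighbour.size(); i++)
    {
        y[A.interfaceCells[i]] += A.interfaceCoeffs[i]*xNeighbour[i];
    }

    return y;
}
```

For a four-cell chain with diagonal 2 and off-diagonal -1 split into two sub-domains of two cells, the two local products together reproduce the product of the undecomposed matrix, although neither sub-domain uses global numbering.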


Identical serial and parallel operation


If serial and parallel execution needs to be identical to the level of
machine tolerance, additional care needs to be taken: algorithmically,
order of operations needs to be the same
This complicates algorithms, but is typically not required. Only large
(badly behaved) meteorological models pose such requirements
Under normal circumstances, parallel implementation of linear equation solvers will provide results which vary from the serial version at
the level of machine tolerance
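The reason is that floating point addition is not associative, so a sum formed from per-processor partial sums can differ in the last digits from a single serial sweep. A minimal sketch, assuming IEEE double precision arithmetic without aggressive compiler re-ordering (the function name sumInOrder is hypothetical):

```cpp
#include <cassert>
#include <vector>

// Sum a list of contributions strictly in the order given.
double sumInOrder(const std::vector<double>& values)
{
    double sum = 0;
    for (const double v : values)
    {
        sum += v;
    }
    return sum;
}
```

Summing 0.1, 0.2 and 0.3 in one sweep, and summing 0.1 on one "processor" and 0.2, 0.3 on another before combining the partial sums, gives two results that agree only to the level of machine tolerance.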
Synchronisation
Parallel domain decomposition solvers operate such that all processors follow an identical execution path in the code. In order to achieve this, some
decisions and control parameters need to be synchronised across all processors
Example: convergence tolerance. If one of the processors decides convergence is reached and others do not, they will attempt to continue with
iterations and simulation will lock up waiting for communication
Global reduce operations synchronise decision-making and appear throughout high-level code.
Communication in a global reduce is of gather-scatter type: all CPUs send
their data to CPU 0, which combines the data and broadcasts it back
Actual implementation is more clever and controlled by the parallel communication protocol
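The convergence example can be sketched as a gather-scatter reduction simulated inside a single process. Entry p of the input plays the role of the residual held by processor p; the function name and the choice of the maximum as the combining operation are assumptions made for this illustration, and a real code would call the reduction provided by the communication library instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simulated global reduce for a convergence check (hypothetical helper).
// Returns the decision taken by each processor; all entries are identical.
std::vector<bool> synchronisedConvergence
(
    const std::vector<double>& localResidual,
    const double tolerance
)
{
    assert(!localResidual.empty());

    // Gather: "CPU 0" combines the contributions of all processors
    double globalResidual = localResidual[0];
    for (std::size_t p = 1; p < localResidual.size(); p++)
    {
        globalResidual = std::max(globalResidual, localResidual[p]);
    }

    // Scatter: every processor receives the same combined value
    // and therefore takes the same decision
    return std::vector<bool>(localResidual.size(), globalResidual < tolerance);
}
```

With local residuals 1e-7, 1e-3 and 1e-8 and a tolerance of 1e-6, processors 0 and 2 would stop on their own, but after the reduction all three agree to continue.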

Chapter 14
Fluid-Structure Interaction

14.1 Scope of Simulations

A majority of simulation examples shown so far concentrate on a single physical
phenomenon or set of equations in a domain. There also exists a set of coupled
problems, where governing equations are zoned but still closely coupled.
Fluid-Structure Interaction (FSI)
A number of engineering devices operates by combining various physical effects in a closely coupled manner. In such cases, it is insufficient to examine
each effect in isolation, ignoring the coupling; regularly it is precisely the
coupling that needs to be considered
Example: heat exchanger
Fluid flow inside of the pipe, heated by combustion gases outside.
From the point of view of flow analysis, two domains are independent
of each other
Even in a trivial case, coupling exists: material properties are a function of temperature and heat transfer is the basic effect we need to
consider
Adding a solid component with finite heat capacity and conductivity
which separates two fluids completes the system. Thus:
Liquid flow inside the pipe (water). Navier-Stokes equations +
energy equation
Reacting mixture of combustion gases outside the pipe. Navier-Stokes equations + additional equation sets depending on interest:
turbulence, combustion etc., including energy equation
Metal pipe wall, with conductivity and heat capacity. Heat transfer equation within the pipe wall; thermal stress analysis


Note that the energy equation is solved in all parts in a strongly coupled manner: single equation encompassing all heat transfer physics
From above, it follows that a number of equation sets will be solved together, with some equations covering multiple parts of the domain: fluid-structure interaction
This does not necessarily involve only fluids and structures: we can speak
of multi-physics or, more accurately: physics!
With this in mind, single-physics simulations are a simplification of a
complete machine, where the influence of other components is neglected or
handled by prescribed boundary conditions
Components of Fluid-Structure Interaction Simulation
In order to perform an FSI simulation, we first need to handle each bit of
physics separately: ideally in a single simulation code
Simulations should be performed side-by-side and allow for coupling effects
Care should be taken to isolate parts of the simulation depending on nature of coupling and engineering judgement. Example: fan-to-afterburner
analysis of a jet engine:
Turbo-fan
Compressor
Fuel supply and injection system
Combustion chambers
Turbine
Afterburner
and
Fluid flow and heat transfer
Structural integrity: thermal and structural stresses
Vibration modes, natural frequencies, modes of excitement
Analysis of the coupling allows us to judge which effects are important and
which should be solved together


Choice of Model and Discretisation Method


An additional set of problems arises from physical modelling. Example: Reynolds-averaged Navier-Stokes (RANS) for the compressor, coupled to Large Eddy
Simulation (LES) for combustion chambers
Ideally, physical modelling and discretisation are chosen to solve the local equation set in the best possible way: if local solution is insufficiently
accurate, coupling will not be captured either
Coupling problems will follow and can be expressed in two levels
Physical model coupling. Various combinations of physical models
are more or less suited for coupled simulations. Decision on the mode
of coupling or additional coupling physics is made on a case-by-case
basis
Example: magneto-hydrodynamics. Additional body force term
in the momentum equation. Two-way coupling caused by magnetic effects of the conductive fluid in motion
Example: LES to RANS turbulence model coupling. RANS requires mean turbulence properties from the upstream model; they
will be provided using averaged LES data
Coupling discretisation models, where various or inconsistent discretisation methods are combined together. The easiest way of achieving the coupling is through data exchange on coupled boundary conditions.
Coupling Data
Consider a case of wing flutter: fluid flow around an elastic wing
Fluid flow creates forces on the wing surface. Since the wing is not
rigid, forces result in a deflection of the wing
Wing deflection changes the shape of the fluid domain in the critical
region: next to the wing. Details of the flow field, including lift and
drag forces change feeding back to the interaction
Adding a transient effect and natural frequency of oscillation for the
structure further complicates the problem
In the example above, fluid forces are transferred to the solid, followed by
transfer of displacement onto the boundary of the fluid domain
Note that in structural simulations domain motion is determined as a part
of the solution. In fluids, deformation of the domain needs to be handled
separately
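The exchange of data in the wing example (force to the structure, displacement back to the fluid) can be illustrated with a toy model of a single degree of freedom. Everything in the sketch is an assumption made for illustration: the "fluid" load decreases linearly with deflection, the "structure" is a linear spring, and the function name and parameters are hypothetical. It is not a flutter model; it only shows the pattern of alternating solves that exchange boundary data.

```cpp
#include <cassert>
#include <cmath>

// Toy coupling loop (hypothetical model):
//   fluid step:     force = f0 - kFluid*displacement
//   structure step: displacement = force/kSolid
double coupleExplicitly
(
    const double f0,
    const double kFluid,
    const double kSolid,
    const double tolerance,
    const int maxIterations
)
{
    assert(kSolid > 0);

    double displacement = 0;

    for (int iter = 0; iter < maxIterations; iter++)
    {
        // Fluid solution on the current geometry gives the load
        const double force = f0 - kFluid*displacement;

        // Structural solution under that load gives the new displacement
        const double newDisplacement = force/kSolid;

        const double change = std::fabs(newDisplacement - displacement);
        displacement = newDisplacement;

        // Stop once the exchanged data no longer changes
        if (change < tolerance)
        {
            break;
        }
    }

    return displacement;
}
```

In this toy model the exchange converges to f0/(kSolid + kFluid) when kFluid is smaller than kSolid, and diverges when the fluid feedback is stronger than the structural stiffness.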

14.2 Coupling Approach

Level of Coupling
Some level of coupling exists in every physical situation. Engineering judgement decides if coupling is critical for the performance or can be safely
neglected
Level of coupling
Decoupled simulations. Each physical phenomenon can be studied
in isolation, using boundary conditions or material properties to handle
the dependence to external phenomena. Feed-back effects are small or
limited
Explicit coupling approach. Two simulations are executed side-by-side, exchanging boundary data in a stationary or transient mode.
Dynamic coupling effects can be captured, but with uncertainties in accuracy of simulation. Capable of simulating weakly coupled phenomena. This is currently state-of-the-art for industrial fluid-structure
simulations
Implicit coupling: single matrix. Here, multiple physical phenomena are discretised separately and coupling is also described in an
implicit manner. All matrices are combined into a single linear system
and solved in a coupled manner.
Block implicit solution is more stable than in explicit coupling, but
poses requirements on software design: need to access matrix data
directly. Currently used in conjugate heat transfer simulations
Single equation approach. Recognising the fact that equation set
represents identical conservation equation and only governing laws
vary from material to material, we can describe the complete system
as a single equation. Governing equations are rewritten in a consistent
and compatible manner with a single working variable.
Single equation represents closest possible coupling. However, there
are issues with consistency on interfaces and simulation accuracy in
regions of rapidly changing solution (e.g. boundary layers). Resulting
equations are not necessarily known in type or well behaved and may
require special solution algorithms. This mode of coupling is a current
research topic
In many engineering situations, software limitations are a significant factor:
when tools cannot handle all the physics or software design does not allow
choice or level of coupling, we are forced to use simplifications


In such cases, engineering judgement is used after the simulation: how can
we interpret the results or study the problem in a decoupled manner

14.3 Discretisation of FSI Systems

FVM both sides; FEM, both sides
FVM fluid flow + FEM stress analysis
Data mapping and integral quantity corrections
Single equation approach
Choice of Discretisation
Ideally, discretisation for each set of equations is chosen for optimal accuracy
and efficiency
In cases of FSI, this would usually employ the FEM for structural analysis
and FVM for fluid flow (why?)
In explicit or implicit coupling, one needs to describe (boundary) data transfer between the two: interpolation
Additional care is required for implicit solution: are the methods compatible
and what are the properties of a coupled system
Data Mapping
Boundary data mapping involves interpolation. This, by necessity includes
a discretisation error: one set of data points describing a continuous field
is translated into a different set
Example: FSI, with transfer of forces from fluid to structure
At the completion of the fluid flow step, we can calculate forces (pressure + shear) on the wall boundary. The force is available for each
boundary face of a fluid domain
Wall pressure represents external load onto the structure. However,
the discrete representation of the structural mesh and the location of its solution
points are not identical to those of the fluid mesh: interpolation is needed
It is critical that integral properties (total force) are preserved: typically
done by global re-scaling of the profile
In FEM, one can find terms like profile-conserving or flux-conserving
interpolation. In reality, we need both
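A minimal sketch of such a mapping in one dimension is given below. It assumes a scalar force per fluid boundary face, nearest-neighbour sampling as the profile interpolation, and a single global scale factor to restore the total force; the function name mapForces and the argument names are hypothetical. Production codes use more elaborate interpolation, so this only illustrates the two requirements.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Map face forces from the fluid boundary onto structural load points
// (hypothetical helper): nearest-neighbour profile + global re-scaling.
std::vector<double> mapForces
(
    const std::vector<double>& fluidX,      // positions of fluid boundary faces
    const std::vector<double>& fluidForce,  // force on each fluid face
    const std::vector<double>& solidX       // positions of structural load points
)
{
    assert(!fluidX.empty() && fluidX.size() == fluidForce.size());

    std::vector<double> mapped(solidX.size());

    // Profile mapping: each structural point takes the nearest face value
    for (std::size_t j = 0; j < solidX.size(); j++)
    {
        std::size_t nearest = 0;
        for (std::size_t i = 1; i < fluidX.size(); i++)
        {
            if
            (
                std::fabs(fluidX[i] - solidX[j])
              < std::fabs(fluidX[nearest] - solidX[j])
            )
            {
                nearest = i;
            }
        }
        mapped[j] = fluidForce[nearest];
    }

    // Global re-scaling: restore the total force of the fluid side
    const double fluidTotal =
        std::accumulate(fluidForce.begin(), fluidForce.end(), 0.0);
    const double mappedTotal =
        std::accumulate(mapped.begin(), mapped.end(), 0.0);

    if (mappedTotal != 0)
    {
        const double scale = fluidTotal/mappedTotal;
        for (double& m : mapped)
        {
            m *= scale;
        }
    }

    return mapped;
}
```

The shape of the load distribution comes from the sampling step, while the scale factor guarantees that the structure sees the same total force as the fluid boundary.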

14.4 Examples

Bibliography
