Principal Components Analysis For Turbulence-Chemistry Interaction Modeling

Chapter 4
Principal Components Analysis

for turbulence-chemistry
interaction modeling
As discussed in Chapter 2, turbulent combustion modeling is a very broad
subject and includes a wide range of coupled physics. Our primary interest is
focused on combustion models, adopted in the framework of RANS and LES approaches to close the reacting scalars source terms. Such models must provide
an adequate coupling of turbulent mixing and chemical reactions for the unresolved scales. Moreover, an accurate kinetic description is needed in situations
where finite-rate chemistry effects may be relevant, as in flameless combustion
regime [78, 79, 45]. However, detailed combustion mechanisms for fuels as simple as methane involve 53 species and 325 reactions [63] and the number of
species and reactions dramatically increases with the molecular weight of the
hydrocarbon fuel [83]. Therefore, solution of the species transport equations
for turbulent reacting system can become very computationally intensive, if a
reaction rate approach is adopted and no simplification is made.
The reduction of the number of species equations to be solved can be accomplished in two ways:
Reduction of the kinetic mechanism [84, 85, 86]. This approach is based
on the analysis of the dominant reaction rates at the conditions of interest and proceeds through the elimination of species and reactions in the
original kinetic mechanism, ultimately leading to a reduced set of species
equations to be solved.
State space parametrization. This approach relies on the assumption
that the thermodynamic state of a reacting system relaxes onto a lowdimensional, strongly attracting, manifold in chemical state space [87,
88]. The thermochemical state of a single-phase reacting fluid having Ns
species is uniquely determined by Ns + 1 parameters (T , p, and Ns 1
63
Chapter 4. Principal Components Analysis for turbulence-chemistry

species mass fractions). Yet, if a set of optimal variables is identified, the
whole thermochemical state can be re-parametrized with a lower number
of variables, which nevertheless must provide a satisfactory approximation of the system in a lower dimensional space [87]. In the context of
turbulent non-premixed flames, a widely used approach is represented by
the flamelet model .(Chapter 2). The model adopts two parameters for
the parametrization of the thermochemical state, the mixture fraction f
and the scalar dissipation rate, . The choice of f is particularly appealing, as this parameter is a conserved scalar if all species diffusivities are
equal [88], and it does not require any source term closure.
The present Chapter focuses on the second of these approaches, the parametrization of the thermochemical state by a small number of parameters based on the
existence of a low-dimensional manifold. Most of the existing models exploiting
the existence of such manifolds are based on the a priori prescription of the
manifold dimensionality. For example, the flamelet model [76, 89] assumes that
the system can be satisfactory described by means of two parameters. However, such an approach restricts the subspace that thermochemistry may access,
without providing any quantitative error analysis. Indeed, as mixing and reaction timescales increasingly overlap, the dimensionality of a manifold increases,
as does the error associated with a parametrization of fixed dimensionality [88].
Such consideration has prompted our interest towards the development of a
methodology to automate the selection of the optimal basis for the representation of the manifolds in thermochemical space. Principal Components Analysis
(PCA) [90, 91] offers this potential, as it provides a rigorous mathematical
formalism for reducing the dimensionality of a data set consisting of a large
number of correlated variables, while retaining most of the variation present in
the original data. The reduction is achieved by transforming to a new set of
variables, called the principal components (PCs), which are uncorrelated and
ordered so that the first few account for most of the variation present in all the
original variables. PCA provides an optimal representation of the system based
on q optimal variables, the PCs, which are linear combination of the Ns + 1
primitive variables T , p, and Yi . The linearity of the method is a particularly
appealing aspect since, once the reaction variables are identified, a few linear
combinations of the original variables could be transported in a numerical simulation, if a proper closure for the source terms is employed. Nevertheless, since
the reaction variables provided by PCA are not conserved scalars, an a priori analysis of the ability of principal components to parametrize their source
terms must be assessed, as a mandatory requirement for the generated manifold
method. One of the main advantages of PCA lies in the potential to obtain
the principal components from a target system and to apply them to a similar
system. This potential could remove one of the main drawback which, to date,
affects PCA. In fact, the derivation of the manifold model via PCA requires
the availability of a data set for the extraction of principal components.
64
4.1. Definition and derivation of Principal Components

In the following, a background on PCA is provided. In particular, the
definition and derivation of the PCs, the criteria used for selecting optimal
parameters, or variables, from large data sets and the approaches to help PCs
interpretability and deal with highly non linear systems will be discussed. Then,
various examples of the application of PCA to turbulent reacting systems will
be presented, to investigate the feasibility of PCA for the identification of the
low-dimensional manifolds in thermochemical state. Finally, the possibility of
exploiting PCA as a predictive model approach will be investigated.
All the analysis carried out in this Chapter have been performed with a
MATLAB code written on purpose, available under request. The results
have been published in [92, 93].
4.1
Definition and derivation of Principal Components
A primary goal when dealing with multivariate data is to reduce their dimensionality to the smallest number of meaningful dimensions, in order to help
data exploration and any further processing. Principal Components Analysis
(PCA) can be successfully exploited for this purpose. PCA was first introduced
by Pearson in the early 1900s [94]. A formal treatment of the method is due
to Hotelling [95] and Rao [96].
Suppose that X is a vector of p random variables, i.e. X = (x1 , x2 , . . . , xp ),
with mean and covariance matrix . The (i, j)th element of represents the
covariance between the ith and jth variables of X, if i 6= j, or the variance of the
jth element of X, if i = j. PCA is concerned with finding a few (<< p) derived
variables, called Principal Components (PCs), which nevertheless preserve most
of the information present in the original data. The PCs are linear combinations
of the original variables; moreover, they are uncorrelated (i.e. orthogonal) and
derived so that the variance on the jth component is maximal.
The first PC of X is defined as the linear combination:
(4.1)
z1 = Xa1 .
0
To determine z1 , a vector a1 is sought so that var (z1 ) = a1 a1 is maximized,

0
subject to the constraint a1 a1 = 1. If we adopt the standard approach of
Lagrange multipliers to solve this constrained problem, we need to maximize:
0

0
a1 a1 a1 a1 1
(4.2)
where is a Lagrange multiplier. Differentiating with respect to a1 gives:
a1 a1 = ( Ip ) a1 = 0
(4.3)
where Ip is the (p x p) identity matrix. Thus, is an eigenvalue of and a1 is

the corresponding eigenvector. It is easy to proof that the eigenvector a1 which
65

maximizes the variance of z1 is the one corresponding to the largest eigenvalue
of , being:
0
(4.4)
a1 a1 = a1 a1 = a1 a1 = = 1 .
0
The second PC, z2 = Xa2 , maximizes the variance var (z2 ) = a2 a2

subject to the constraints cov (Xa1 , Xa2 ) = 0 (z1 and z2 uncorrelated) and
0
a2 a2 = 1. Being
0
cov (Xa1 , Xa2 ) = a1 a2 = a2 a1 = 1 a1 a2 = 1 a2 a1 ,
(4.5)
any of the equations

0
a1 a2 = 0 a2 a1 = 0 a1 a2 = 0 a2 a1 = 0
(4.6)
could be used to specify no correlation between z1 and z2 . Choosing arbitrarily

the last expression in Eq. (4.6), the quantity to be maximized becomes:

0
0
0
a2 a2 a2 a2 1 a2 a1
(4.7)
where and are Lagrange multipliers. Differentiating with respect to a2 and

pre-multiplying by a1 , we get:
0
a1 a2 a1 a2 a1 a1 = 0
(4.8)
= 0,
(4.9)
which reduces to
being
0
a1 a2 = a1 a2
(4.10)
due to the constraint of z1 and z2 being uncorrelated. Then, Eq. (4.8) reduces
to:
a2 a2 = 0.
(4.11)
Then, is once more an eigenvalue of and a2 is the corresponding eigenvec0

tor. Again, a2 a2 = . Thus, assuming that has all different eigenvalues,
is the second largest eigenvalue of , 2 , and a2 is the corresponding eigenvector.
0
In general, the kth PC of is zk = Xak and var (zk ) = ak ak = k , where
k is the kth largest eigenvalue of and ak is the corresponding eigenvector.
66
4.2. Sample PCA
4.2
Sample PCA
In Section 4.1, the definition and derivation of PCs have been discussed for an
infinite population of measures. In practice, a random sample of n observations
of the p variables is available, so that Xi = (xi1 , xi2 , . . . , xip ) represents the ith
observation from the data set. Thus, the data available for PCA is a (n x p)
data matrix and an unbiased estimator of , S 1 , is employed.
For a single observation of X, Xi , zi1 is given by:
zi1 = Xi a1 , i = 1, 2 . . . , n
(4.12)
where the vector of coefficients a1 is chosen to maximize the variance

n
X
1
(zi1 z 1 )2 ,
(n 1)
(4.13)
i=1
subject to the constraint a1 a1 = 1. Then, for the second PC:

zi2 = Xi a2 , i = 1, 2 . . . , n
(4.14)
where a2 is chosen to maximize the sample variance of z12 , subject to the

0
constraints a2 a2 = 1 and cov (Xa1 , Xa2 ) = 0.
Continuing the process in an obvious manner, zk = Xak is the kth sample PC (k = 1, 2, . . . , p) and zik is the score for the ith observation on the
kth sample PC. If the derivation of Section 4.1 is followed, but replacing population quantities with sample variances and covariances, then it turns out that
the sample variance of the kth sample PC is lk , the kth largest eigenvalue of
the sample covariance matrix S, and that ak is the corresponding eigenvector
of S.
Let Z be the (n x p) matrix of PCs scores, with (i, k)th element equal to
zik ; then, Z and X are related by
Z = XA
(4.15)
where A is the (p x p) orthogonal matrix whose columns are the eigenvectors of

S. Here we assume, for simplicity, that the variables have zero mean, otherwise
the mean of each variable is subtracted from the columns of X before PCA is
applied. Then, the sample covariance matrix, S, of X can be defined as:
S=
1
0
X X.
n1
(4.16)
Recalling the eigenvector decomposition of a symmetric, non-singular matrix,

S can be decomposed as:
1
The matrix S represents the approximation of for a finite population, i.e. the random
sample consisting of n observations for p variables.
67

(4.17)
S = A LA
where L is a (p x p) diagonal matrix containing the eigenvalues of S in descending order, l1 > l2 > . . . > lp .
The linear transformation given by Eq. (4.15) simply recast the original
variables into a set of new uncorrelated variables, whose coordinate axes are
described by A. Then, the original variables can be stated as a function of the
PCs as:
0
(4.18)
X = ZA
being A orthonormal and, hence, A1 = A . This means that, given Z, the

values of the original variables can be uniquely recovered. However, the main
objective of PCA is to replace the p elements of X with a much smaller number,
q, of PCs, which nevertheless discard a small fraction of the variance originally
contained in the data. If a subset of size q << p is used, the truncated PCs are
defined as:
Zq = XAq .
(4.19)
Eq. (4.19) can be inverted to obtain:

0
Xq = Zq Aq .
(4.20)
The linear transformation provided by Eq. (4.20) is particularly appealing

for size reduction in multivariate data analysis due to some optimal properties
of the PCA transformation, described in the following text.
4.2.1
Optimal properties of the PCA reduction
It can be shown [90] that PCA satisfies the following optimal properties:
Property 1 : For any integer q, 1 q p, consider the orthonormal
0
transformation Z = XB, where B is a (p x q) matrix. Let Sz = B LB
be the variance-covariance matrix for Z. The, the trace of Sz , tr (Sz ),
is maximized by taking B = Aq , where Aq contains the first q columns
of A. Property 1 emphasizes that the PCs explain, successively, as much
as possible of the total univariate variance in the original data, being
tr (Sz ) = tr (S)
Property 2. Consider the orthogonal transformation Z = XB. Then,
tr (Sz ) is minimized by taking B = Aq , where Aq consist of the last q
columns of A. The statistical implication of property 2 is that the last few
PCs are not simply unstructured leftovers after removing the important
PCs. Being the variances of the last PCs small, they can help to detect
unsuspected near-constant linear dependencies among the element of X,
68
4.2. Sample PCA

to select a subset of variables from X in regression analysis or to detect
outliers from a data set.
Property 3. The sample covariance matrix, S, can be expressed as (spectral decomposition):
0
S = l1 a1 a1 + l2 a2 a2 + . . . + lp ap ap .
(4.21)
This result shows that we can decompose the whole covariance matrix
into decreasing contribution due to each PC.
Property 4. Consider the orthogonal transformation Z = XB. Then,
the determinant of Sz , det (Sz ), is maximized by taking B = Aq . The
statistical importance of this property follows because the determinant of
a covariance matrix, called generalized variance, can be used as a simple
measure of spread for a multivariate random variable.
Property 5. Each element of X can be predicted by a linear function of Z,
Z = XB. If j2 is the residual variance in predicting xj from Z , then
Pp
2
j=1 j is minimized by taking B = Aq . The statistical implication
of property 5 is that Eq. (4.20) is the best linear predictor of X in a
q-dimensional subspace, in terms of squared prediction error.
4.2.2
Data preprocessing: centering and scaling
As it was anticipated in Section 4.2, data are usually centered before PCA is
carried out. When the variable means are subtracted from the data sample, all
the observations are converted to fluctuations, thus leaving only the relevant
variation for analysis. Moreover, when working with centered variables, centered PCs are obtained. Centering is usually used with all the scaling criteria
described below.
Scaling is an essential operation when the elements of X are in different
units or when they have very different variances. These aspects have both
to be faced when analyzing the thermochemical state of a reacting system
since temperature and species concentrations have different units. Moreover,
temperature may range from ambient conditions to thousands of degrees while
species mass fractions vary between zero and one. Besides, even among species
mass fractions, there may be need for scaling. For example, radicals appear in
small concentrations and their mass fractions may range from zero to something
far less than one (i.e. 103 106 ), while major species mass fractions range
from 0 to 1. Taking into account centering, it is possible to define a scaled
ej as:
variable, x
ej =
x
xj xj
dj
69
(4.22)

P
where xj = n1 ni=1 xij , j = 1, 2, . . . , p, while dj is the scaling parameter relative
to variable xj . In matrix form, Eq. (4.22) becomes:

f = X X D 1
X
(4.23)
where D is the diagonal matrix containing the scaling parameters, dj . When

scaling is applied, Eqs. (4.15)-(4.19) are modified as:
S=
0

X X X X D 1
.
Z = X X D 1 A
1
Zq = X X D Aq
1
1
n1 D
(4.24)
The choice of the scaling parameters is very important, and has a potentially
strong impact on the resulting eigenvectors. The following choices are available:
1. Auto scaling, also called unit variance scaling. It is commonly applied and
uses the standard deviation, sj , as the scaling factor. After auto scaling,
all the elements of X have a standard deviation equal to 1 and therefore
the data is analyzed on the basis of correlations instead of covariances
2. Vast scaling [97]. Vast is an acronym of variable stability scaling and
it is an extension of auto scaling. It focuses on stable variables, the
variables that do not show strong variation, using the standard deviation
and the so-called coefficient of variation as scaling factors. The use of the
coefficient of variation, defined as the ratio of the standard deviation and
the mean: sj/xj , results in a higher importance for variables with a small
relative standard deviation
3. Range scaling. Range scaling adopts the difference between the minimal
and the maximal value, (xj,max xj,min ), as scaling factor. A disadvantage of range scaling with respect to other scaling methods is that only
two values are used to estimate the range, while for the standard deviation all measurements are taken into account. This makes range scaling
more sensitive to outliers. To increase the robustness of range scaling,
the range could also be determined by using robust range estimators or
after the outliers have been removed.
4. Level scaling. The mean values of the variables, xj , are used as scaling
factors. Level scaling converts deviations from the mean (the mean is
always subtracted) in percentages compared to the mean values. As for
the range scaling, also level scaling can be affected by outliers. Then,
a more robust estimator of the mean, the median, could be used or the
mean could be determined after outlier removal. Level scaling can be
used when large relative changes are of specific interest. However, in
the case of the thermochemical state of a system, this could lead to an
overestimation of the role of chemical species which appear in very small
concentrations, i.e. radicals.
70
4.2. Sample PCA

5. Max scaling. The variables are normalized by their maximum values,
xj,max , so that they are all bounded between zero and one. As for the
range and level scaling, a robust estimator of maximum values or a procedure for outliers removal should be employed.
As it was pointed out in the above discussion, range, level and max scaling
can be affected by the presence of outliers in the sample X. This problem can
be overcome by means of robust estimators of the quantities of interest (i.e.
median in place of sample average); however, a procedure for the detection and
removal of outlier observations could provide a viable solution to the problem.
PCA can be effectively employed for outlier detection in large data sets.
4.2.2.1
Outlier detection and removal with PCA
Experimental data sets usually contain a few unusual observations. If we refer

to a one-dimensional problem, the outliers can be classified as those observations which are either very large or very small with respect to the others. In
high dimensions, there can be outliers that do not appear as outlying observations when considering each dimension separately and, therefore, they will not
be detected using univariate criteria. Thus, a multivariate approach must be
pursued.
The usual procedure for outlier detection in multivariate data analysis is
to measure the distance of each observation, Xi = (xi1 , xi2 , . . . , xip ), from the
data center, using the so called Mahalanobis distance:
0

DM = Xi X S 1 Xi X .
(4.25)
The observations associated to large values of DM are considered outliers and,

then, they are discarded. However, a procedure must be employed to ensure a
robust estimation of the sample mean. To this purpose, PCA can be effectively
exploited [98].
PCA has long been used for multivariate outlier detection. The sum of
squares of the PCs, standardized by the eigenvalue size, equals the Mahalanobis
distance for an observation i:
p
X
z2
ik
k=1
lk
2
2
zip
zi1
z2
+ i2 + . . . +
= DM .
l1
l2
lp
(4.26)
The first few principal components have large variances and explain most of
the variation in X. Therefore, these major components are strongly affected
by variables with relatively large variances and covariances. Consequently, the
observations that are outliers with respect to the first few components usually
correspond to outliers on one or more of the original variables. On the other
hand, the last few principal components represent linear functions of the original variables with minimal variance. These components are sensitive to the
71

observations that are inconsistent with the correlation structure of the data but
are not outliers with respect to the original individual variables.
Based on the above considerations, the following detection scheme can be
proposed [98]:
1. Multivariate trimming. A fraction (0.005-0.01) of the data points characterized by the largest value of DM is classified as outliers and removed.
New trimmed estimators for X and S are then computed from the remaining observations. The trimming process can be iterated to ensure
that X and S are resistant to outliers.
2. Principal components classifier (PCC). The PCC consists of two funcP
z2
, and one from the minor PCs,
tions, one from the major PCs, qk=1 lik
k
2
Pp
zik
k=pr+1 lk . The first function can easily detect observations with large
values on some of the original variables; in addition, the second function
helps detect the observations that do not conform to the correlation structure of the sample. The number of major components, q, is determined
to explain about 50% of the original data variance, while r is chosen so
that the minor components used for the definition of the PCC are those
whose variance is less than 0.20, thus indicating the existence of almost
linear relations among the variables. Based on the PCC definition, an
observation Xi is classified as outlier if:
2
2
Pq zik
Pp
zik
(4.27)
k=1 lk > c1
k=pr+1 lk > c2
where c1 and c2 are chosen as the 0.99 quantile of the empirical distribu2
Pp
P
zik
z2
and
.
tions of qk=1 lik
k=pr+1
l
k
k
An example of application of the outlined detection scheme in shown in Figure
4.1 and Figure 4.2, with reference to a data set consisting of 62766 observations
of 10 variables [99]. Outliers have been artificially introduced in the experimental data: specifically, 1000 random numbers between 0 and 1 have been
generated and successively scaled using the standard deviation, sj , of the variables xj . The effect of the outliers on the PCs is very clear from Figure 4.1.
The introduced unusual observation are outliers with respect to the original
variables and they are reflected in the first few PCs; a small cluster of points,
separated from the majority of observations, appears in the plot of the first and
second PCs scores (Figure 4.1 (a)). If the outlier detection scheme is applied,
the introduced outliers are completely removed (Figure 4.1 (b)); in addition,
outliers present in the original experimental data set, affecting the last PCs
scores (multivariate outliers), are also detected with the procedure described.
Outliers must be treated with care as they can strongly affect the determination
of the covariance matrix, thus leading to the identification of false PCs (Figure
4.2).
All the data sets analyzed in the present Chapter have been preprocessed
to remove outlying observations.
72
4.2. Sample PCA
(a)
(b)
Figure 4.1: Principal components scores with (a) and without (b) outliers.
(a)
(b)
Figure 4.2: Eigenvalues size with (a) and without (b) outliers.
73

Figure 4.3: Size reduction process with PCA.
4.2.3
Choosing a subset of Principal Components or Variables
The major objective in many applications of PCA is to replace the p elements

of X by a much smaller number q of PCs, which nevertheless discard very little
information. Then, it is crucial to determine how small q can be taken without
serious information loss. Several criteria have been proposed and they will be
discussed in the following Sections. Once a criteria for selecting q is adopted,
size reduction can be accomplished, as described by Figure 4.3 for a simple two
dimensional case. The procedure outlined in Figure 4.3 is extremely attractive
for the analysis of turbulent reacting systems, as it provides a mathematical
formalism for the automate selection of the parameters which maximize the
variance in state space.
4.2.3.1
Cumulative percentage of total variance
The most obvious criterion for choosing q is to select a cumulative fraction

of the total variance that the PCs have to account for, i.e. 0.8 or 0.9. The
required number of PCs, q, is then the smallest value of q for which this chosen
percentage is exceeded. As discussed in Section 4.2, the PCs are successively
selected to maximize
their
Pp
Pp variance, expressed by the associated eigenvalue lk .
Thus, being k=1 lk = j=1 var (xj ), the desired fraction of the total variance
can be defined as:
Pq
lk
tq = Ppk=1 .
k=1 lk
(4.28)
Following the derivation of tq , an appropriate measure of lack-of-fit of the rank

q approximation of X is:
p
n X
X
(xq,ij xij )2 =
i=1 j=1
p
X
k=q+1
74
lk .
(4.29)
4.2. Sample PCA

For a given number of retained PCs, it is possible to define the variance
accounted for each variable by the retained eigenvectors as:
tq,j
2
q
X
ajk lk
=
sj
(4.30)
k=1
where ajk is the weight of the jth variable on the kth eigenvector and sj is the
standard deviation of variable xj .
4.2.3.2
Variance of Principal Components
The rule described in this Section originally applies to correlation matrices,

although it can be easily adapted to covariance matrices.
The idea behind this rule is that if all elements of X are independent, then
the PCs are the same as the original variables and they are characterized by
unit variances in the case of a correlation matrix (e
xj = xj/sj ). Thus, any PC
with variance less than 1 explains less variation than the original variables and
can be discarded. According to this rule, also known as Kaisers rule [100], only
those PCs whose variances lq exceed 1 are retained. However, a cut-off at lq =
1 could lead to discard important information in some circumstances. In fact,
if a variable from the sample is more or less independent of all others, it will be
characterized by small coefficients in (p 1) PCs but it will dominate one PC,
whose variance will be close to 1 when using the correlation matrix. However,
deletion will occur if Kaisers rule is used, and if, due to sampling variation,
lq < 1. It is therefore advisable to choose a cut-off lower than 1, to allow for
sampling variation. Jolliffe [101] suggests that 0.7 is roughly the correct level.
The rule just described can be easily extended to covariance matrices by taking
as cut-off l equal to the average value, l, of the eigenvalues, or a somewhat
lower cut-off such as l = 0.7l .
4.2.3.3
Broken Stick Model
Frontier [102] proposed a broken-stick method to select the number of PCs. If

we have a stick of unit length, broken into p segments, then it can be shown
that the expected length of the qth longest segment is:
q
lq
1X1
=
.
p
k
(4.31)
k=1
This method actually compares the eigenvalues from the observed sample
with the eigenvalues from random data. Based on Eq. (4.31), the observed
eigenvalues are considered interpretable if they exceed lq .
75

4.2.3.4
Scree plot
Another common method to determine the number of PCs is the Scree Plot
(Figure 4.2). This is a simple plot of the eigenvalues sorted in descending order
against their indexes. The number of eigenvalues to retain is based on the
observation of the index q at which the slopes of lines joining the plotted points
are steep to the left of q, and not steep to the right of it. Cattell [103] originally
proposed that the points to the left of the straight line, defined by the smaller
eigenvalues (three components for the data in Figure 4.2), should be considered
important. Afterwards, Cattell and Vogelmann [104] concluded that also the
first eigenvalue to the right of this point should be included (four components
for the data in Figure 4.2). Often the Scree Plot approach is complicated by
either the leak of any obvious break or the possibility of multiple break points.
4.2.3.5
Choosing a subset of Variables
Principal components (PCs) are linear combinations of all variables available.

However, these variables are not necessarily equally important to the formation
of PCs: some of the variables might be critical, but some might be redundant.
Motivated by this fact, it was attempted to link the PCs back to a subset of
the original variables, selecting critical variables or eliminating irrelevant ones.
Such an approach is very appealing as it overcomes one of the major issues
related to PCA. Being linear combination of all the original variables, the PCs
often lack of physical meaning. Therefore, working in terms of the original
variables could be helpful and straightforward. However, it should be noticed
that for a given number, m, of retained variables, an equal number of PCs will
always explain more of the original data variance.
A number of methods exist for selecting a subset of m original variables
which preserve most of the variation in X. Some of them are directly related
to the PCs.
B4 Forward method [101]. PCA is performed on the original matrix of
p variables and n observation, i.e. size (X) = (n, p). The eigenvalues of
the covariance/correlation matrix are then computed and a criterion is
chosen to retain q of them (l = 0.7l [101]). If p1 components have eigenvalues less than l , the eigenvectors associated with the remaining p p1
eigenvalues are evaluated starting with the first component. The variable
associated with the highest eigenvector coefficient is then retained from
each of the p p1 variables, as it is highly correlated with an important
PC. A second PCA is performed such that p p1 p2 variables remain.
PCA is then repeated until all the components have eigenvalues larger
than l .
B2 Backward method [101]. PCA is performed on the original matrix of
p variables and n observation, i.e. size (X) = (n, p). The eigenvalues
76
4.2. Sample PCA

of the covariance/correlation matrix are then computed and a criterion
is chosen to retain q of them (l = 0.7l [101]). If p1 components have
eigenvalues larger than l , the eigenvectors associated with the remaining p p1 eigenvalues are evaluated starting with the last component.
The variable associated with the highest eigenvector coefficient is then
discarded from each of the p p1 variables, as it is highly correlated with
an unimportant PC. A second PCA is performed such that p p1 p2
variables are discarded. PCA is then repeated until all the components
have eigenvalues larger than l .
M2 backward method [105]. PCA is performed on the original matrix of
p variables and n observation, i.e. size (X) = (n, p). The eigenvalues
of the covariance/correlation matrix are then computed and a criterion
is chosen to retain q of them. The (n x q) matrix of PCs scores, Zq , is
then evaluated. The goal is to select m (m < p and m q) variables
from X which preserve most of the data variation. The PCs scores from
Being q the true dimensionality of
the reduced data are denoted by Z.
the data, as determined with the PCA analysis, Zq is the true matrix of
is the corresponding
scores, also indicated as true configuration, while Z
approximation based on m variables. The discrepancy between the two
configuration is evaluated with a Procrustes Analysis2 . The idea is to
to establish which set of m original
compare the shape of Zq and Z,
variables better reproduces the true configuration Zq . In practice, this
consists in the following steps:
Find the sum of squared differences between corresponding points of
after they have been matched as well as possible under
Zq and Z,
translation, rotation and reflection.
Matching under translation is ensured by centering both Zq and
Z
Matching under rotation and reflection is ensured by considering
Zq as the fixed configuration and transforming Z.

The quantity which is to be minimized in the selection of variables is
the following sum of squared differences between the configurations:
0

0 Z
2ZQ
0 Z0 =
M 2 = trace Zq Zq + Z
0
q
(4.32)
0
= trace Zq Zq + Z Z 2
Q=VU
0 Zq = U V 0
Z
(4.33)
0 Zq .
where is the matrix of singular values from the SVD of Z
2
In statistics, Procrustes analysis is a form of statistical shape analysis used to analyze

the distribution of a set of shapes.
77

The M2 algorithm employs the following backward elimination procedure
for the retention of m variables from the original data:
1. Initially, set m = p and, for a fixed q, compute the matrix of PCs
scores, Zq .
2. Obtain and store the matrix of PCs scores obtained by deleting in
turn each variable from X.
3. Compute M 2 for each matrix of scores and identify the variable xu
u denote the corresponding
which yields the smallest M 2. Let Z
matrix of scores.
u and return to stage 2 with p 1
4. Delete variable xj . Set Zq = Z
variables. Continue the cycle until only m variables are left.
McCabe criteria [106]. This approach originates from the observation that
the PCs satisfy a certain number of optimality criteria (Section 4.2.1). A
subset of the original variables that optimizes one of these criteria is
called a set of principal variables by McCabe [106]. Suppose that the
set of variables of X is partitioned into subsets X (1) and X (2) . The
covariance matrix of X can be partitioned as:

S 11 S 12
S=
.
(4.34)
S 21 S 22
Then, the partial covariance matrix for X (2) given X (1) is:
S 22,1 = S 22 S 21 S 1
11 S 12 .
(4.35)
The criteria proposed by McCabe [106] for the definition of the principal
variables are:
Q
MC1 max |S 11 | = min |S 22,1 | = P
min m
k=1 k
MC2
min tr (S 22,1 ) = min m
Pmk=1 2k
(4.36)
2
MC3
min
kS
k
=
min
22,1
k=1 k
Pr
MC4 max k=1 2k , with r = min (m, p m)
where k are the eigenvalues of S 22,1 and k are the canonical correlations between the selected and not selected variables. As McCabe [106]
points out, after the selection of the PVs, S 22,1 represents the information
left in the remaining unselected variables and, then, it is quite plausible
that three of the optimality criteria should be functions of this matrix.
McCabe [106] criteria are very appealing as they satisfy well defined properties. For instance, criterion MC1 maximizes the variance of the data
explained by the subset of variables, while MC2 and MC3 both minimize
the reconstruction error. However, the criteria rapidly becomes computationally unfeasible for very large data sets.
78
4.2. Sample PCA

Principal Features [107]. Using this method the dimension reduction is
accomplished by choosing a subset of the original features that contains
most of the essential information, both in the sense of maximum variability of the variables in the lower dimensional space and in the sense
of minimizing the reconstruction error. The rows of the eigenvector matrix, Aq , denoted as Vj , represent the projection of the jth variable onto
the lower dimensional space, that is, the q elements of Vj correspond
to the weights of xj on each axis of the subspace. The key observation
of the method is that variables that are highly correlated or have high
mutual information will have similar weight vectors Vj . On the two extreme sides, two independent variables have maximally separated weight
vectors; while two fully correlated variables have identical weight vectors
(up to a change of sign). Therefore, the structure of the rows Vj is first
analyzed, to find the subsets of variables that are highly correlated; then,
a variable is extracted from each subset. The chosen variables represent
each group optimally in terms of spread in the lower dimension, reconstruction and insensitivity to noise. The principal features algorithm can
be summarized in the following five steps [107]:
1.
2.
3.
4.
Compute the sample covariance/correlation matrix, S.

Compute the PCs and eigenvalues of S, lk .
Choose the subspace dimension q and construct the matrix Aq .
Cluster the q vectors, Vj , to m q clusters using the k-means algorithm [108]. The distance measure used for the k-means algorithm
is the Euclidean distance. Choosing m greater than q is usually
necessary if the same variability of the PCs subset is desired.
5. For each cluster, find the corresponding vector Vj which is closest to
the mean of the cluster. Choose the corresponding feature, xj , as
a principal variable. This step yields the choice of m features. The
reason for choosing the vector nearest to the mean is twofold. This
feature can be thought of as the central feature of that cluster, the
one dominant in it, and which holds the least redundant information
of features in other clusters.
4.2.4
Interpretation of principal components
The principal components are, by construction, linear combinations of all the

measured variables. Therefore, their physical interpretation is usually not
straightforward. Various attempts have been made to overcome this difficulty;
among them, PCs rotation represents a very common solution. Through rotation, the weights can be redefined to meet alternative criteria. In particular,
rotation is aimed at attaining a simple structure for Aq , so that weights on a
principal component are either close to unity or close to zero and, thus, variables
have large weights on only few or (ideally) one principal component.
79

More formally, rotation is concerned with finding an orthogonal matrix, T ,
so that orthogonally rotated weights (or loadings) for the PCs can be defined
as:
(4.37)
Bq = Aq T .
The matrix T is chosen to optimize one of many simplicity criteria available

[109]. The most common orthogonal rotation is based on the maximization of
the VARIMAX criterion [110]:
p
q
X
X
1
VM AX (Aq ) =
a4ik
p
k=1
i=1
p
1X
!2
a2ik
(4.38)
i=1
Kaiser [110] refers to this as raw VARIMAX, but it is the version that has
become most popular. Verbally, this is simply the sum of the column-wise
variances of the squared elements of Aq . In other words, a criterion is defined
to maximize the amount of variance explained for any of the original variables
on single PCs. After VARIMAX rotation, Aq will generally have fewer large
loadings in its columns, thereby making the columns more easily interpretable.
A simple analytical solution for the maximization of the criterion in Eq. (4.38)
exist for the two-dimensional case [110]. Indicating the columns of Aq with k
and l, the two-dimensional solutions is:

Pp
2 a2 (2a a ) +
t = 2p
a
ik il
i=1
il
ik

Pp
P 2
2
ahik a2il
i=1 (2aik ail ) i
2
P
,
b = p pi=1 a2ik a2il (2aik ail )2 +
2
P 2
P
2
[ pi=1 (2aik ail )]
aik a2il
(4.39)
thus leading to the following definition of the optimal rotation angle,

1
= arctan (t, b) .
4
(4.40)
The optimal rotation matrix is then given by:

B2 =
cos () sin ()
sin () cos ()

.
(4.41)
The two-dimensional solution just presented can be extended to the more

general case of a sample X with dimensionality p. In this case, a possible
algorithm consists in applying the planar solution over all the (2 x 2) orthogonal
B2 matrices, through a sequence of iterates converging to a final solution. Little
is known about the convergence behavior of the algorithm, but in practice it
usually converges to a local maximizer of the VARIMAX criterion.
80
4.3. Local Principal Components Analysis
Figure 4.4: Schematic illustration of the VARIMAX rotation [110].
4.3
Local Principal Components Analysis
The PCA transformation described in Section 4.2 can suffer from its reliance on
second order statistics. In fact, the PCs are uncorrelated, i.e. their second-order
product moment is zero, but they can still be highly statistically dependent.
This is particularly important when the relationships among the correlated
variables are non-linear, as it usually happens for a reacting system. In this
case, PCA fails to find the most compact description of the data and it usually
requires a larger number of components to model the low-dimensional hyper
plane embedded in the original space, with respect to a non-linear technique.
This simple realization has prompted the development of non-linear alternatives
to PCA. A considerable amount of work has been done in the context of neural
networks. Nevertheless, here we are more interested in a different approach,
introduced by Kambhatla and Leen [111] in the field of images processing and
known as Local Principal Components Analysis (LPCA).
LPCA employs a local linear approach to reduce the statistical dependency
between the variables of a sample and to achieve the desired optimal dimension
reduction. According to LPCA, a Vector Quantization (VQ) algorithm first partitions the data space into disjoint regions and then PCA is performed in each
cluster, relying on the observation that, if the local regions are small enough,
the data manifold will not curve much over the extent of the region and the
linear model will be a good fit. For the LPCA to be effective, the VQ algorithm
should not be independent of the PCA analysis. For example, a partitioning
based on the Euclidean distance is very intuitive and easy to implement but
the sample clustering is carried out without any connection with the following
projection onto the lower-dimensional subspace. For this reason, Kambhatla
and Leen [111] introduce a VQ algorithm based on a reconstruction error metric. Given an observation from the sample X, Xi , a global reconstruction error
for each observation can be defined as:

(k)
GRE Xi , X
= kXi Xi,q k =

(k)

= Xi X
+ Zi,q A(k)
(4.42)
q
81

(k)
where X
is kth cluster centroid, Xi,q is the rank q approximation of Xi ,
Zi,q is the ith value of the truncated set of PCs, Zq , and A(k)
q is the matrix obtained by retaining only the first q eigenvectors of the covariance matrix, S (k) ,
associated to the kth cluster. In the context of reacting systems Eq. (4.42)
needs to be modified to take into account the differences in size and units of
the state variables. In fact, a clustering based on GRE would lead to an optimization with respect to temperature only. Therefore, the original LPCA
algorithm from Kambhatla and Leen [111] was modified [92] to include data
preprocessing (Section 4.2.2) in the quantization scheme. A very stable algorithm is obtained by using a global scaled reconstruction error metric, GSRE ,
defined as:

(k)
f

f
GSRE Xi , X , D = X
X
(4.43)
i
i,q
ei is the ith observation of the sample scaled by D, the diagonal matrix
where X
whose jth diagonal element is the scaling factor dj associated to xj . The
proposed LPCA algorithm, briefly referred as VQPCA, can be summarized as
follows:
(k)
1. Initialization: the cluster centroids, X , are randomly chosen from the

data set and S (k) is initialized to the identity matrix for each cluster.
2. Partition: each observation from the sample is assigned to a cluster using
the squared reconstruction distance given by GSRE .
3. Update: the cluster centroids are updated on the basis of partitioning
carried out at step 2.
4. Local PCA: PCA is performed in each disjoint region of the sample.
5. Steps 2-4 are iterated until convergence is reached.
The VQPCA algorithm is illustrated in Figure 4.5. The vector quantization step
partitions the data into cluster, trying to follow the curvature of the manifold
in the low-dimensional space. Then, the points are assigned to the clusters
depending on their low-dimensional projection on each of the identified clusters.
The goodness of reconstruction given by VQPCA is measured with respect
to the mean variance in the data as:
GSRE,n =
E (GSRE )
E [var (e
xj )]
(4.44)
ej is the scaled jth variable

where E denotes the expectation operator and x
from X. If auto scaling is employed in data preprocessing, Eq. (4.44) reduces
to:
GSRE,n = E (GSRE ) .
82
(4.45)
4.3. Local Principal Components Analysis
Figure 4.5: Schematic illustration of the VQPCA algorithm [92] .

Convergence can be judged using the following criteria:
1. The normalized global scaled reconstruction error, GSRE,n , is below a
specific threshold, GSRE,n .
2. The relative change in cluster centroids between two successive iterations
is below a fixed threshold, i.e. 108 .
3. The relative change in GSRE,n between two successive iterations is below
a fixed threshold, i.e. 108 .
Requirements 2 and 3 are particularly useful if an explanatory analysis on
the performances of VQPCA in terms of GSRE,n is of interest. In this case,
requirement 1 can be relaxed and the variation of GSRE,n as a function of the
number of eigenvalues and clusters can be analyzed by enforcing requirements
2 and 3. Otherwise, all the three conditions can be used and an iterative
procedure for the determination of the number of eigenvalues required to achieve
a fixed GSRE,n can be employed. Staring with q = 1, the number of eigenvalues
can be increased progressively until the desired error level is reached.
The VQPCA approach is based on the unsupervised partitioning of the data
into clusters, based on the minimization of the reconstruction error. Therefore,
the approach is optimal from the point of view of error minimization; however, being the partitioning iterative, the approach can result computationally
intensive for very large data sets (i.e. data from DNS). Therefore, a viable
alternative to VQPCA could be represented by a supervised partition of data
into clusters, based on the a priori knowledge on the conditioning variable.
In the context of turbulent non-premixed combustion, the obvious choice is
represented by mixture fraction. Therefore, a Mixture fraction PCA (FPCA)
algorithm can be proposed as follows [92]:
1. Partition. The data is partitioned into bins of mixture fractions.
2. Local PCA: PCA performed in each of mixture fraction bin.
The FPCA approach is schematized in Figure 4.6. The data are partitioned
into two mixture fraction bins, i.e. rich and lean regions, and a one-dimensional
83

Figure 4.6: Schematic illustration of the FPCA algorithm [92] for a CO/H2
flame [112]..
coordinate system is identified in each cluster. With respect to the VQPCA
approach, FPCA allows a very fast clustering. However, it is not possible to
state a priori that the choice of the mixture fraction as conditioning variable
is the best available.
In the following, the local approaches will be compared to the classic approach consisting in the application of PCA to compete sets of data, i.e. taking
k = 1, and denoted with Global PCA (GPCA).
4.4
Data sets for model validation
A prerequisite for the application of the PCA methodology described in the

previous Sections is the availability of data sets for the extraction of the PCs.
Both experimental and numerical data were used in the present study. The
data to be analyzed with PCA have been organized in (n x p) matrices, X,
whose rows represent instantaneous spatial snapshots of the reacting species
concentrations and temperature.
4.4.1
Experimental data
High fidelity experimental data provided under the framework of the Workshop
on Measurement and Computation of Turbulent Non-premixed Flames (TNF
workshop) [80] have been used to assess the PCA methodology.
The first flame investigated in the present study is a turbulent non-premixed
CO/H2 /N2 (0.4/0.3/0.3 by vol.) jet flame [112], hereafter called simply jet
flame, selected as base case for the analysis due to its favorable properties. In
84
4.4. Data sets for model validation

fact, the flame does not experience any liftoff or localized extinction and retains
the simple flow geometry of the hydrogen jet flames [80], while adding a modest
level of chemical kinetic complexity. Moreover, the flame is fully characterized
in terms of scalar data. Simultaneous Raman/Rayleigh/LIF measurements
of temperature and species concentrations were conducted at Sandia National
Laboratories, California. About 800 to 1000 measurements were taken at different spatial locations, for a total of 66.275 data points of nine different state
variables (T, N2 , O2 , H2 O, H2 , CO, CO2 , OH, NO). The mixture fraction
calculated following Bilger [75] is also available in the experimental results.
The stoichiometric value of the mixture fraction for this flame is 0.295. The
experimental uncertainties can be obtained from [112].
The second flame is a CH4 flame, part of a series of four piloted jet flames,
C, D, E and F, investigated by Barlow and Frank [99]. Starting from Flame
D, the velocity, and then the Reynolds number, associated to the main jet
is raised, thus increasing the probability of local extinction phenomena. The
flames of interest for the present study are Flame D and, in particular, Flame F.
Flame F shows severe non-equilibrium effects and it is close to global extinction
in the downstream part of the flame; therefore, it represents a challenging
system to judge PCA capabilities in terms of chemical manifolds identification
and parametrization. Likewise the jet flame, Raman/LIF measurements of
temperature and species concentrations (T, N2 , O2 , H2 O, H2 , CH4 , CO, CO2 ,
OH, NO), are provided at different spatial locations for a total of 62.766 data
points. The mixture fraction is defined according to Bilger [75], except that
only the elemental mass fractions of hydrogen and carbon are included. The
mixture fraction for this flame is 0.351. The experimental uncertainties can be
obtained from [99, 113].
The third system investigated is the jet in hot co-flow (JHC) burner [46,
49, 50], hereafter denoted with JHC, designed to emulate the flameless combustion regime (Section 1). It consists of a central fuel jet (80% CH4 and 20%
H2 ) within an annular co-flow of hot exhaust products from a secondary burner
mounted upstream of the jet exit plane. The O2 level of the co-flow is controlled
at three different levels, i.e. 3, 6 and 9% (by vol.), while the temperature and
exit velocity are kept constant. Similarly to the other flames, around 56.000
observations are provided for temperature and species concentrations (T, N2 ,
O2 , H2 O, H2 , CH4 , CO, CO2 , OH, NO). The mixture fraction is defined according to Bilger [75], The experimental uncertainties can be obtained from [46].
The availability of experimental data for the JHC system has been particularly
important for the present Thesis, as it has allowed to give insights for the CFD
analysis of the combustion systems investigated in Chapters 5-7. In particular,
information regarding turbulence/chemistry interactions in flameless combustion regime have been extracted from the PCA analysis.
85

4.4.2
Numerical data
In conjunction with high fidelity experimental data, numerical results from the
Direct Numerical Simulation (DNS) of CO/H2 oxidation with detailed chemistry [114] have also been considered. Details about the DNS simulations and
code can be found in Sutherland et al. [88]. Two DNS data sets have been considered: a spatially evolving and a temporally evolving jet, characterized by a
significant degree of extinction. For the first data set, indicated as DNS1, three
temporal slices, each consisting of approximately 1.500.000 scalar observations
(T, H2 , O2 , O, OH, H2 O, H, HO2 , H2 O2 , CO, CO2 , HCO), are available; the
second data set, DNS2, consists of twelve temporal slices, each one comprising
around 700.000 observations of the same variables.
The advantage of DNS data with respect to experimental data lies in the
large amount of data accessible. Moreover, DNS simulations give access to
many additional variables, beside scalar values, which are not provided by any
experimental campaign. In particular, the scalar source terms can be extracted
from DNS simulations, thus allowing to judge the capabilities of the extracted
PCs to parametrize not only the original variables, but also their source terms.
Of course, in the perspective of adopting PCA as a predictive model, the generation of data with DNS for PCs extraction does not represent a viable solution
and other approaches, such as One Dimensional Turbulence (ODT) [115], could
be pursued.
4.5
Results
This section describes the results of the PCA methodology applied to the experimental and numerical data sets described in Section 4.4 are here presented.
First, the capabilities of PCA for the identification of low-dimensional manifolds in turbulent reacting systems is investigated. In particular, the effect of
the preprocessing strategies and modeling approaches (i.e. GPCA vs. LPCA)
on the manifold dimensionality is thoroughly discussed, trying to provide also
a physical interpretation for the extracted PCs.
Then, the feasibility of a PCA based combustion model is discussed. The
PCA model is validated a priori using the DNS data sets and its performances
are compared to those of an ideal flamelet parametrization (Chapter 2).
4.5.1
PCA for the identification of low-dimensional manifolds
The objective of the present Section is to provide a methodology i) to investigate the existence of low-dimensional manifolds in turbulent flames, ii)
to find the most compact representation for them and iii) to guide the selection of optimal reaction variables able to accurately reproduce the state
space of a reacting system. PCA has been previously applied to combustion.
Frouzakis et al. [116] applied PCA for data reduction of two-dimensional DNS
86
4.5. Results
data of opposed jet flames. The analysis was aimed at identifying the number of components required to accurately approximate the original data. To
this purpose, the correlations among velocities, pressure and species concentrations at different times were taken into account, thus leading to eigenvectors
which are linear combination of the temporal snapshots considered. Similarly,
Danby and Echekki [117] implemented PCA for the analysis of an unsteady
two-dimensional direct numerical simulation of auto ignition in homogeneous
hydrogen air mixtures, with the main purpose of determining the requirements
to reproduce passive and reactive scalars during the process of auto ignition.
The approach presented here is quite different from the ones described above.
The main purpose of the developed PCA methodology is to find correlations
among the state variables (temperature and species concentration) to allow
an optimal approximation of the system in a low-dimensional space. Such an
approach leads to the determination of eigenvectors which are linear combinations of the original variables in a way that allows reducing the dimension of the
system. A similar method was proposed by Maas and Thvenin [118] for the
analysis of DNS data. However, they only considered a very small sampling in
state space. The current study provides significantly more depth in its analysis,
and applies PCA to both experimental and numerical data sets.
4.5.1.1
GPCA of experimental data sets
Figure 4.7 shows the magnitude of the eigenvalues associated with the PCA
reduction of the jet flame data set, together with the contribution of the q
largest eigenvalues to the amount of variance explained by the new basis vectors.
The eigenvalue distribution reflects the covariance structure of the data set,
shown in Table 4.1, and obtained by applying the auto scaling criterion. It is
clear that the first two eigenvalues alone account for more than the 92% of the
total variance in the data. On the other hand, the last four smallest eigenvalues
are very close to zero; therefore, they contain no useful information and only
explain linear dependencies among the original variables. Therefore, a strong
size reduction, from 9 to 2 or 3, can be accomplished by using PCA, through
the identification of the most active directions in the original data. The total,
tq , and individual variance, tq,j , accounted for the jet flame by the first two
or three eigenvalues are listed in the first two columns of Table 4.2. It can be
observed that, by choosing q = 2, it is possible to capture more than 90% of the
individual variances of all the main species and temperature, while the minor
species, OH and NO, require an additional component, q = 3, to reach levels
of approximation comparable to the other state variables.
This is confirmed by the analysis of the parity plots of temperature and
species mass fractions given by the PCA reconstruction for the cases q = 2
(Figure 4.8) and q = 3 (Figures 4.9). It can be observed that the addition of a
component has a small effect on temperature and main species, whose variation
is mainly explained by the first two components (Table 4.2). Moreover, the par87

Figure 4.7: Scree-graph and histograms of the q largest eigenvalues for the jet
flame data set, preprocessed with auto scaling.
ity plots of temperature (Figures 4.8 and 4.9 (a)), H2 O mass fraction (Figures
4.8 and 4.9 (d)) and minor species such as OH and NO (Figures 4.8 and 4.9
(e, f)) point out the existence of non linear deviations in the recovered data,
which can be probably ascribed to non linear dependencies among the original
variables. This result suggests that the low-dimensional projection of the thermochemical state shows significant non linearities which cannot be taken into
account with a global linear approach. Therefore, specific algorithms performing PCA in locally linear regions of the data (Section 4.3) could be taken into
account, to improve the accuracy of the parametrization.
Figure 4.10 shows the eigenvalue size distribution and the contribution of
the q largest eigenvalues to the total explained variance, tq , for Flame D, F and
JHC data sets. The covariance matrices for the data sets are shown in Table
4.3-4.5. Similarly to the jet flame, a significant size reduction can be achieved
for D and F flames, although an additional component is required, q = 3 or
q = 4, due to the higher complexity of the piloted flames (Section 4.4.1). On
the other hand, the JHC data set shows a higher dimensionality and at least 4
components are needed to explain as much as 90% of the total variance in the
original data. The number of required PCs, q, increases to 5 if an individual
variance, tq,i , above 90% is desired for all the variables, as indicated in Table
4.6. Such result is particularly interesting for the present Thesis, as it confirms
the complexity in the numerical modeling of the flameless combustion regime
[45, 78, 79], caused by the overlap between chemical and mixing scale and,
thus, by the need of optimal progress variables for the description of complex
interactions which take place in such regime.
Table 4.6 lists the values of tq and tq,j accounted for Flame D, F and for
JHC. It is interesting to observe the very strong similarities between Flame
D and F, confirmed by the analysis of their covariance structure (Tables 4.3
88
T
YO2
YN2
YH2
YH2 O
YCO
YCO2
YOH
YN O
YN 2
YH2 O
T
Y O2
YH2
YCO
YCO2
1.000 0.825 0.512 0.005
0.938
0.117
0.984
1.000
0.887 0.541 0.909 0.646 0.767
1.000 0.835 0.667 0.902 0.438
1.000
0.196
0.973 0.082
1.000
0.329
0.892
1.000
0.024
1.000
YOH
0.771
0.562
0.266
0.168
0.725
0.081
0.793
1.000
YN O
0.815
0.558
0.256
0.170
0.678
0.113
0.855
0.639
1.000
Table 4.1: Covariance matrix for the jet flame data set. Scaling criterion adopted: auto scaling.
4.5. Results
89

tq,i (%)
auto
q=2
0.971
0.986
0.986
0.968
0.930
0.994
0.973
0.738
0.772
0.924
q=3
0.973
0.986
0.986
0.969
0.936
0.994
0.977
0.940
0.930
0.966
range
q=2
0.983
0.994
0.981
0.962
0.945
0.995
0.979
0.731
0.728
0.946
q=3
0.991
0.994
0.981
0.963
0.945
0.997
0.987
0.991
0.795
0.975
max
q=2
0.979
0.997
0.971
0.957
0.944
0.990
0.977
0.745
0.729
0.942
q=3
0.990
0.997
0.971
0.960
0.944
0.994
0.988
0.992
0.802
0.975
vast
q=2
0.992
0.975
1.000
0.945
0.940
0.979
0.981
0.660
0.744
0.992
q=3
0.992
0.978
1.000
0.947
0.978
0.980
0.985
0.687
0.970
0.996
level
q=2
0.896
0.942
0.965
0.991
0.870
0.987
0.908
0.870
0.701
0.949
q=3
0.943
0.961
0.970
0.991
0.884
0.987
0.959
0.993
0.926
0.980
Table 4.2: Total, tq , and individual variance, tq,j , accounted for the jet flame data set, as a function of the number of retained
PCs, q, and the preprocessing criterion.
T
YO2
YN2
YH2
YH2 O
YCO
YCO2
YOH
YN O
tq (%)
90
4.5. Results
(a)
(b)
(c)
(d)
(e)
(f)
Figure 4.8: Parity plots of temperature (a), H2 O (b), H2 (c), CO (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 2) reduction of the jet
flame data set. Scaling criterion adopted: auto scaling.
91

(a)
(b)
(c)
(d)
(e)
(f)
and NO (f) mass fractions illustrating the GPCA (q = 3) reduction of the jet
flame data set. Scaling criterion adopted: auto scaling.
92
4.5. Results
and 4.4), thus indicating that the relations between the state variables are not
strongly affected by the increase in Reynolds number from one flame to the
other.
A closer look at the covariance matrices structure indicate that, with the
exception of the JHC data set, there is always a strong correlation between
temperature, oxidation products (CO2 , H2 O), OH and NO (Table 4.1, Tables
4.3-4.4), as it is expected for a turbulent non premixed flame. The covariance
matrix for the JHC data set still shows a strong correlation between temperature and products mass fractions; however, the covariance between temperature and the minor species, i.e. OH and NO, is lower. Once again, this indicate
the existence of a more complex flame structure, arising from a balance between
turbulent mixing and chemical kinetics.
Figure 4.11 and Figure 4.12 show the GPCA reconstruction of Flame F,
with q = 3 and q = 4, respectively. Similarly to the jet flame, the addition
of a PC barely affects the accuracy in the prediction of the major species, as
it mainly acts on the prediction of the minor species, i.e. OH and NO (Table
4.6). Very similar results are observed for Flame D.
With regard to the JHC system, very large (non linear) deviations are observed for temperature (Figure 4.13 (a)), CO (Figure 4.13 (c)) and OH (Figure
4.13 (e)), for the case q = 4. The increase of the number of PCs to q = 5
strongly improves the prediction of CO (Figure 4.14 (c)) and OH (Figure 4.14
(e)), but not temperature (Figure 4.13 (a)) and other species, i.e. CO2 . It is
noteworthy that NO is very well captured, even with q = 4. This results suggests that one of the retained PCs is highly correlated with NO, thus leading
to the observed result.
PCs interpretation and rotation It is interesting to provide an interpretation of the results described above by looking at the structure of the eigenvectors matrices for the different experimental data sets. Tables 4.7-4.10 report the
weights of the original variables on the retained principal components, before
(a) and after applying VARIMAX rotation, for the jet flame, Flame D, Flame F
and JHC, respectively. As it was pointed out in Section 4.2.4, the PCs weights
are determined to maximize variance and not physical interpretability. However, PCs rotation can help overcome such difficulty, through the determination
of a simpler structure for the eigenvectors.
The analysis of the rotated eigenvectors matrices shows a common pattern
for the different systems, again with the exception of the JHC data set. It
can be observed how the first (rotated) PC is always an ensemble component,
consisting of temperature, oxidizer, product species and NO. This component
has the effect of capturing as much as possible of the original data variance in
the data, trying to explain the (non linear) relations among the state variables
with a single parameter. The other PCs differs from one data set to the other.
For the jet flame, the second PC consists of reactants (CO, H2 , Air)., while
the third is basically OH, which is determinant for capturing the reaction zone
93

(a)
(b)
(c)
Figure 4.10: Scree-graph and histograms of the q largest eigenvalues for Flame
D (a), Flame F (b) and JHC (c). Scaling criterion adopted: auto scaling.
94
T
YO2
YN2
YH2
YH2 O
YCH4
YCO
YCO2
YOH
YN O
T
Y O2
YN2
YH2
YH2 O
YCH4
YCO
YCO2
YOH
YN O
1.000 0.960 0.134 0.418
0.979 0.295 0.535
0.984
0.681
0.912
1.000
0.323 0.589 0.977 0.093 0.688 0.932 0.645 0.859
1.000 0.473 0.194 0.867 0.451 0.109 0.061 0.052
1.000
0.548
0.194
0.919
0.320
0.056
0.240
1.000 0.257 0.658

0.949
0.666
0.883
1.000
0.102 0.312 0.221 0.329
1.000
0.442
0.213
0.372
1.000
0.708
0.933
1.000
0.688
1.000
Table 4.3: Covariance matrix for Flame D data set. Scaling criterion adopted: auto scaling.
4.5. Results
95

Table 4.4: Covariance matrix for Flame F data set. Scaling criterion adopted: auto scaling.
YOH
YN O
YCO
YCO2
YH2 O
YCH4
YN 2
YH2
T
YO2
T
1.000 0.968 0.073 0.418
0.984 0.312 0.543
0.981
0.745
0.824
YO2
1.000
0.241 0.545 0.976 0.128 0.660 0.940 0.748 0.790
YN2
1.000 0.378 0.109 0.882 0.349 0.053 0.057 0.026
1.000
0.512
0.124
0.926
0.305
0.189
0.229
YH2
YH2 O
1.000 0.296 0.636
0.956
0.754
0.816
1.000
0.041 0.327 0.232 0.297
YCH4
YCO
1.000
0.432
0.262
0.331
YCO2
1.000
0.767
0.851
YOH
1.000
0.633
YN O
1.000
96
T
YO2
YN2
YH2
YH2 O
YCH4
YCO
YCO2
YOH
YN O
T
YO2
YN 2
YH2
YH2 O
YCH4
YCO
YCO2
YOH
YN O
1.000 0.476 0.616 0.534 0.892 0.534 0.292
0.913
0.427
0.388
1.000 0.306 0.420 0.619 0.418 0.378 0.614 0.150 0.126
1.000 0.990 0.483 0.991 0.139

0.465
0.216
0.253
1.000 0.392 0.998 0.085 0.376 0.195 0.240
1.000 0.398 0.516

0.927
0.340
0.389
1.000 0.089 0.381 0.196 0.241
1.000
0.266 0.072 0.123
1.000
0.362
0.376
1.000
0.214
1.000
Table 4.5: Covariance matrix for JHC data set. Scaling criterion adopted: auto scaling.
4.5. Results
97

(a)
(b)
(c)
(d)
(e)
(f)
Figure 4.11: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 3) reduction of Flame
F. Scaling criterion adopted: auto scaling.
98
4.5. Results
(a)
(b)
(c)
(d)
(e)
(f)
and NO (f) mass fractions illustrating the GPCA (q = 4) reduction of Flame
F. Scaling criterion adopted: auto scaling.
99

(a)
(b)
(c)
(d)
(e)
(f)
and NO (f) mass fractions illustrating the GPCA (q = 4) reduction of JHC
data set. Scaling criterion adopted: auto scaling.
100
4.5. Results
(a)
(b)
(c)
(d)
(e)
(f)
and NO (f) mass fractions illustrating the GPCA (q = 5) reduction of JHC
data set. Scaling criterion adopted: auto scaling.
101

Table 4.6: Total, tq , and individual variance, tq,j , accounted for Flame D, F
and JHC data sets by the GPCA reduction, as a function of the number of
retained PCs, q.
T
YO2
YN2
YH2
YH2 O
YCH4
YCO
YCO2
YOH
YN O
tq (%)
tq,i (%)
Flame D
q=3
0.971
0.982
0.979
0.959
0.987
0.984
0.940
0.965
0.743
0.902
0.941
q=4
0.985
0.986
0.981
0.966
0.988
0.986
0.965
0.985
1.000
0.932
0.977
Flame F
q=3
0.967
0.978
0.979
0.969
0.983
0.984
0.961
0.969
0.711
0.792
0.946
q=4
0.971
0.979
0.980
0.970
0.984
0.984
0.969
0.974
0.978
0.892
0.968
JHC
q=4
0.932
0.961
0.991
0.998
0.966
0.999
0.757
0.911
0.735
0.999
0.925
q=5
0.948
0.974
0.991
0.998
0.966
0.999
0.998
0.970
1.000
1.000
0.984
correctly.
Moving on to Flame D and F, the second PC observed for the jet flame is
somehow split into two components, one representative of a mixture fraction
(both N2 and CH4 are very highly correlated to mixture fraction) and one representative of the intermediate product species (CO, H2 ). The last component
is again OH, the flame marker. The eigenvectors structures of Flame D and
F are very similar. There is only a significant difference which could be highlighted, namely the NO weights on the fourth component. For Flame D, NO
does not appear as a relevant weight on the last PC, whereas it is no negligible
for the fourth component of Flame F, thus reflecting the lower correlations beTable 4.7: Retained (a) and rotated (b) eigenvectors for the jet flame data set.
T
YO2
YN2
YH2
(a)
YH2 O
YCO
YCO2
YOH
YN O
a1
0.40
-0.41
-0.33
0.14
0.41
0.18
0.39
0.31
0.31
a2
0.18
0.15
0.38
-0.55
0.05
-0.53
0.24
0.27
0.29
a3
-0.07
T
0.00
YO2
0.01
YN2
-0.05
YH2
(b)
0.12
YH2 O
0.02
YCO
-0.11
YCO2
0.74
YOH
-0.65
YN O
102
a1,r
0.44
-0.30
-0.13
-0.10
0.36
-0.06
0.46
0.22
0.54
a2,r
0.00
0.31
0.48
-0.56
-0.13
-0.56
0.05
0.11
0.13
a3,r
0.03
-0.07
-0.02
-0.07
0.20
0.01
0.00
0.81
-0.54
4.5. Results
Table 4.8: Retained (a) and rotated (b) eigenvectors for Flame D data set.
T
YO2
YN2
YH2
(a) YH2 O
YCH4
YCO
YCO2
YOH
YN O
a1
0.40
-0.40
-0.06
0.23
0.41
-0.10
0.28
0.39
0.32
0.34
a2
-0.08
-0.06
-0.59
0.39
-0.03
0.57
0.33
-0.13
-0.11
-0.15
a3
0.07
-0.07
-0.38
-0.55
0.00
0.41
-0.49
0.18
0.25
0.22
a3
0.11
T
-0.04
YO2
-0.05
YN2
0.04
YH2
0.05 (b) YH2 O
0.01
YCH4
-0.14
YCO
YCO2
0.11
-0.83
YOH
0.51
YN O
a1,r
0.46
-0.39
-0.07
0.00
0.38
-0.07
0.01
0.48
0.00
0.50
a2,r
0.02
0.08
0.69
-0.01
0.04
-0.72
0.02
0.00
0.00
0.01
a3,r
0.00
0.13
0.05
-0.69
-0.15
0.06
-0.67
0.09
0.01
0.16
a3
0.02
0.03
0.02
0.08
-0.05
0.02
-0.07
0.01
-0.99
0.04
Table 4.9: Retained (a) and rotated (b) eigenvectors for Flame F data set.
T
YO2
YN2
YH2
(a) YH2 O
YCH4
YCO
YCO2
YOH
YN O
a1
0.40
-0.40
-0.10
0.23
0.41
-0.08
0.28
0.39
0.29
0.37
a2
0.10
0.06
0.56
-0.40
0.03
-0.55
-0.34
0.14
0.19
0.18
a3
0.04
-0.03
-0.39
-0.49
-0.05
0.45
-0.44
0.13
0.40
0.16
a3
0.20
T
-0.11
YO2
YN2
-0.08
YH2
-0.14
0.07 (b) YH2 O
YCH4
0.08
-0.26
YCO
0.24
YCO2
-0.84
YOH
0.29
YN O
a1,r
0.43
-0.39
-0.07
0.03
0.39
-0.08
0.04
0.45
0.13
0.53
a2,r
-0.03
-0.08
-0.70
0.01
-0.04
0.70
-0.01
-0.01
0.00
0.02
a3,r
-0.02
0.11
0.04
-0.70
-0.11
0.04
-0.66
0.09
0.06
0.19
a3
-0.01
0.07
0.00
0.12
-0.05
0.00
-0.09
-0.03
-0.92
0.35
tween the two variables (Table 4.4), probably determined by the higher physical
complexity of the system.
Finally, regarding the eigenvectors of the JHC system, it can be observed
how the first rotated component does not show a large influence of NO, differently from all the other systems. This can be explained by taking into account
that the first PC tries to explain as much as possible of the data variability.
It is well known [3, 4, 2, 45] that NO formation in flameless combustion is
more homogeneous than in traditional non premixed combustion, due to the
smoother temperature gradients; therefore, NO is characterized by less variability and disappears from the first PC. The second and third PCs are, again,
representative of reactant and intermediate combustion products (Table 4.5),
reflecting a similar pattern to that observed for Flame F (and D). Differently
from the piloted flames, the fourth component is exclusively NO, thus meaning
that none of the previous components can take into account NO formation and
a specific PC is needed. Then, the OH component, present in all the other
103

Table 4.10: Retained (a) and rotated (b) eigenvectors for JHC data set.
T
YO2
YN2
YH2
(a) YH2 O
YCH4
YCO
YCO2
YOH
YN O
a1
0.42
-0.11
0.38
-0.35
0.40
-0.35
0.16
0.39
0.19
0.22
a2
-0.16
0.59
0.34
-0.39
-0.27
-0.39
-0.23
-0.26
-0.07
-0.07
a3
0.08
0.02
-0.10
0.08
-0.10
0.08
-0.67
0.08
0.68
0.21
a4
0.12
-0.11
0.07
-0.04
0.06
-0.04
-0.10
0.11
0.20
-0.95
a5
0.16
-0.15
0.03
0.00
0.03
0.00
-0.64
0.31
-0.67
0.00
T
YO2
YN2
YH2
(b) YH2 O
YCH4
YCO
YCO2
YOH
YN O
a1,r
0.47
-0.49
0.09
-0.02
0.46
-0.02
-0.02
0.56
0.01
0.01
a2,r
0.14
0.37
0.51
-0.53
0.06
-0.53
0.02
0.04
-0.02
-0.01
a3,r
0.06
0.08
-0.02
0.00
-0.18
0.01
-0.97
0.14
0.00
0.00
a4,r
0.00
-0.06
0.02
0.00
-0.03
0.00
0.00
-0.01
0.00
-1.00
a5,r
-0.06
-0.04
0.01
0.00
-0.01
0.00
0.00
0.05
-1.00
0.00
system, is also found here as last PC.

Principal variables As it was pointed out in the previous Section, the original variables do not contribute equally in the determination of the PCs and
rotation can improve eigenvectors interpretability, transforming them to meet
a simpler structure. Another way to help interpretation is to extract the so
called Principal Variables (PVs), described in Section 4.2.3.5.
Table 4.11 lists the PVs determined using the methods outlined in Section
4.2.3.5 for the jet flame. At first glance, the results may appear to vary widely,
going from one method to the other. However, a more careful analysis shows
the existence of many similarities. In particular, methods B4, B2, M2, MC2
and MC3 lead to very similar results, identifying a major variable (T, CO2
or H2 O), a fuel species (CO or H2 ) and OH as PVs. In fact, T, CO2 of H2 O
are highly correlated (Table 4.1), cov (T, CO2 ) = 0.984, cov (T, H2 O) = 0.938
and cov (H2 O, CO2 ) = 0.892. Similarly, H2 and CO are also exchangeable
variables, being cov (H2 O, CO2 ) = 0.973. Method MC1 replaces T (or CO2
or H2 O) with NO, which shows a strong correlation with T, although weaker
104
4.5. Results
Table 4.11: Principal variables for the jet flame data set, as provided by the
different methods described in Section 4.2.3.5.
Method
B4
B2
M3
MC1
MC2
MC3
PF
Principal Variables
H2 O, H2 , OH
CO2 , H2 , OH
T, CO, OH
NO, CO, OH
CO2 , CO, OH
CO2 , CO, OH
T, O2 , H2
Table 4.12: Principal variables for Flame D, F and the JHC data set. PV
method: MC2 (Section 4.2.3.5).
Data set
Flame D
Flame F
JHC
Principal Variables
CH4 , CO, CO2 , OH
CH4 , CO, CO2 , OH
CH4 , CO, CO2 , NO, OH
than that CO2 and H2 O. Finally, the PF method provides a different solution,
neglecting OH as PV and replacing it with O2 . However, this solution was
considered unreliable, being very far from the pattern identified by all the other
methods.
On the basis of the results obtained for the jet flame case, it was chosen
to adopt the MC2 method for the extraction of the PVs, as it provides results
comparable to most of the other models and satisfies a very appealing property of PCA, the minimization of the reconstruction error. Applying the MC2
methods to the other data sets, we get the results in Table 4.12. It is very
interesting to observe that the same considerations derived from the analysis of
the rotated PCs can be done here, with a clearer physical interpretation. The
PVs selected for Flame D and F reflect the patters of the PCs, as they include
a mixture fraction variable, an intermediate and a product species and OH.
Finally, for the JHC system, the same set of PVs obtained for Flame D (and F)
is recovered, although augmented with NO, thus confirming the need to take
explicitly into account the formation of such pollutant species.
Effect of preprocessing strategies on the manifold dimensionality In
this Paragraph, the effect of preprocessing strategies on the PCA reduction is
presented, focusing on the jet flame data set. The performances of auto scaling
have been compared to those of other scaling criteria, presented in Section 4.2.2.
Figure 4.15 shows the eigenvalue size distribution and the contribution of the
q largest eigenvalues to the total variance explained, when applying scaling
105

criterion different than auto scaling, namely range (a), vast (b), level (c) and
max (d) scaling. If we compare Figure 4.15 to Figure 4.7, it is clear that all
the methods identify the manifold dimensionality to be equal to three, with
the exception of vast scaling (Figure 4.15 (b)), which identifies only two PCs,
with a very dominant first PC. Columns 3-10 in Table 4.2 show the values of
tq and tq,j obtained by applying range, max, vast and level scaling to the jet
flame data set. Results confirm that auto scaling is the only criterion able to
provide a uniform reconstruction of the state variables, leading to comparable
values of tq,j for all of them. Range and max scaling (columns 3-6, Table
4.2), very similar as expected, perform slightly better than auto scaling for
most of the main species and temperature. However, they proves to be unable
to properly capture NO variation, even with q = 3. Similarly, vast scaling
(columns 7-8, Table 4.2) concentrates on extremely stable variables, i.e. N2 ,
but completely fails in recovering OH properly. Then, the higher values of tq
given by range, max and vast scaling, compared to auto scaling, are due to the
higher variance explained for the major variables; however, these approaches
miss very important features, such as the parametrization of NO and OH. The
variance explained by auto scaling for OH and NO is up to 16% and 25% higher,
respectively, than that explained by the other scaling methods.
On the opposite, level scaling (columns 9-10, Table 4.2) focuses on variables
characterized by large relative changes and leads to an overestimation of the
role of minor species in the PCA reduction. Therefore, the prediction of minor
species such as OH and NO is very accurate, but major species such as H2 O
are badly recovered.
On the basis of the described sensitivity, it was chosen to adopt the auto
scaling as default preprocessing criterion for the analysis. Obviously, in applications which do not require the accurate parametrization of minor species,
other options could provide better results than to auto scaling.
4.5.1.2
LPCA of experimental and numerical data sets
The GPCA analysis presented in Section 4.5.1.1 has shown the existence of severe non linearities in the parity plots of observed and predicted state variables.
Therefore, the determination of the manifold dimensionality with GPCA can
be somehow biased, as a globally linear approach is adopted to model complex
non linear interactions. In this context, LPCA (Section 4.3) can provide locally
linear models, able to follow the non linear development of the thermochemical
manifold in low-dimensional space.
Table 4.13 lists the values of the error metric, GSRE,n (Section 4.3), given
by GPCA, VQPCA and FPCA for the jet flame, Flame F and the JHC data
set, as a function of the number of clusters, k, and retained PCs, q.
It is interesting to observe (Table 4.13) that, when the reconstruction error is evaluated on a scaled basis, all the state variables become relevant and
the goodness of reconstruction can be properly judged. It should be recalled
106
4.5. Results
(a)
(b)
(c)
(d)
Figure 4.15: Scree-graph and histograms of the q largest eigenvalues for the
jet flame data set, preprocessed with range (a), vast (b), level (c) and max (d)
scaling.
Table 4.13: Values of GSRE,n associated with the GPCA, VQPCA and FPCA
reconstructions of the jet flame, flame F and JHC data set, as a function of the
number of clusters, k, and retained PCs, q.
k
GPCA
VQPCA
FPCA
1
2
4
6
8
2
4
6
8
Jet flame
q=2
0.681
0.208
0.112
0.091
0.079
0.214
0.121
0.103
0.092
q=3
0.309
0.106
0.056
0.046
0.034
0.084
0.066
0.051
0.045
Flame F
q=3
0.707
0.205
0.131
0.095
0.090
0.263
0.158
0.134
0.122
107
q=4
0.320
0.119
0.076
0.052
0.040
0.147
0.087
0.067
0.063
JHC
q=3
1.552
0.214
0.099
0.078
0.058
0.410
0.123
0.112
0.093
q=4
0.752
0.093
0.050
0.037
0.028
0.105
0.059
0.052
0.044

here that (Section 4.3) GSRE,n is a mean global scaled reconstruction error,
normalized by the mean variance present in the original data. So, for example, the value of GSRE,n associated with the GPCA reduction of the jet flame
with q = 2 is fairly large, GSRE,n = 0.68, thus reflecting the large deviations
observed in Figure 4.8 for some state variables. Even when q in increased to
three, a significant error, GSRE,n = 0.309 is obtained, confirming the persistence of mainly non-linear departures from the original data. If lower values
of GSRE,n are desired, i.e. < 0.1, the value of q should be increased to 4 or 5
(GSRE,n = 0.098 and GSRE,n = 0.046, respectively). However, the manifold
dimensionality obtained could not be regarded as a true manifold dimensionality. It can be argued that fake components need to be added to account for
non linear interactions because of the global linear nature of the adopted model.
However, if a locally linear model is employed, i.e. VQPCA, much higher performances in terms of GSRE,n are obtained, even for smaller values of q, i.e.
q = 2 or q = 3. Figure 4.16 shows the VQPCA reconstruction of temperature
(a), H2 O (b), CO (c), H2 (d), OH (e) and NO (f), with k = 8 and q = 2. A
much better agreement between original and reconstructed data is observed, as
it is confirmed by the value of 0.08 obtained for GSRE,n . A similar value of
GSRE,n would require q = 5 if GPCA is applied. Moreover, the parity plots for
the state variables in Figure 4.16 show how, after partitioning, the relationships
between the original and reconstructed data are mainly linear.
The values of GSRE,n for Flame F, as provided by the different approaches,
are also shown Table 4.13. Similarly to the jet flame, the reconstruction error,
GSRE,n , associated with the GPCA reductions (Figure 4.11 and Figure 4.12)
obtained by choosing q = 3 and q = 4 are very high, GSRE,n = 0.71 and
GSRE,n = 0.32, respectively, thus confirming the inability of GPCA to determine the most compact description of the data in a lower dimensional manifold.
Figure 4.17 shows the VQPCA reconstruction of temperature (a), H2 O (b), CO
(c), H2 (d), OH (e) and NO (f), with k = 8 and q = 3. The value of GSRE,n
obtained is 0.09, almost eight and four times smaller than the values given by
GPCA with q = 3 and q = 4, respectively. Also, GPCA would require 6 PCs
to give a value of GSRE,n as small as 0.09.
Finally, the same considerations hold for the JHC data set. In fact, the
effect of data partitioning is even more evident for this case with respect to the
jet flame and Flame F. Table 4.13 clearly points out that GSRE,n dramatically
decreases when VQPCA with k = 2 is employed, going from 1.552 and 0.752
to 0.241 and 0.093, for q = 3 and q = 4, respectively. This corresponds to a
reduction of around 85% for both cases, which is not observed for any of the
other investigated data sets. Such result is extremely interesting and suggests
the existence of different flame structures within the JHC system. Figure 4.18
shows the VQPCA reconstruction of temperature (a), H2 O (b), CO (c), H2
(d), OH (e) and NO (f), with k = 6 and q = 3. With respect to the jet flame
and Flame F, it is possible to reach values of GSRE,n below 0.1 with a smaller
number, i.e. k = 6, of clusters.
108
4.5. Results
(a)
(b)
(c)
(d)
(e)
(f)
and NO (f) mass fractions illustrating the VQPCA (q = 2, k = 8) reduction of
the jet flame data set. GSRE,n = 0.08.
109

(a)
(b)
(c)
(d)
(e)
(f)
Flame F data set. GSRE,n = 0.08.
110
4.5. Results
(a)
(b)
(c)
(d)
(e)
(f)
JHC data set. GSRE,n = 0.08.
111

Table 4.14: Total, tq , and individual variance, tq,j , accounted for by the retained PCs for the DNS1 and DNS2 data sets as a function of the number of
components, q.
k
GPCA
VQPCA
FPCA
1
2
4
6
8
10
2
4
6
8
10
DNS1
q=2
3.130
0.816
0.307
0.116
0.043
0.036
0.625
0.243
0.122
0.062
0.046
q=3
1.830
0.176
0.065
0.025
0.010
0.009
0.216
0.052
0.030
0.020
0.015
DNS2
q=3
1.800
0.734
0.235
0.141
0.114
0.096
0.773
0.263
0.204
0.167
0.140
q=4
1.130
0.369
0.076
0.046
0.038
0.033
0.417
0.081
0.066
0.054
0.043
VQPCA has been exploited also for the analysis of the DNS data sets, DNS1
and DNS2, described in Section 4.4.2. Regarding the DNS2 data set, multiple
time steps have been merged before analyzing the data, namely t = 1.5e 03 s,
t = 2.0e 03 s, t = 2.5e 03 s and t = 3.0e 03 s. However, the resulting data
set (3.800.000 data points) have been conditioned in mixture fraction space,
between f = 0.1 and f = 0.8, to overcome memory issues (Figure 4.19).
Table 4.14 lists the values of GSRE,n given by GPCA, VQPCA and FPCA
for the DNS data sets. Similarly to the JHC case, the first partition is characterized by a dramatic reduction of GSRE,n also for the DNS data sets. This
indicates, once again, that a global approach would lead to misleading estimation of the manifold dimensionality. Table 4.14 also shows that DNS2 requires
an additional PC with respect to DNS1 to reach acceptable levels of accuracy.
This is determined by the higher complexity of the DNS2 data set, characterized
by a significant degree of extinction.
Figure 4.20 shows the contour plots of original and recovered temperature
and OH mass fraction distribution for the DNS1 data set. It can be observed
how a VQPCA approach with q = 3 and k = 8 allows to capture with great
accuracy the flame features, resulting in a very small reconstruction error,
GSRE,n = 0.01. This is a very appealing result, indicating that VQPCA could
be effectively exploited for the compression of DNS data sets, characterized by
very large storage requirements, for visualization and post-processing purposes.
Very strong compressions could be achieved, as shown here, prescribing the desired accuracy of the recovered data. For a given manifold dimensionality, the
dimensions of the reduced data sets are independent of the number of clusters;
therefore, the parameters q and k can be varied to optimize the accuracy and
112
4.5. Results
(a)
(b)
Figure 4.19: Original (a) and conditioned (b) temperature field for DNS2 data
set at time step t = 1.5e 03 s.
113

the disk space required for any data set.
The quality of the VQPCA reconstruction can be also qualitatively assessed
by means of contour plots of recovered and observed state variables (Figure
4.21). As it was pointed out for the experimental data, after the VQPCA
reduction, the parity plots show a linear behavior throughout the domain of the
state variable, confirming the capability of the VQPCA to follow the curvature
of the low-dimensional manifold in the reduced state space.
Figure 4.22 shows the contour plots of (conditioned) original and recovered
temperature distribution at time steps t = 1.5e03 s (a, a) and t = 2.0e03 s
(b). As for Flame F (Figure 4.17), VQPCA is able to capture flame extinction and re-ignition remarkably well, resulting in the quantitative agreement
between observed and recovered variables shown in Figure 4.23.
Comparison of VQPCA and FPCA Rows 6-9 in Table 4.13 and rows 7-11
in Table 4.14 list the values of GSRE,n given by FPCA for the experimental
and numerical data sets, respectively, as a function of the number of clusters, k,
and retained PCs, q. These values can be weighed against those obtained with
the VQPCA algorithm (rows 2-5 in Table 4.13 and rows 2-6 in Table 4.14) for
the same values of q and k. The comparison is illustrated graphically in Figure
4.24 for the experimental data, and in Figure 4.25 for the DNS data sets.
For the jet flame (Figure 4.24 (a)), VQPCA performs generally better with
respect to FPCA. The values of GSRE,n given by VQPCA are 6 25% lower
than those provided by FPCA with the exception of the cases corresponding to
k = 2 and q = 3. However, the performances of FPCA can be considered very
satisfying and promising, especially from the point of view of model implementation. In fact, FPCA partitioning is much simpler and straightforward than
the one underlying the VQPCA algorithm. Moreover, it is interesting to investigate how the VQPCA partitioning is reflected in the mixture fraction space.
To this purpose, the indexes of the observations with respect to the original
f were stored and used to reconstruct the mixture fraction vecdata matrix, X,
tors in each cluster identified by VQPCA. Figure 4.26 shows the temperature
as a function of mixture fraction for the two clusters selected by VQPCA, with
q = 2. It can be observed that the data are almost clustered into two regions
corresponding to the rich and lean zones of the flame, being the stoichiometric
mixture fraction of the jet flame flame equal to 0.295. This result suggests that,
for the jet flame, the mixture fraction can be considered an optimal variable for
the parametrization of the thermochemical state of the system, as it is generally assumed in many models for non-premixed combustion. The information
added here is that mixture fraction is an optimal variable from the point of
view of reconstruction error minimization
As far as Flame F flame is concerned, Table 4.13 and Figure 4.24 (b) point
out that VQPCA outperforms FPCA in all cases, providing values of GSRE,n
13-46% lower than those given by FPCA. These results confirm the notorious
complexity of this flame, characterized by significant local extinction and re114
4.5. Results
(a)
(a)
(b)
(b)
Figure 4.20: Contour plots of original and recovered temperature (a, a) and
OH mass fraction (b, b) distribution for DNS1. VQPCA reduction with q = 3
and k = 8. GSRE,n = 0.01.
115

(a)
(b)
Figure 4.21: Parity plots of original and recovered temperature (a, a) and OH
mass fraction (b, b) distribution for DNS1. VQPCA reduction with q = 3 and
k = 8. GSRE,n = 0.04.
ignition. In the context of the Conditional Moment Closure [73], for example,
it has been recognized [119] that conditioning on mixture fraction is not sufficient for Flame F and a second conditioning variable should be used. Figure
4.27 shows the temperature as a function of mixture fraction for the two clusters selected by VQPCA with q = 3. It can be observed that, differently from
the jet flame, the VQPCA algorithm extracts features from the whole mixture
fraction space in order to achieve the best q-dimensional representation of the
thermochemical state of the system. Then, it can be concluded that, for Flame
F, mixture fraction does not represent an optimal reaction variables. Therefore, VQPCA could provide an appealing alternative to guide the selection of
the most compact subset of reaction variables needed to properly describe the
thermochemical state of such reacting system.
Figure 4.24 (c) shows the comparison between the reconstruction provided
by VQPCA and FPCA for the JHC data set. Similarly to Flame F, the VQPCA
algorithm provides GSRE,n , 10-50% lower than those obtained with FPCA.
The mixture fraction partitioning does not optimally follow the curvature of
the manifold in state space, indicating the complexity of turbulence/chemistry
interactions for the system. This is further confirmed by Figure 4.28, which
shows the partition of temperature in the two clusters selected by VQPCA
with q = 3, in mixture fraction space. The algorithm selects a first cluster
characterized by a lean branch and part of a rich region, with an aspect characteristic of a non premixed flame. On the other hand, the second cluster shows
important non equilibrium phenomena, such as extinction, similarly to cluster
2 for Flame F (Figure 4.27 (b)).
To better understand the underlying mechanism of the VQ partitioning
algorithm, it is possible to analyze the structure of the rotated eigenvectors in
116
4.5. Results
(a)
(a)
(b)
(b)
Figure 4.22: Contour plots of original (a, b) and recovered (a, b) temperature
distribution for DNS2, at two different time steps, i.e. t = 1.5e 03 s (a,
a) and t = 2.0e 03 s (b, b). VQPCA reduction with q = 4 and k = 8.
GSRE,n = 0.04.
(a)
(b)
Figure 4.23: Parity plots of temperature (a) and OH (b) mass fraction illustrating the VQPCA (q = 4, k = 8) reduction of DNS2 data set. GSRE,n = 0.04.
117

Figure 4.24: Values of GSRE,n as a function of the number of clusters, k, and retained PCs, q, for the jet flame, Flame F and
JHC data sets.
118
4.5. Results
Figure 4.25: Values of GSRE,n as a function of the number of clusters, k, and

retained PCs, q, for the DNS1 and DNS2 data sets.
(a)
(b)
Figure 4.26: Temperature as a function of mixture fraction in the two clusters

selected by VQPCA for the jet flame. q = 2 and GSRE,n = 0.21.
119

(a)
(b)

selected by VQPCA for Flame F. q = 3 and GSRE,n = 0.21.
(a)
(b)

selected by VQPCA for the JHC data set. q = 3 and GSRE,n = 0.21.
120
4.5. Results
Table 4.15: Rotated eigenvector in the first (a) and second (b) cluster identified
by VQPCA for Flame F. q = 3 and GSRE,n = 0.21
T
YO2
YN2
YH2
(a) YH2 O
YCH4
YCO
YCO2
YOH
YN O
a1,r
-0.01
-0.09
-0.70
-0.01
-0.03
0.71
0.00
0.00
0.00
-0.02
a2,r
0.55
-0.44
-0.07
-0.04
0.44
-0.10
0.07
0.51
0.05
0.18
T
YO2
YN2
YH2
(b) YH2 O
YCH4
YCO
YCO2
YOH
YN O
a1
0.30
-0.34
-0.12
0.05
0.33
-0.01
0.12
0.35
0.60
0.42
a2
-0.01
-0.09
-0.12
0.64
-0.09
0.02
0.70
-0.11
-0.17
-0.13
the two clusters identified by VQPCA for Flame F (Table 4.15) and for the
JHC data set (Table 4.16). For Flame F, the eigenvectors associated to the
first cluster (Table 4.15 (a)) are a mixture fraction 3 and a linear combination
of major species and temperature, respectively. This supports the graphical
observation provided by Figure 4.27 (a), which shows the first cluster to be
characterized by the lean and rich branches of the flame. On the other hand,
the reaction region identified by Figure 4.27 (b), needs to be described by means
of parameters with a strong contribution of intermediate and minor species, as
it is shown in Table 4.15 (b).
With regard to the JHC data set, the structure of the rotated eigenvectors prompts very interesting considerations. In particular, the second cluster
Table (4.16 (b)) is parametrized by a first component with significant weights
on the fuel species, intermediate species and temperature, whereas the second
PC reduces to OH. Thus, VQPCA is able to extract the subset of the data set
dominated by finite rate chemistry effects by means of progress variables able to
capture the ignition process. In the context of the numerical modeling of flameless combustion, such result confirms the need of combustion models suited for
the description of turbulence-chemistry interactions in such combustion regime.
As far as the numerical data are concerned, the VQPCA and FPCA reductions appear comparable for the DNS1 data set (Table 4.14), while VQPCA
outperforms FPCA for DNS2 (Table 4.14).This confirms that mixture fraction
is not optimal from the point of view of error minimization when the physics
under investigation become too complex. This is somehow expected, as mixture
fraction is only a measure of the local system stoichiometry and, then, it can
only cover relatively fast scales.
The small discrepancies between FPCA and VQPCA for DNS1 (and for
3
The denomination mixture fraction is used here because the variables which define the
first PC are highly correlated with the f , cov (f, N2 ) = 0.97 and cov (f, CH4 ) = 0.90.
121

Table 4.16: Rotated eigenvector in the first (a) and second (b) cluster identified
by VQPCA for the JHC data set. q = 3 and GSRE,n = 0.21
T
YO2
YN2
YH2
(a) YH2 O
YCH4
YCO
YCO2
YOH
YN O
a1
0.48
-0.51
0.07
0.00
0.46
0.00
0.05
0.53
0.09
-0.03
a2
0.01
0.03
-0.02
0.00
0.04
0.00
-0.03
0.04
0.01
1.00
T
YO2
YN2
YH2
(b) YH2 O
YCH4
YCO
YCO2
YOH
YN O
a1,r
-0.25
-0.13
-0.50
0.48
-0.23
0.49
0.15
-0.32
0.10
-0.05
a2,r
0.21
-0.03
0.00
-0.06
0.06
0.00
-0.07
0.05
0.96
0.14
the jet flame) suggest that VQPCA actually tends to FPCA when dealing with
relatively simple systems, characterized by fast chemistry and a small degree
of extinction. This is confirmed by Figure 4.29, showing the VQPCA (a-d) and
FPCA (a-d) partition of the DNS1 data. Both the approaches identify a rich
and lean region, together with a rich and lean reacting layer.
Computational cost of the analysis In the above discussion, VQPCA
has been showed to be generally superior to FPCA from the point of view
of reconstruction error minimization. However, it should be reminded that
VQPCA is an iterative algorithm, whereas FPCA is based on the supervised
partitioning of data into bins of mixture fraction (Section 4.3). Therefore,
the CPU time associated with VQPCA is certainly higher that that of FPCA;
moreover, it increases with k, as shown in Figure 4.30, for an experimental
and numerical data set. It is clear how the CPU associated with VQPCA can
reach values of the order of minutes, for the experimental data sets (Figure
4.30 (a)), and hours, for the numerical data sets (Figure 4.30 (b)), whereas
the corresponding CPU time of FPCA is of the order of seconds and minutes.
Therefore, FPCA represents certainly a valid solution for applications similar
to the jet or the DNS1 flame, as it optimizes both CPU time and accuracy of
predictions.
4.6
Development of a PCA based combustion model
In the previous Section, a methodology based on Principal Components Analysis (PCA) has proposed for the identification of low-dimensional manifolds
in turbulent flames, the estimation of their dimensionality and the selection
of optimal reaction variables. The reduced representation given by PCA has
great potential, especially in its local formulations, i.e. VQPCA and FPCA.
122
4.6. Development of a PCA based combustion model
(a)
(b)
(c)
(d)
(a)
(b)
(c)
(d)
JHC data set. GSRE,n = 0.08.
Figure 4.30: CPU time associated with the FPCA and VQPCA reductions.as a
function of the number of clusters, k, and retained PCs, q, for the experimental
(a) and numerical (b) data sets.
123

In fact, the selection of optimal variables for turbulent reacting systems could
be exploited for the development of turbulence-chemistry interaction models.
In this context, the linearity of the PCA method is extremely appealing. In
fact, if a set of reaction variables is selected, only few linear combinations of
the original variables need to be transported in a numerical simulation.
4.6.1
Transport equations for the PCs
Recalling the conservation equation for a reacting species (Chapter 2), it is

possible to show that the equation

uj Yk
Yk
Yk
+
=
Dk
+ k
t
xj
xj
xj
can be transformed into a transport equation for the reacting species. Introducing the material derivative and the Lewis number, the species equation
becomes:
DYk
2
=
Dt
cp Lek x2j
Yk
xj
(4.46)
+ k
where we have assumed the density and species diffusivity to be constant.

If the species mean, Y k , is subtracted and a scaling factor, dk , is applied to
the centered variable, we get:

(Yk Y k )
"
#
D
dk
Yk Y k
2
k
=
+
.
(4.47)
Dt
cp Lek x2j
dk
dk
Indicating with aki the weight of the kth variable on the ith PC, the following
equation is obtained:

D
(Yk Y k )
dk

aki
2
=
cp Lek x2j
Dt
"
#
Yk Y k
k aki
aki +
.
dk
dk
(4.48)
Summing over all the variables we have:

D
Pp
(Yk Y k )
k=1
dk
Dt

aki
" p
#
2 X Yk Y k
=
aki +
cp Lek x2j
dk
k=1
1 X k aki
. (4.49)
dk
k=1
124

P
(Y Y )
But pk=1 k dk k aki is simply the definition of the ith PC, zi . Therefore Eq.
(4.49) can be rewritten as:
Dzi
2 zi
+ zi
=
Dt
cp Lei x2j
(4.50)
where zi is the source term for zi

p
zi
1 X k aki
=
.
dk
(4.51)
k=1
If temperature is also included in the PCs definition, Eq. (4.51) becomes:

p
zi
Qr
1 X k aki
=
+
cp dT
dk
(4.52)
k=1
where Qr is the heat released by reaction, akT is the weight of temperature on

the ith PC and dT is the scaling factor for temperature.
A prerequisite for the validity of the above equation is that the matrix of
PCs coefficients A is constant. Being the proposed PCA analysis based on the
processing of a multitude of observations in both space and time, A is constant
by construction,
More compactly, we can express Eq (4.50) as:
D
(Z) = (jZ ) + ( Z ) ,
Dt
(4.53)
where jZ is the mass diffusive flux of Z In Eq. (4.53), the source terms of
temperature and all species contribute to the source term for each PC.
4.6.2
PCA Modeling Approach
A complete PCA modeling approach requires several ingredients. First, the PCs
must be identified using the procedure outlined in Section 4.1. This identification requires high-fidelity, fully-resolved data including source terms. Once the
PCs are selected, transport equations may be derived for each PC as described
in Section 4.6.1.
Second, the initial conditions (ICs) and boundary conditions (BCs) on the
PCs must be defined using the transformation matrix A. For Dirichlet BCs
on all the original variables, we obtain Dirichlet conditions on the PCs (ICs
are analogously defined). Likewise, Neumann conditions on X yield Neumann
conditions on Z. Mixed conditions on X yield Robin boundary conditions on
Z.
Diffusion terms in the transport equations for Z require evaluation of the
diffusive fluxes for each component of X. In turbulent flow calculations, the
molecular diffusion term is typically augmented by a turbulent diffusion term
125

arising from closure of the convective term. In many cases, and particularly
at high Reynolds number, the molecular diffusion term is small relative to the
turbulent diffusion term and it is neglected. However, even when one wishes
to retain the full description of molecular diffusion, the treatment with PCA is
straightforward. First, X is approximated from Z. Next, the diffusive terms for
X, jX , are constructed. Finally, the diffusive fluxes for the PCs are calculated
as jZ = jX A.
Source terms for the PCs, Z , can be parametrized by Z and tabulated
a priori to avoid run-time calculation. The accurate parametrization of the
source terms is crucial for the successful application of PCA as a modeling
approach. Therefore, the data adopted for PCs extraction must have source
terms for all X, which is currently impossible to obtain from experimental
data. Then, the decisive step for moving on from data analysis (Section 4.5.1)
to predictive modeling is the availability of computational data generated from
reliable chemical mechanisms, using methods such as DNS or ODT. Furthermore, the reliability of PCA as a modeling approach also hinges on the relative
invariance of PCs from one data set to another which is nearby in parameter
space.
4.6.3
Parametrizing the State Variables
This section presents the results of PCA applied to two DNS data sets of
non-premixed CO/H2 combustion. The DNS data-sets (Case A and B) have
been obtained using a code with 8th order spatial and 4th order temporal discretization. Detailed kinetics of CO/H2 oxidation have been used [114], along
with mixture-averaged transport approximations. The fuel stream is 0.45% CO,
0.05% H2 , and 0.5% N2 , giving a stoichiometric mixture fraction of fst = 0.4375,
and both fuel and air streams are at 300 K.
Case A is a spatially-evolving jet with an initial max = 25 s1 , while case
B is a temporally-evolving jet with an initial max = 125 s1 . The primary
difference between the two data-sets is the initial scalar dissipation rate ()
and turbulence intensity, which affects the degree of extinction observed; case
A exhibits virtually no extinction, while case B exhibits moderate extinction.
The existence of moderate extinction in case B is shown qualitatively in Figure
4.31 (a), which shows T versus at fst 4 . Additional details of the DNS code
and simulation configuration may be found elsewhere [120, 88].
To quantify the error in representing the data in low-dimensional space,
parametrized by Z, we calculate the R2 value,
n
X
R =1
(xij xij )2
"
i=1
#" n
#1
X
2
(xij xj )
(4.54)
i=1
The results shown in this Section refer to data conditioned on mixture fraction, f , since
this is a convenient variable to force as the first component.
126
(a)
(b)
Figure 4.31: Parametrization of temperature at fst by (a) and z1 (b) for case
B. Solid lines are the doubly-conditional mean temperature. R2 is calculated
from Eq. (4.54).
where xi is the ith observation of the jth variable, xij is its parametrized approximation, and x
j is the mean of xj . For the state variables, R2 is equivalent
to the parameter tq,j , introduced in Section 4.2.3.1. However, for the source
term, such parameter is not available and the R2 can be directly calculated.
Figures 4.31 (a) and 4.31 (b) show the parametrization (at fst ) of T by
and the first PC, z1 , respectively, for Case B. Examining Figure 4.31 (b), we
see that z1 acts as a progress variable, capturing the extinction process remarkably well. This has also been observed for other choices of progress variables
such as CO2 [88]. Comparing the two-parameter PCA approach with the (f, )
parametrization is reasonable since both are two-parameter models, although
the second parameter ( versus z1 ) represents different physical phenomena
(gradient versus chemical state). Figure 4.32 shows the parametrization of the
OH mass fraction by the common (f, ) and the proposed (f, z1 ) parametrizations. This demonstrates that the PCA approach can be used to represent a
wide range of the state variables, not temperature alone.
Also shown on Figures 4.31 (a) and 4.31 (b) is the R2 value as calculated by
Eq.(4.54). Table 4.17 lists R2 values for reconstruction of the temperature and
all species mass fractions as a function of the number of parameters adopted,
q. These values are a concise, quantitative representation of the information
presented graphically in Figs. 4.31 and 4.32. For example, for Case B with
q = 1, we obtain RT2 = 0.967, corresponding to Figure 4.31 (b). For comparison,
Table 4.17 also lists the R2 values given by the (f, ) parametrization, RT2 =
0.801 (Figure 4.31 (a)). Clearly, the two-parameter (f, z1 ) parametrization
reconstructs the temperature and most other state variables with much more
accuracy than the (f, ) parametrization. It should be noted that the results for
the (f, ) parametrization represent the best possible performance of a model
based on (f, ); the steady laminar flamelet model typically does not perform
ideally [88].
127

T
0.789
0.983
0.983
0.801
0.967
0.996
0.990
H2
0.344
0.259
0.936
0.509
0.370
0.845
0.904
O2
0.811
0.976
0.968
0.807
0.910
0.982
0.982
O
0.718
0.930
0.958
0.697
0.614
0.882
0.984
OH
0.165
0.240
0.963
0.426
0.736
0.931
0.979
H2 O
0.085
0.178
0.924
0.186
0.531
0.990
0.991
H
0.695
0.823
0.964
0.648
0.524
0.858
0.985
HO2
0.839
0.986
0.980
0.665
0.940
0.974
0.977
H2 O 2
0.816
0.916
0.985
0.729
0.849
0.941
0.933
CO
0.803
0.978
0.969
0.810
0.907
0.981
0.981
CO2
0.827
0.956
0.976
0.058
0.094
0.378
0.854
HCO
0.828
0.980
0.980
0.817
0.901
0.984
0.980
Table 4.17: R2 values defined by Eq. (4.54). Also shown are results for the parametrization. All results are at f = fst = 0.4375.
A
1
2
1
2
3
128
(a)
(b)
Figure 4.32: Parametrization of OH mass fraction at fs t by (a) and z1 (b)

for case B. Solid lines are the doubly-conditional mean temperature. R2 is
calculated from Eq. (4.54).
Table 4.17 also demonstrates that increasing the number of retained PCs
increases the accuracy with which the state variables are represented. This
indicates that one may select a desired error threshold and then determine the
minimum number of PCs required to achieve that accuracy. Conversely, one
may choose the number of PCs and estimate a priori the associated error.
4.6.4
Parametrizing Source Terms
The PCs are not conserved variables and their source terms must be parametrized
by the PCs. In this section we explore the ability of PCA to parametrize
source terms. Any function of X may be approximated by F(X) F (XAq ) .
However, it is more accurate to calculate F(X) directly from the data in pdimensional space and then project it onto Z by calculating the conditional
mean hF (X) |Z i. Thus, source terms are calculated directly from the original
observables, X, and their conditional means are projected onto Z. Figure 4.33
illustrates this for the two-dimensional (f, z1 ) parametrization of z1 .
Table 4.18 summarizes the ability of an q-dimensional PCA to parametrize
the source terms of the PCs. We first consider the columns describing the
results at fst . For case A, a two-dimensional parametrization (f, z1 ) captures
z1 with R2 z = 0.978. For case B, 3 PCs are required to parametrize z1 to
1
a similar degree of accuracy. Comparing the dimensionality requirements for
parametrizing Z with those for parametrizing the state variables (Table 4.17),
we see that parametrizing the source terms does not require more PCs than
parametrization of the state variables themselves, an encouraging result.
129

(a)
Figure 4.33: Parametrization of z1 at fst by z1 for case B. Solid line: doublyconditional mean value of z1 . R2 is calculated from Eq. (4.54).
Table 4.18: R2 values defined by Eq. (4.54) for PC source terms, s .
A
B
q
z1
z2
z1
z2
z3
1
0.993
0.270
-
f = 0.2
2
3
0.985
0.996
0.844
0.967
0.835
0.955
0.976
f = fst = 0.4375
1
2
3
0.978
0.985
0.922
0.815
0.932
0.958
0.951
0.961
0.731
130
1
0.923
0.809
-
f = 0.6
2
0.934
0.876
0.852
0.883
-
3
0.902
0.909
0.831
4.7. Summary
4.6.5
Global versus Semi-Local PCA
The results presented thus far have been obtained locally at fst . One may
consider whether a PCA performed at fst is applicable at other f . We term
this a semi-local PCA. If the PCA is highly dependent on mixture fraction,
then one of two options must be considered
Eliminate the mixture fraction as a parameter and seek a global PCA on
the entire data set. This approach typically requires more PCs than a
PCA obtained at fst (Section 4.5.1).
Perform a local PCA (Section 4.5.1) in f -space and derive transport equations for Z|f . These equations would have exchange terms representing
transport in mixture fraction space. This approach is further complicated
by the fact that the definition of the PCs would vary with f .
If the PCA obtained at fst reasonably represents the data at other f , then
the transport equations derived in Section 4.6.2 may be used directly at all f ,
eliminating the need for conditional equations in f -space.
Tables 4.19 and 4.20 provide parametrization errors for the state variables
at f = 0.2 and f = 0.6, respectively. Table 4.18 shows the parametrization
errors for Z at f = 0.2 and f = 0.6. Interestingly, the parametrizations do
not perform well at lean conditions (especially for CASE B); the same is true for
the (f, ) parametrization. A posteriori testing is necessary to fully determine
the parametrization accuracy required. However, these results show promise
for the ability to use a PCA obtained at fst globally.
4.7
Summary
In the first part of the present Chapter, a novel methodology based on Principal
Components Analysis (PCA) has been proposed for the identification of lowdimensional manifolds in turbulent flames, the estimation of their dimensionality and the selection of optimal reaction variables. To this purpose, high fidelity
experimental and numerical data sets have been investigated. Three different
PCA approaches are proposed. A global PCA analysis, GPCA, has been compared to two local PCA models, VQPCA and FPCA, based on the partitioning
of the experimental data into separate clusters where PCA is performed locally.
However, the partitioning algorithm used by VQPCA is unsupervised and based
on reconstruction error minimization while FPCA conditions the data a priori
on the mixture fraction. Results show that the local PCA approaches (VQPCA
and FPCA) outperform the global approach in all cases. Indeed, GPCA is unable to provide a compact representation of the data in a low-dimensional space
due to the highly non-linear relationships existing among the state variables.
Regarding the local approaches, the performances of VQPCA and FPCA are
comparable for a simple jet flames, while FPCA proves unable to capture important features for systems characterized by complex equilibrium phenomena
131

q
T
0.097
0.500
0.968
0.497
0.979
0.996
0.990
H2
0.798
0.413
0.910
0.542
0.741
0.877
0.963
O2
0.169
0.816
0.881
0.390
0.866
0.945
0.958
O
0.774
0.212
0.868
0.303
0.337
0.819
0.989
OH
0.736
0.188
0.859
0.329
0.219
0.822
0.977
H2 O
0.245
0.134
0.940
0.269
0.749
0.994
0.994
H
0.827
0.319
0.888
0.558
0.127
0.806
0.978
HO2
0.812
0.433
0.838
0.537
0.805
0.970
0.984
H 2 O2
0.811
0.398
0.855
0.390
0.858
0.960
0.982
CO
0.580
0.666
0.867
0.417
0.859
0.958
0.968
CO2
0.432
0.555
0.940
0.206
0.513
0.737
0.808
HCO
0.881
0.619
0.934
0.689
0.403
0.860
0.955
Table 4.19: R2 values at f = 0.2 using the PCA obtained at fst . Also shown are results for the parametrization.
A
B
1
2
1
2
3
132
1
2
1
2
3
T
0.676
0.956
0.959
0.628
0.964
0.984
0.986
H2
0.190
0.287
0.962
0.081
0.134
0.612
0.769
O2
0.740
0.958
0.949
0.593
0.904
0.928
0.948
O
0.642
0.887
0.923
0.662
0.804
0.836
0.888
OH
0.548
0.587
0.775
0.112
0.197
0.373
0.543
H2 O
0.073
0.076
0.826
0.246
0.755
0.986
0.991
H
0.434
0.542
0.768
0.365
0.721
0.822
0.913
HO2
0.741
0.966
0.955
0.508
0.938
0.960
0.967
H2 O 2
0.727
0.867
0.898
0.616
0.650
0.791
0.839
CO
0.467
0.836
0.911
0.521
0.844
0.873
0.909
CO2
0.572
0.868
0.919
0.268
0.442
0.542
0.841
HCO
0.555
0.751
0.889
0.570
0.896
0.930
0.941
Table 4.20: R2 values at f = 0.6 using the PCA obtained at fst . Also shown are results for the parametrization.
4.7. Summary
133

(e. g. Flame F, JHC, DNS2), resulting in higher reconstruction error with
respect to VQPCA.
In the second part of the Chapter, a modeling approach based on PCA has
been proposed and tested a priori using DNS data. This modeling approach
is complete, with the exception of a turbulent closure model which would be
required if this model were used in a LES or RANS context. The model is based
on a rotation of the thermochemical state basis from one based on temperature,
pressure, and Ns 1 species mass fractions to one which best represents the
variance in the data. Implementation of the model requires transport equations for the principal components, which are reacting scalars. Results from
a quantitative a priori analysis of this approach using DNS data show great
promise. State variables and source terms both are parametrized well by a
two-parameter (f, z1 ) model, and adding additional parameters provides a
significant increase in accuracy for all state variables and their reaction rates.
Results also indicate a uniformly better representation of the DNS data using
an (f, z1 ) parametrization over the commonly used (f, ). There are many
potential applications of this modeling approach. For example, laminar flame
calculations could benefit from PCA modeling approaches to provide rapid
solutions using a reduced set of equations. Once a full calculation has been
performed, subsequent calculations may be performed using PCs rather than
the full set of species and energy equations. The number of PCs retained can be
chosen by the desired accuracy. This could be particularly useful for stochastic
models such as the Linear Eddy Model (LEM) [121] and the one-dimensional
turbulence (ODT) model [115, 122], which require many realizations of a flow
field. The first realization could employ full chemistry while subsequent realizations utilize a reduced set of equations defined by PCA. Another application is
in modeling turbulent flows, where a compact parametrization of the thermochemical state is key to achieving affordable simulations. Additionally, while
the analysis presented herein has been applied to non-premixed combustion, the
PCA approach applies in principle to all combustion regimes from premixed to
non-premixed. For application to turbulent flows, additional closure models
are required for the unresolved convective and source terms. In the context of
transported PDF methods, a PCA modeling approach could drastically reduce
the computational cost by significantly reducing the thermochemical dimensionality while maintaining a quantified error bound on the thermochemical
reduction.
Future work will focus on examining the feasibility of PCA with various
fuels and exploring the universality of the PCA, i.e. the applicability of a PCA
obtained under one set of conditions to be applied to a simulation at another
set of conditions. Also, a posteriori tests will be conducted to determine the
effect of nonlinear propagation of errors in source term parametrization.
134

Principal Components Analysis For Turbulence-Chemistry Interaction Modeling

Uploaded by

Copyright:

Available Formats

You might also like

Principal Components Analysis For Turbulence-Chemistry Interaction Modeling

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Principal Components Analysis For Turbulence-Chemistry Interaction Modeling

Uploaded by

Copyright:

Available Formats

Chapter 4

Principal Components Analysis

Chapter 4. Principal Components Analysis for turbulence-chemistry

4.1. Definition and derivation of Principal Components

Definition and derivation of Principal Components

To determine z1 , a vector a1 is sought so that var (z1 ) = a1 a1 is maximized,

where Ip is the (p x p) identity matrix. Thus, is an eigenvalue of and a1 is

Chapter 4. Principal Components Analysis for turbulence-chemistry

The second PC, z2 = Xa2 , maximizes the variance var (z2 ) = a2 a2

cov (Xa1 , Xa2 ) = a1 a2 = a2 a1 = 1 a1 a2 = 1 a2 a1 ,

any of the equations

could be used to specify no correlation between z1 and z2 . Choosing arbitrarily

where and are Lagrange multipliers. Differentiating with respect to a2 and

Then, is once more an eigenvalue of and a2 is the corresponding eigenvec0

4.2. Sample PCA

where the vector of coefficients a1 is chosen to maximize the variance

subject to the constraint a1 a1 = 1. Then, for the second PC:

where a2 is chosen to maximize the sample variance of z12 , subject to the

where A is the (p x p) orthogonal matrix whose columns are the eigenvectors of

Recalling the eigenvector decomposition of a symmetric, non-singular matrix,

Chapter 4. Principal Components Analysis for turbulence-chemistry

being A orthonormal and, hence, A1 = A . This means that, given Z, the

Eq. (4.19) can be inverted to obtain:

The linear transformation provided by Eq. (4.20) is particularly appealing

Optimal properties of the PCA reduction

4.2. Sample PCA

Data preprocessing: centering and scaling

Chapter 4. Principal Components Analysis for turbulence-chemistry

where D is the diagonal matrix containing the scaling parameters, dj . When

4.2. Sample PCA

Outlier detection and removal with PCA

Experimental data sets usually contain a few unusual observations. If we refer

The observations associated to large values of DM are considered outliers and,

Chapter 4. Principal Components Analysis for turbulence-chemistry

4.2. Sample PCA

Chapter 4. Principal Components Analysis for turbulence-chemistry

Figure 4.3: Size reduction process with PCA.

Choosing a subset of Principal Components or Variables

The major objective in many applications of PCA is to replace the p elements

Cumulative percentage of total variance

The most obvious criterion for choosing q is to select a cumulative fraction

Following the derivation of tq , an appropriate measure of lack-of-fit of the rank

4.2. Sample PCA

Variance of Principal Components

The rule described in this Section originally applies to correlation matrices,

Broken Stick Model

Frontier [102] proposed a broken-stick method to select the number of PCs. If

Chapter 4. Principal Components Analysis for turbulence-chemistry

Choosing a subset of Variables

Principal components (PCs) are linear combinations of all variables available.

4.2. Sample PCA

Zq as the fixed configuration and transforming Z.

In statistics, Procrustes analysis is a form of statistical shape analysis used to analyze

Chapter 4. Principal Components Analysis for turbulence-chemistry

4.2. Sample PCA

Compute the sample covariance/correlation matrix, S.

Interpretation of principal components

The principal components are, by construction, linear combinations of all the

Figure 4.25: Values of GSRE,n as a function of the number of clusters, k, and