Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2022 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, recording, or any information storage and
retrieval system, without permission in writing from the publisher. Details on how to seek
permission, further information about the Publisher’s permissions policies and our arrangements
with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency,
can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the
Publisher (other than as may be noted herein).
Knowledge and best practice in this field are constantly changing. As new research and experience
broaden our understanding, changes in research methods, professional practices, or medical
treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in
evaluating and using any information, methods, compounds, or experiments described herein. In
using such information or methods they should be mindful of their own safety and the safety of
others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors,
assume any liability for any injury and/or damage to persons or property as a matter of products
liability, negligence or otherwise, or from any use or operation of any methods, products,
instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data

A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library

ISBN: 978-0-12-815861-6

For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Candice Janco

Acquisitions Editor: Brian Romer
Editorial Project Manager: Susan Ikeda
Production Project Manager: Surya Narayanan Jayachandran
Designer: Matthew Limbert
Typeset by VTeX
To our wives: Marit, Karina, Ellen Johanne
Contents

Biography
Preface

1. Introduction
1.1. Computer code
References

2. Parametric, nonparametric, locally parametric
2.1. Introduction
2.2. Parametric density models
2.3. Parametric regression models
2.4. Time series
2.5. Nonparametric density estimation
2.6. Nonparametric regression estimation
2.7. Fighting the curse of dimensionality
2.8. Quantile regression
2.9. Semiparametric models
2.10. Locally parametric
References

3. Dependence
3.1. Introduction
3.2. Weaknesses of Pearson’s ρ
3.3. The copula
3.4. Global dependence functionals and tests of independence
3.5. Test functionals generated by local dependence relationships
References

4. Local Gaussian correlation and dependence
4.1. Introduction
4.2. Local dependence
4.3. Local Gaussian correlation
4.4. Limit theorems
4.5. Properties
4.6. Examples
4.7. Transforming the marginals: Normalized local correlation
4.8. Some practical considerations
4.9. The p-dimensional case
4.10. Proof of asymptotic results
References

5. Local Gaussian correlation and the copula
5.1. Introduction
5.2. Local Gaussian correlation for copula models
5.3. Examples
5.4. Recognizing copulas by goodness-of-fit
5.5. A real-data study
References

6. Applications in finance
6.1. Introduction
6.2. Conditional correlation and the bias problem
6.3. Empirical analysis of dependence of financial returns
6.4. The portfolio allocation problem
6.5. Financial contagion
References

7. Measuring dependence and testing for independence
7.1. Introduction
7.2. Testing of independence in iid pairs of variables using local correlation functionals
7.3. Testing for serial independence in time series
7.4. Describing nonlinear dependence and tests of independence for two time series
7.5. Proofs
References

8. Time series dependence and spectral analysis
8.1. Introduction
8.2. Local Gaussian spectral densities
8.3. Visualizations and interpretations
References

9. Multivariate density estimation
9.1. Introduction
9.2. Description of the estimator
9.3. Asymptotic theory
9.4. Bandwidth selection
9.5. An example
9.6. Investigating performance in the multivariate case
9.7. A more flexible version of the LGDE
9.8. Proofs
References

10. Conditional density estimation
10.1. Introduction
10.2. Estimating the conditional density
10.3. Asymptotic theory for dependent data
10.4. Examples
10.5. Proof of theorems
References

11. The local Gaussian partial correlation
11.1. Introduction
11.2. The local Gaussian partial correlation
11.3. Properties
11.4. Estimation of the LGPC by local likelihood
11.5. Asymptotic theory
11.6. Examples
11.7. Testing for conditional independence
11.8. The multivariate LGPC
References

12. Regression and conditional regression quantiles
12.1. Introduction
12.2. Comparison with additive regression modeling
12.3. Local Gaussian regression estimation
12.4. Asymptotic normality
12.5. Example
12.6. Conditional quantiles
12.7. Proof
References

13. A local Gaussian Fisher discriminant
13.1. Introduction
13.2. A local Gaussian Fisher discriminant
13.3. Some asymptotics of Bayes risk
13.4. Choice of bandwidth
13.5. Illustrations
13.6. Summary remark
References

Author index

Subject index
Biography

Dag Tjøstheim
is Emeritus Professor, Department of Mathematics, University of Bergen.
He has a PhD in applied mathematics from Princeton University (1974).
He has authored more than 120 papers in international journals. He is a
member of the Norwegian Academy of Sciences and has received several
prizes for his scientific work. His main interests are in econometrics, non-
linear time series, nonparametric methods, modeling of dependence, spatial
variables, and fishery statistics.

Håkon Otneim
is Associate Professor at the Norwegian School of Economics. He has a
PhD in statistics from the University of Bergen (2016), and he has published
papers in international journals about multivariate density estimation and
conditional density estimation. His research interests include development
and application of nonparametric and semiparametric statistics, statistical
programming, and data visualization.

Bård Støve
is Professor of Statistics at the University of Bergen. He received his PhD
degree in statistics in 2005. He was Assistant Professor at the Norwegian
School of Economics (2007–2011) and worked as an actuary in a consulting
firm (2005–2007). He has worked on the development of nonparametric models
and their application to finance and economics. He has published several
research papers in journals such as Econometric Theory and the Scandinavian
Journal of Statistics.

Preface

The central idea of this book is the approximation of a general multivariate
density f by a family of Gaussian distributions. Locally around a point x in
the support of f , f is approximated by a multivariate Gaussian distribution.
This makes it possible to define a local mean, a local variance, and a local
correlation matrix.
This idea is powerful and can be applied to a number of tasks and
problems for continuous stochastic variables. This has been done in sev-
eral recent papers. These papers, thirteen altogether, form the basis of this
book. In particular, following two introductory chapters, each of the main
Chapters 3–11 and 13 is composed from one or more of these papers.
In Chapter 4 the emphasis is on the local correlation, its properties, and
its use as a measure of dependence. It can be defined on the original x-scale,
but also on a normalized z-scale obtained by transforming the marginals of
f to the standard normal. This is in a way analogous to the transformation
to uniform variables in a copula construction, and the relationship between
the copula concept and the Gaussian approximation concept is explored in
Chapter 5.
It has long been realized that fitting a multivariate Gaussian distribution
to data in finance or econometrics may not be a good idea. Such data
typically have thick tails, and a global Gaussian fit may lead to disastrous
results with a large underestimation of economic risk. In Chapter 6, we apply
local Gaussian approximation to financial data, including financial contagion
and a preliminary attempt at portfolio construction.
With local Gaussian correlation established as a measure of local
dependence, an obvious next step is to use this measure to test for
independence. This is done in Chapter 7 in three stages: testing of
independence between two sequences, each consisting of independent
identically distributed variables; testing of serial independence in a time
series; and testing of independence between two stationary time series. This
involves the introduction of local autocorrelation and cross-correlation
concepts.
The locally Gaussian autocorrelation introduced in Chapter 7 is used in
Chapter 8 to construct a locally Gaussian spectral density. It coincides with
the ordinary power spectral density in the Gaussian time series case. For
non-Gaussian data, the new local spectral concept can be used to pick up
spectral peaks that may be hidden in a conventional spectral estimation.
Chapters 9 and 10 are devoted to estimation of multivariate density
functions and to estimation of multivariate conditional densities. This is
done by merging the Gaussian densities of the local approximating families.
In the conditional case the unique properties of the conditional (global)
Gaussian distribution are of crucial importance. Further, we seek to
circumvent the curse of dimensionality by a simplified Gaussian
approximation, in a sense similar to the use of the additive simplification
in nonparametric regression.
The local Gaussian approximation also makes it possible to introduce a
local Gaussian partial correlation. In Chapter 11, it is shown how this can
be used to construct a local measure of conditional dependence and to test
for conditional independence. We believe that this idea has potential for
network theory and causality.
Perhaps the most important use of nonparametric estimation methods
is currently in nonparametric regression. The local Gaussian approxima-
tion concept is developed for multivariate statistical analysis, where all of
the variables are, so to speak, on the same basis, in contradistinction to re-
gression analysis, where one variable or a group of variables are dependent
variables expressed as a function of another group of explanatory variables.
Nevertheless, in Chapter 12, we make an attempt to apply local Gaussian
approximation techniques to regression estimation and quantile regression
estimation. More work is required to determine in what way local methods
can complement nonparametric techniques like, for instance, additive
regression models.
The local Gaussian approach can be applied to other fields of statistics
as well. As an example of such an application, in Chapter 13, we look
at applications to classification and discrimination, involving among other
things a local Fisher discriminant.
To put local Gaussian approximation analysis into context with other
methods, the book also contains three introductory chapters. Chapter 1
contains a general introduction explaining the overall features and concepts
of our approach. Chapter 2 gives a brief but quite broad overview of
parametric, nonparametric, and locally parametric approaches to statistics.
Chapter 3, based on a very recent survey paper to appear in Statistical
Science, contains an overview of the statistical concept of dependence, how
it can be measured, and how we can test for independence. The survey
concentrates on methods developed in the last two decades, going beyond the
most used measure of dependence, the Pearson correlation. The local Gaussian
correlation is put directly into this context in Chapter 4.
There is some overlap between the various chapters in the book. This
has been done intentionally, so that a reader can single out the chapters of
primary interest to her/him. Most chapters can be read independently of
each other, as the basic material from Chapter 4 is included briefly as
introductory material in each of the following chapters. The mathematical
and technical level of each chapter is quite modest. For readers with more
interest in technical details, we give references, often to supplementary ma-
terial to the papers that the book is composed from.
There are three R-packages that have been developed for various types
of analysis in the book. We do not present details of use of these packages
in this book, but references to the packages are given in Chapter 1.
The local Gaussian approach is a recently developed methodology. Some
of the chapters are based on papers that have just appeared or are in the
process of appearing in journals. In putting this material into a book, our
emphasis is on presenting the fundamental concepts inherent in a local
Gaussian approximation and on demonstrating their usefulness in several areas
of statistics.
At the same time, we hope that the book may serve as a starting point and
inspiration for further research and applications in the subject matters taken
up in each of the chapters of the book, as well as in new subject areas.
The chapters of the book have been primarily based on papers by the
three authors of the book, but some chapters have also benefited from joint
work and joint papers with others, namely Karl Ove Hufthammer (Chapters 4
and 6), Geir Berentsen (Chapters 5 and 7), Virginia Lacal (Chapter 7), Lars
Arne Jordanger (Chapter 8), Martin Jullum (Chapter 13), and Anders Sleire
(parts of Chapter 6). Without their contributions the book would not have
been possible in its present form, and we are very grateful to them for their
good work and cooperation on these subjects.

Dag Tjøstheim
Håkon Otneim
Bård Støve
Bergen, May 2021
1 Introduction

The most important distribution in statistics is the Gaussian distribution.
It has a number of very useful and special properties, particularly in the
multivariate case. Just think about a normally distributed vector
X = (X_1, ..., X_p)^T of dimension p, where T denotes the transpose. Its
distribution is given by the density function

\[
f(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}}
       \exp\left\{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right\},
\]

where μ = {μ_i} and Σ = {σ_ij}, i, j = 1, ..., p, are the mean vector and
covariance matrix of X, respectively. Looking at this familiar expression, it
is easy to forget its simplicity and elegance. Here we have a distribution
whose location is completely determined by its means μ_i, the scale by the
variances σ_ii, and whose dependence relations have the amazing property
that they are completely determined by the pairwise covariances σ_ij.
Moreover, if X is subdivided into two components X = (X^1, X^2), then
any linear combination of X^1 and X^2 is again Gaussian, and the conditional
distribution f_{X^1|X^2}(x^1 | x^2) is Gaussian. The dependence properties of
these derived distributions are again determined by the pairwise covariances,
in the latter case, through the partial covariances. These properties make
the Gaussian especially suitable for linear statistical modeling. Further,
the properties of the conditional distribution imply that in a Gaussian
system the optimal least squares predictor, given by the conditional mean,
is linear and equals the optimal linear predictor. Finally, uncorrelatedness
is equivalent to independence in the Gaussian distribution, that is, X^1 and
X^2 are independent if and only if they are uncorrelated. In this case, we
can test for independence by computing covariances.
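The linearity of the conditional mean can be illustrated in the simplest case of a bivariate Gaussian with scalar components. The following is a minimal sketch, not code from the book or its R packages; the function name `conditional_gaussian` and its argument names are hypothetical and chosen only for this illustration. It uses the standard formulas for the conditional mean and variance of one Gaussian component given the other.

```python
def conditional_gaussian(mu1, mu2, s11, s22, s12, x2):
    """Conditional mean and variance of X1 given X2 = x2 for a bivariate Gaussian.

    mu1, mu2 : means of X1 and X2
    s11, s22 : variances of X1 and X2
    s12      : covariance between X1 and X2
    (All names are illustrative assumptions, not notation from the book.)
    """
    if s22 <= 0:
        raise ValueError("the variance of X2 must be positive")
    beta = s12 / s22                         # slope of the linear predictor
    cond_mean = mu1 + beta * (x2 - mu2)      # linear in x2
    cond_var = s11 - s12 * beta              # does not depend on x2
    return cond_mean, cond_var


# Example: the conditional mean moves linearly with x2, the variance is fixed.
mean_at_3, var_at_3 = conditional_gaussian(1.0, 2.0, 4.0, 1.0, 1.0, x2=3.0)
mean_at_4, var_at_4 = conditional_gaussian(1.0, 2.0, 4.0, 1.0, 1.0, x2=4.0)
```

When the covariance `s12` is zero, the conditional mean and variance reduce to the marginal ones, which reflects the equivalence of uncorrelatedness and independence in the Gaussian case.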
Unfortunately, data are not always well described by a Gaussian distribution
and a linear model. In particular, data in economics and finance are usually
governed by distributions having thicker tails, and their dependence
properties are not well described by pairwise covariances only, as is
inherent in the Gaussian distribution. In fact, assuming a linear Gaussian
model can lead to disastrous results with drastic underestimation of the
risk involved in economic and financial transactions; see, for example, Taleb
(2007).

Statistical Modeling using Local Gaussian Approximation. https://doi.org/10.1016/B978-0-12-815861-6.00008-0
Copyright © 2022 Elsevier Inc. All rights reserved.

To some degree, these problems can be avoided, or at least lessened, by
trying to fit other parametric families to the data or by using a semiparamet-
ric or nonparametric approach. An important concept in a nonparametric
methodology is the concept of a local approximation. In nonparametric
density estimation, we may take as the starting point a locally smoothed
version of a histogram of the available data. In regression, the regression
relationship may be approximated by locally fitted polynomials (the locally
constant special case being the kernel regression estimator). What is local
is determined by a bandwidth parameter, which, for a given point x, selects
the neighboring points close to x, “close” being determined by the bandwidth
acting as a distance measure. A density, a conditional density, or a
regression can be estimated nonparametrically in this manner. As the
dimension p increases, the curse of dimensionality emerges, and simplifying
assumptions, such as the additive model for regression, have to be made.
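The role of the bandwidth can be illustrated with a minimal one-dimensional sketch of a kernel density estimate and of the locally constant kernel regression estimator (commonly known as the Nadaraya-Watson estimator). The choice of a Gaussian kernel and the function names `kde` and `kernel_regression` are assumptions made here for illustration; they are not taken from the book.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, data, h):
    """Kernel density estimate at x: average of kernels centred at the data.

    The bandwidth h decides how far from x an observation may lie and still
    contribute noticeably to the estimate.
    """
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)


def kernel_regression(x, xs, ys, h):
    """Locally constant regression estimate at x: kernel-weighted mean of ys."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    weights = [gaussian_kernel((x - xi) / h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations close enough to x for this bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# A small bandwidth concentrates the estimate near the data; a large one smooths.
narrow = kde(0.0, [0.0], 0.5)
wide = kde(0.0, [0.0], 2.0)
fit_at_zero = kernel_regression(0.0, [-1.0, 1.0], [0.0, 2.0], 0.5)
```

In higher dimensions the same construction needs many more observations inside each neighborhood, which is one way of seeing the curse of dimensionality mentioned above.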
We are now ready to formulate the main idea of this book and of the papers
on which it is based. The idea is simply to approximate an arbitrary
p-dimensional density function f locally by a family of Gaussian
distributions. This can be viewed as an example of a semiparametric or a
locally parametric approach. In principle, another family of parametric
distributions could be used as a local approximant (as has been done by Hjort
and Jones (1996), who considered the locally parametric density estimator),
but we believe that the well-known simple and elegant properties of Gaussians
make this family of distributions the optimal choice. So, for a point x, in
a neighborhood of x, we fit a Gaussian distribution, the neighborhood be-
ing determined by a bandwidth parameter. The parameters of this Gaussian
distribution will be functions of the coordinates of the point x. Moving
to another point y and fitting another Gaussian in the neighborhood of
y will in general result in another set of parameters depending on y. The
exception is when f itself is Gaussian. In that case, as the number of avail-
able observations tends to infinity at the same time as the bandwidth tends
to zero, the estimated parameters at x and y will ultimately coincide and
be equal to the parameters of the Gaussian f . The advantage of using the
Gaussian distribution as an approximating family is that the unique
properties of the Gaussian can be used locally for a general, possibly
non-Gaussian, density f. For instance, a thick-tailed distribution can be
locally approximated in the tail by a Gaussian with large variance. In the
multivariate
case, we have the potential of approximating multivariate tail behavior lo-
cally by an appropriate multivariate Gaussian. This turns out to be useful
for multivariate financial market data.
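The fitting of a Gaussian in a neighborhood of x can be made concrete through a local likelihood. The display below is a sketch in the spirit of the locally parametric approach of Hjort and Jones (1996); the symbols K_h (a kernel function scaled by the bandwidth h), ψ (the Gaussian density with parameter θ = (μ, Σ)), and L_n are notation introduced here for illustration and are not taken from this chapter.

```latex
\[
L_n(\theta, x) = \frac{1}{n}\sum_{i=1}^{n} K_h(X_i - x)\,\log \psi(X_i, \theta)
               - \int K_h(v - x)\,\psi(v, \theta)\,\mathrm{d}v,
\qquad
\hat{\theta}(x) = \arg\max_{\theta} L_n(\theta, x).
\]
```

Under this formulation, the maximizer depends on x through the kernel weights, so moving from x to another point y in general gives a different set of local parameters, as described above.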
Using this idea as a statistical modeling philosophy has many ramifi-
cations and applications as we try to illustrate throughout the book. For
example, we can define local covariances and correlations, and even lo-
cal partial covariances. Local dependence and conditional dependence can
be measured by these quantities, and local independence and conditional
independence can be tested. Dependence properties may be analyzed by ag-
gregation, and independence may be tested over larger regions, ultimately
over the entire range of the data. Further, by using the same principle in
approximating locally the joint distribution of time series variables, we can
introduce concepts of local autocorrelation and cross-correlation. A local
spectral density can be constructed, which makes it possible to derive one
local spectral density describing the oscillatory properties at one level (e.g.,
close to extremes) and another local one describing the frequency distribu-
tion close to the center of the data.
Seen from this perspective, the unique properties of the Gaussian can be
utilized in a non-Gaussian environment but, again, locally. In this book, we
discuss many such examples, but there are other avenues that have not been
explored so far. Local principal component analysis is one of them. The
ordinary principal components are found by solving an eigenvalue problem
involving the covariance matrix. Local principal components can be found
by replacing the ordinary global covariance matrix by a matrix of local
covariances. Another area where research has been initiated is multiple
spectral analysis, where a local amplitude spectrum and phase spectrum can
be introduced. There are other possibilities as well, and we believe local
Gaussian approximation to be a very comprehensive tool.
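The eigenvalue problem behind principal components can be sketched in closed form for a 2 × 2 covariance matrix. The function below is an illustrative assumption, not part of the book or its packages; the name `principal_components_2x2` is hypothetical. The same routine could in principle be applied to a matrix of local covariances at a point x, which is the idea of local principal components mentioned above.

```python
import math


def principal_components_2x2(s11, s22, s12):
    """Eigenvalues and unit eigenvectors of [[s11, s12], [s12, s22]].

    Returns ((lam1, v1), (lam2, v2)) with lam1 >= lam2.
    """
    centre = 0.5 * (s11 + s22)
    radius = math.hypot(0.5 * (s11 - s22), s12)
    lam1, lam2 = centre + radius, centre - radius
    # Angle of the first principal axis for a symmetric 2 x 2 matrix.
    theta = 0.5 * math.atan2(2.0 * s12, s11 - s22)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return (lam1, v1), (lam2, v2)


# Unit variances and covariance 0.5: the first axis lies along the diagonal.
(first_val, first_vec), (second_val, second_vec) = principal_components_2x2(1.0, 1.0, 0.5)
```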
Another potential extension is to broaden the Gaussian family to a more
general family like the family of elliptic distributions. Then we lose some of
the simple properties of the Gaussian (e.g., independence is not equivalent
to uncorrelatedness), but, on the other hand, a multivariate t-distribution,
belonging to the elliptic family, is much easier to approximate. Other fam-
ilies are considered by Hjort and Jones (1996), but just in the situation of
deriving an alternative density estimator to the kernel estimator.
Local quantities like the local correlation are sensitive to the curse of
dimensionality as the dimension of X increases. Throughout the book, we
discuss ways of bypassing it. Somewhat similarly to the additive
approximation in regression analysis, we try to approximate the elements
σ_ij(x) of the covariance matrix by a function of two coordinates,
σ_ij(x_i, x_j). Still we
have to be careful at the edges of the data set where there are few observa-
tions. Another device that has been much used in this book is transforming
the data to a standard normal marginal scale by using the marginal
empirical distribution function. A very different approach, which may deserve
a closer examination, is to fit a parametric model to the local quantities.
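The transformation to a standard normal marginal scale can be sketched as follows: each observation is replaced by the standard normal quantile of its rescaled rank. The rescaling by n + 1 (rather than n) is an assumption made here to keep the largest observation away from an infinite quantile, and the function name `to_normal_scores` is hypothetical; the book's own implementation may differ in such details.

```python
from statistics import NormalDist


def to_normal_scores(values):
    """Map a sample to an approximately standard normal marginal scale.

    Each value is replaced by the standard normal quantile of rank / (n + 1),
    where rank is its position in the sorted sample. Ties are broken by order
    of appearance, which is harmless for continuous data.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = std_normal.inv_cdf(rank / (n + 1))
    return scores


# The ordering of the data is preserved; only the marginal scale changes.
z = to_normal_scores([10.0, -3.0, 2.5])
```

Applying this separately to each coordinate of a multivariate sample changes the marginals but leaves the ranks, and hence the copula-type dependence structure, untouched.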
Here is a brief overview of the contents of the book:
To put our method into perspective, we summarize briefly traditional
parametric, semiparametric, nonparametric, and locally parametric modeling
in Chapter 2. Properties of the Gaussian and elliptic distributions are also
briefly included.
Local correlation represents one way of measuring dependence as a lo-
cal version of the traditional Pearson correlation. Chapter 3 presents a fairly
self-contained review of recent developments in nonlinear dependence
analysis, among them, the Brownian distance covariance and reproducing
kernel Hilbert space measures, both of which have received considerable attention.
In Chapter 4, we contrast the dependence measures of Chapter 3 with
the local Gaussian correlation (LGC). We define this concept as well as the
concept of local Gaussian approximation. Two versions are described, one
on the original x-scale and one on the z-scale obtained by transforming
the marginals to standard normals. We give a number of properties and
illustrate these on simulated data and real financial data.
The connection to description of dependence by means of copulas is
explored in Chapter 5. We compute the local correlation for traditional
copulas like the Clayton, Gumbel, and Frank copulas.
Much of what we do is motivated by problems met in the description of
financial markets. Chapter 6 contains a number of applications to financial
and econometric data. It is shown that key “typical” dependence properties
of such markets are well described by the local Gaussian correlation, and
we also look at applications to financial contagion, portfolio analysis, and
value at risk.
In Gaussian distributions, independence and uncorrelatedness are
equivalent, so it is natural to use the local Gaussian correlation to test
for local and nonlinear independence. In Chapter 7, we extend this to tests
of serial dependence in a univariate time series and to independence testing
between two time series. We compare with other tests, such as those based on
the Brownian distance covariance, for both simulated and real data.
In Chapter 8 the time series frame is kept, but here we focus on the
local autocorrelation and the local spectrum that can be derived from it.
It is shown that frequency behavior that cannot be detected by ordinary
spectral analysis can be detected by the local spectrum. The chapter also
contains a brief review of alternative nonlinear spectral techniques.
Chapters 9 and 10 are devoted to density estimation and conditional
density estimation, respectively. The density estimation is the aspect stressed
by Hjort and Jones (1996) in their local parametric analysis. We carry this
through for the local Gaussian approximation of a density and compare
with other methods as the dimension increases. In the conditional density
estimation, we exploit locally the fact that the conditional density in a joint
Gaussian density framework is again a Gaussian density, where the local
mean vector and covariance matrix can be found by explicit formulas.
In a sense, testing for conditional independence is more important than
testing for independence. This is due to the applications to causality
analysis, among other things. For globally Gaussian data, the partial
correlation coefficient is an important tool, for example, in path analysis.
In Chapter 11, we introduce the local partial correlation and use it both for
measuring conditional dependence and for testing of conditional independence.
We compare with alternative tests and give applications to testing Granger
causality.
Regression and conditional quantile estimation are covered in Chapter 12.
We note that the local Gaussian approach is primarily suited to a situation
where all the variables are treated on the same basis. It is perhaps less
well suited to a situation where there is one dependent variable and one or
several explanatory variables. Nevertheless, we show in this chapter that the
local Gaussian approximation can be applied and that in particular cases it
may offer an alternative to the additive approximation in regression.
The traditional Fisher discriminant for discriminating between two or
more populations is based on a Gaussian assumption. In Chapter 13, we
make the parameters of the Gaussian local and derive a local Gaussian Fisher
discriminant, which is applied to simulated and real data. It is easy to find
examples where the global Fisher discriminant does not work, whereas the
local one does.

1.1 Computer code

The package lg for the R programming language (see Otneim, 2021; R Core
Team, 2017) provides implementations of most of the methods and applications
of the local Gaussian approximation presented in this book. This includes
estimation of the local Gaussian correlation itself, multivariate density
estimation, conditional density estimation, various tests for independence,
conditional independence, and financial contagion (cf. Chapter 6), and a
graphical module for creating dependence maps; see Otneim (2019). The use of
local Gaussian correlation in spectral analysis of time series, presented in
Chapter 8, has its own computational ecosystem in the localgaussSpec package
for R (see footnote 1). The R package localgauss (see Berentsen et al., 2014)
provided the first publicly available implementation of the LGC and a test
for independence. Note that the lg package depends on the localgauss package.
We refer to the R documentation of these packages and to Otneim (2021) for
the direct use of the available functions, as this is not covered in the
book.

References

Berentsen, G.D., Kleppe, T., Tjøstheim, D., 2014. Introducing localgauss, an R package for
estimating and visualizing local Gaussian correlation. Journal of Statistical Software 56
(12), 1–18.
Hjort, N., Jones, M., 1996. Locally parametric nonparametric density estimation. Annals of
Statistics 24 (4), 1619–1647.
Otneim, H., 2019. lg: Locally Gaussian distributions: estimation and methods.
https://CRAN.R-project.org/package=lg. R package version 0.4.1.
Otneim, H., 2021. lg: an R package for local Gaussian approximations. To appear, The R
Journal. URL: https://journal.r-project.org/archive/2021/RJ-2021-079/index.html.
R Core Team, 2017. R: A Language and Environment for Statistical Computing.
R Foundation for Statistical Computing, Vienna, Austria.
Taleb, N.N., 2007. The Black Swan: The Impact of the Highly Improbable. Random
House, New York.

1 See https://github.com/LAJordanger/localgaussSpec for details.


