Sparse Downscaling and Adaptive Fusion of Multi-Sensor Precipitation
Ardeshir M. Ebtehaj
University of Minnesota Twin Cities
All content following this page was uploaded by Ardeshir M. Ebtehaj on 26 April 2014.
5944
EBTEHAJ AND FOUFOULA-GEORGIOU: REGULARIZED DOWNSCALING, DATA FUSION, AND ASSIMILATION
observations or outputs of a global circulation model [e.g., Reichle et al., 2001a; Castro et al., 2005; Zupanski et al., 2010]. Statistical downscaling methods encompass a large group of methods that typically use empirical multiscale statistical relationships, parameterized by observations or other environmental predictors, to reproduce realizations of fine-scale fields. Precipitation and soil moisture statistical downscaling has been mainly approached via spectral and (multi)fractal interpolation methods, capitalizing on the presence of a power law spectrum and a statistical self-similarity/self-affinity in precipitation and soil moisture fields [Lovejoy and Mandelbrot, 1985; Lovejoy and Schertzer, 1990; Gupta and Waymire, 1993; Kumar and Foufoula-Georgiou, 1993; Perica and Foufoula-Georgiou, 1996; Veneziano et al., 1996; Wilby et al., 1998a, 1998b; Deidda, 2000; Kim and Barros, 2002; Rebora et al., 2005; Badas et al., 2006; Merlin et al., 2006; among others]. In variational approaches, a direct cost function is defined whose optimal point is the desired fine-scale field which can be obtained via using an optimization method. Recently along this direction, Ebtehaj et al. [2012] cast the rainfall DS problem as an inverse problem using sparse regularization to address the intrinsic rainfall singularities and non-Gaussian statistics. This variational approach belongs to the class of methodologies presented and extended in this paper.

[4] The DF problem has also been a subject of continuous interest in the precipitation science community mainly due to the availability of rainfall measurements from multiple spaceborne (e.g., TRMM and GOES satellites) and ground-based sensors (e.g., the NEXRAD network and rain gauges). The accuracy and space-time coverage of remotely sensed rainfall are typically conjugate variables. In other words, more accurate observations are often available with lower space-time coverage and vice versa. For instance, low-orbit microwave sensors provide more accurate observations but with less space-time coverage compared to the high-orbit geo-stationary infrared (GOES-IR) sensors. Moreover, there are often multiple instruments on a single satellite (e.g., precipitation radar and microwave imager on TRMM), each of which measures rainfall with different footprints and resolutions. A wide range of methodologies, including weighted averaging, regression, filtering, and neural networks, has been applied to combine microwave and Geo-IR rainfall signals [e.g., Adler et al., 2003; Huffman et al., 1995; Sorooshian et al., 2000; Huffman et al., 2001; Hong et al., 2004; Huffman et al., 2007]. Furthermore, a few studies have addressed methodologies to optimally combine the products of the TRMM precipitation radar (PR) with the TRMM microwave imager (TMI) using Bayesian inversion and weighted least squares (WLS) approaches [e.g., Masunaga and Kummerow, 2005; Kummerow et al., 2010]. From another direction, Gaussian filtering methods on Markovian tree-like structures, the so-called scale recursive estimation (SRE), have been proposed to merge spaceborne and ground-based rainfall observations at multiple scales [e.g., Gorenburg et al., 2001; Tustison et al., 2003; Bocchiola, 2007; Van de Vyver and Roulin, 2009; Wang et al., 2011], see also Kumar [1999] for soil moisture applications. Recently, using the Gaussian-scale mixture probability model and an adaptive filtering approach, Ebtehaj and Foufoula-Georgiou [2011a] proposed a fusion methodology in the wavelet domain to merge TRMM-PR and ground-based NEXRAD measurements, aiming to preserve the non-Gaussian structure and local extremes of precipitation fields.

[5] Data assimilation has played an important role in improving the skill of environmental forecasts and has become by now a necessary step in operational predictive models [see Daley, 1993]. Data assimilation amounts to integrating the underlying knowledge from the observations into the first guess or the background state, typically provided by a physical model from the previous forecast step. The goal is then to obtain an improved estimate of the current state of the system with reduced uncertainty, the so-called analysis. The analysis is then used to forecast the state at the next time step and so on (see Daley [1993] and Kalnay [2003] for a comprehensive review). One of the most common approaches to the data assimilation problem relies on variational techniques [e.g., Sasaki, 1958; Lorenc, 1986; Talagrand and Courtier, 1987; Courtier and Talagrand, 1990; Parrish and Derber, 1992; Zupanski, 1993; Courtier et al., 1994; Reichle et al., 2001b; Margulis and Entekhabi, 2003; among many others]. In these methods, one explicitly defines a cost function, typically quadratic, whose unique minimizer is the analysis state. On the other hand, very recently, Freitag et al. [2012] proposed a regularized variational data assimilation scheme to improve assimilation results in advection-dominated flow in the presence of sharp weather fronts.

[6] The common thread in the DS, DF, and DA problems is that, in all of them, we seek an improved estimate of the true state given a suite of noisy and down-sampled observations and/or uncertain model-predicted states. Specifically, let us suppose that the unknown true state in continuous space is denoted by x(t) and its indirect observation (or model output), by y(r). Let us also assume that x(t) and y(r) are related via a linear integral equation, called the Fredholm integral equation of the first kind, as follows:

$$\int_0^1 H(r,t)\,x(t)\,dt = y(r), \quad 0 \le r \le 1, \qquad (1)$$

where $H(r,t)$ is the known kernel relating x(t) and y(r). Recovery of x(t) knowing y(r) and $H(r,t)$ is a classic linear inverse problem. Clearly, the deconvolution problem is a very special case with the kernel of the form $H(r-t)$, which in its discrete form plays a central role in this paper. Linear inverse problems are by nature ill-posed, in the sense that they do not satisfy at least one of the following three conditions: (1) existence, (2) uniqueness, and (3) stability of the solution. For instance, when due to the kernel architecture, the dimension of the observation is smaller than that of the true signal, infinite choices of x(t) may lead to the same y(r) and there is no unique solution for the problem. For the case when y(r) is noisy and has a larger dimension than the true state, the solution is typically very unstable because the high-frequency components in y(r) are typically amplified and spoil the solution in the inversion process. A common approach to make an inverse problem well posed is via the so-called regularization methods [e.g., Hansen, 2010]. The goal of regularization is to properly constrain the inverse problem aiming to obtain a unique and sufficiently stable solution. The choice of regularization typically depends on the continuity and degree of
smoothness of the state variable of interest, often called the regularity condition. For instance, some state variables or environmental fluxes are very regular with a high degree of smoothness and differentiability (e.g., pressure), while others might be more irregular and suffer from frequent and different sorts of discontinuities (e.g., rainfall). In fact, it can be shown that the proper choices of regularization not only yield unique and stable solutions but also reinforce the underlying regularity of the true state in the solution. It is important to note that different regularity conditions are theoretically consistent with different statistical signatures in the true state, a fact that may guide proper design of the regularization, as explored in this study.

[7] The central goal of this paper is to propose a unified framework for the class of DS, DF, and DA problems by recasting them as discrete linear inverse problems using a relevant regularization in the derivative space, aiming to solve them more accurately compared to the classic weighted least squares (WLS) formulations. From a statistical standpoint, the main motivation is to explicitly incorporate non-Gaussianity of the underlying state in the derivative domain as prior knowledge to obtain an improved estimate of jump and isolated extreme variabilities in the time-space structure of the hydrometeorological state of interest. Note that the proposed framework relies on the seminal works by, for example, Tibshirani [1996], Chen et al. [2001], Candes and Tao [2006], and recent developments in mathematical formalisms of inverse problems [e.g., Hansen, 2010; Elad, 2010], which have received a great deal of attention in statistical regression and image processing, but are relatively new to the communities of hydrologic and atmospheric sciences. To the best of our knowledge, in these areas, the only studies that explore these methodologies are Ebtehaj et al. [2012] and Freitag et al. [2012] for rainfall downscaling and data assimilation of sharp fronts, respectively.

[8] The presented methodologies for the DS and DF problems are examined through downscaling and data fusion of remotely sensed rainfall observations, which have fundamental applications in flash flood predictions, especially in small watersheds [Rebora et al., 2005; Siccardi et al., 2005; Rebora et al., 2006]. We show that the presented methodologies allow us to improve the quality of rainfall estimation and reduce estimation uncertainty by recovering the small-scale high-intensity rainfall extreme features, which have been lost in the low-resolution sampling of the sensor. For the DA family of problems, the promise of the presented framework is demonstrated via an elementary example using the heat equation, which plays a key role in the study of land surface heat and mass fluxes [e.g., Peter-Lidard et al., 1997; Liang et al., 1999]. The results demonstrate that the accuracy of the analysis and forecast cycles in a DA problem can be markedly improved, compared to the classic variational methods, especially when the initial state exhibits different forms of discontinuities.

[9] Section 2 provides conceptual insight into the discrete inverse problems. Section 3 describes the DS problem in detail, as a primitive building block for the other studied problems. Important classes of regularization methods are explained and their statistical interpretation is briefly discussed from the Bayesian point of view. Examples on rainfall downscaling are presented in this section by taking into account the specific regularity and statistical distribution of the rainfall fields in the derivative space. Section 4 is devoted to the regularized DF class of problems with examples and results on remotely sensed rainfall data. The regularized DA problem is discussed in section 5. Concluding remarks and future research perspectives are presented in section 6. The important duality between regularization and its statistical interpretation is further presented in Appendix A, while Appendix B is devoted to algorithmic details important for implementation of the proposed methodologies.

2. Discrete Inverse Problems: Conceptual Framework

[10] In this section, we briefly explain the conceptual key elements of discrete linear inverse estimation relevant to the problems at hand and leave further details for the next sections. Analogous to equation (1), linear discrete inverse problems typically amount to estimating the true high-resolution m-element state vector $x \in \mathbb{R}^m$ from the following observation model:

$$y = Hx + v, \qquad (2)$$

where $y \in \mathbb{R}^n$ denotes the observations (e.g., output of a sensor), $H \in \mathbb{R}^{n \times m}$ is an $n \times m$ observation operator which maps the state space onto the observation space, and $v \sim \mathcal{N}(0, R)$ is the Gaussian error in $\mathbb{R}^n$. Note that the observation operator, which is a discrete representation of the kernel in equation (1), and the noise covariance are supposed to be known or properly calibrated. Depending on the relative dimension of y and x, this linear system can be underdetermined ($m \gg n$) or overdetermined ($m \ll n$). In the underdetermined case, there are infinitely many different x's that satisfy equation (2), while for the overdetermined case a unique solution may not exist. As is evident, the DS problem belongs to the class of underdetermined systems because the sensor output is a coarse-scale and noisy representation of the true state. However, the class of DF and DA problems falls into the category of overdetermined systems, as the total size of the observations and background state exceeds the dimension of the true state.

[11] In each of the above cases, we may naturally try to obtain a solution with minimum error variance by solving a linear WLS problem. However, for the underdetermined case the solution still does not exist, while for the overdetermined case it is commonly ill-conditioned and sensitive to the observation noise (see section 4). Therefore, the minimum variance WLS treatment cannot properly make the above inverse problems well posed. To obtain a unique and stable solution, the basic idea of regularization is to further constrain the solution. For instance, among many solutions that fit the observation model in equation (2), we can obtain the one with minimum energy, mean-squared curvature, or total variation. The choice of this constraint or regularization highly depends on a priori knowledge about the underlying regularity of x. For sufficiently smooth x, we naturally may promote a solution with minimum mean-squared curvature to impose the desired smoothness on the solution. However, if the state is nonsmooth and contains
frequent jumps and discontinuities, a solution with minimum total variation might be a better choice. In subsequent sections, we explain these concepts in more detail for the DS, DF, and DA problems with examples relevant to some land-surface hydrometeorological problems.

3. Regularized Downscaling

3.1. Problem Formulation

[12] To put the DS problem in a linear inverse estimation framework, we recognize that in the observation model of equation (2), the true high-resolution (HR) state $x \in \mathbb{R}^m$ has a larger dimension than the low-resolution (LR) observation vector $y \in \mathbb{R}^n$, that is, $m \gg n$. Throughout this work, a notation is adopted in which the vector $x \in \mathbb{R}^m$ may also represent, for example, a 2-D field $X \in \mathbb{R}^{\sqrt{m} \times \sqrt{m}}$, which is vectorized in a fixed order (e.g., lexicographical).

[13] As explained in the previous section, the DS problem naturally amounts to obtaining the best WLS estimate $\hat{x}$ of the HR or fine-scale true state as follows:

$$\hat{x} = \underset{x}{\operatorname{argmin}} \left\{ \tfrac{1}{2}\,\|y - Hx\|_{R^{-1}}^{2} \right\}, \qquad (3)$$

where $\|x\|_A^2 = x^T A x$ denotes the quadratic norm, while $A$ is a positive definite matrix. Due to the ill-posed nature of the problem, this optimization does not have a unique solution: when the derivative of the cost function is set to zero, the Hessian $H^T R^{-1} H$ is necessarily singular. To narrow down all possible solutions to a stable and unique one, a common choice is to regularize the problem by constraining the squared Euclidean norm of the solution to be less than a certain constant, that is, $\|Lx\|_2^2 \le \mathrm{const.}$, where $L$ is an appropriately chosen transformation and $\|x\|_2^2 = \sum_i |x_i|^2$ denotes the Euclidean $\ell_2$-norm. Note that, by putting a constraint on the Euclidean norm of the state, we not only narrow down the solutions but also implicitly suppress the large components of the inverted noise and reduce their spoiling effect on the solution.

[14] Using the theory of Lagrange multipliers, the dual form of the constrained version of the optimization in equation (3) is

$$\hat{x} = \underset{x}{\operatorname{argmin}} \left\{ \tfrac{1}{2}\,\|y - Hx\|_{R^{-1}}^{2} + \lambda\,\|Lx\|_2^2 \right\}, \qquad (4)$$

where $\lambda > 0$ is the Lagrange multiplier or the so-called regularizer. This problem is a smooth convex quadratic programming problem and is known as the Tikhonov regularization with the following unique analytical solution:

$$\hat{x} = \left( H^T R^{-1} H + 2\lambda L^T L \right)^{-1} H^T R^{-1} y, \qquad (5)$$

provided that $L^T L$ is positive definite [Tikhonov et al., 1977; Hansen, 1998; Golub et al., 1999; Hansen, 2010]. As is evident, the $L$ transformation also plays a key role in the solution of the regularized DS problem. For instance, choosing an identity matrix in equation (4) implies that we are looking for a solution with the smallest Euclidean norm (energy), while if $L$ represents a derivative operator, the above regularization term minimizes the energy in the derivative space, which naturally imposes extra smoothness on the solution.

[15] Depending on the intrinsic regularity of the underlying state and the selected $L$, other choices of the regularization term are also common. For example, in the case when $L$ projects a major part of the state vector onto (near) zero values, the preferred choice is the $\ell_1$-norm regularization [e.g., Tibshirani, 1996; Chen et al., 1998, 2001]. Such a property is often called sparse representation in the $L$ space and gives rise to the following formulation of the regularized DS problem:

$$\hat{x} = \underset{x}{\operatorname{argmin}} \left\{ \tfrac{1}{2}\,\|y - Hx\|_{R^{-1}}^{2} + \lambda\,\|Lx\|_1 \right\}, \qquad (6)$$

where the $\ell_1$-norm is $\|x\|_1 = \sum_i |x_i|$. By choosing $L$ as a derivative operator in equation (6), in effect we minimize a measure of total variation of the state of interest. It is well understood that in this case, we typically better recover discontinuities and local jump singularities compared to the $\ell_2$-norm regularization in the derivative domain. Note that, contrary to the Tikhonov regularization in equation (4), the $\ell_1$-norm regularization is a nonsmooth convex optimization problem, as the regularization term is nondifferentiable and the conventional iterative gradient descent methods are no longer applicable in their standard forms.

[16] One of the common approaches to treat the nondifferentiability in equation (6) is to replace the $\ell_1$-norm with a smooth approximation, the so-called Huber norm, $\|x\|_{\mathrm{Hub}} = \sum_i \rho_\tau(x_i)$, where

$$\rho_\tau(x) = \begin{cases} x^2, & |x| \le \tau \\ \tau\,(2|x| - \tau), & |x| > \tau \end{cases} \qquad (7)$$

and $\tau$ denotes a nonnegative threshold (Figure 1). The Huber norm is a hybrid norm that behaves similarly to the $\ell_1$-norm for values greater than the threshold, while for smaller values it is identical to the $\ell_2$-norm. From the statistical regression point of view, the sensitivity of a norm as a penalty function to the outliers depends on the (relative) values of the norm for large residuals. If we restrict ourselves to convex norms, the ones least sensitive to large residuals (the outliers) are those with linear behavior for large input arguments (i.e., $\ell_1$ and Huber). Because of this property, these norms are often called robust norms [Huber, 1964, 1981; Boyd and Vandenberghe, 2004]. Throughout this paper, for solving equation (6), we use the Huber relaxation due to its simplicity, efficiency, and adaptivity to all of the concerned classes of DS, DF, and DA problems. This issue is further discussed in Appendix B.

[17] In general, the first term in equations (4) and (6) measures how well the solution approximates the given (noisy) data, while the second term imposes a specific regularity on the solution. In effect, the regularizer plays a trade-off role between making the fidelity to the observations sufficiently large, while not imposing too much regularity (degree of smoothness) on the solution. The smaller the value of $\lambda$, the more weight is given to fitting the (noisy) observations, which typically results in solutions that are less regular and prone to overfitting. On the other hand, the larger the value of $\lambda$, the more weight is given to
5947
EBTEHAJ AND FOUFOULA-GEORGIOU: REGULARIZED DOWNSCALING, DATA FUSION, AND ASSIMILATION
5948
EBTEHAJ AND FOUFOULA-GEORGIOU: REGULARIZED DOWNSCALING, DATA FUSION, AND ASSIMILATION
Figure 2. Two-dimensional mathematical models for the smoothing and down-sampling properties of an LR sensor via the convolution operation. (a) A simple representation of an observation model for a neighborhood of size 3 × 3 using a simple smoothing (averaging) observation operator. (b and c) A sample effect of the filtering operation ($C$) and its transpose ($C^T$) on a discrete 2-D unit pulse, given the 3 × 3 kernel on the left. (d) A sample effect of the 2-D down-sampling operator ($D$) and its transpose ($D^T$) with scaling ratio 2.
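The operators in Figure 2 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names (`smooth`, `downsample`, `downsample_T`, `observe`, `observe_T`) are hypothetical, the kernel is assumed to be a uniform average of odd size, and zero padding at the field boundary is an assumption. Under these assumptions the smoothing operator is symmetric, so its transpose is the same filtering operation.

```python
import numpy as np

def smooth(x, k=3):
    """Operator C: k-by-k uniform averaging with zero padding (k must be odd)."""
    assert k % 2 == 1, "this sketch assumes an odd kernel size"
    pad = k // 2
    xp = np.pad(x, pad, mode="constant")
    out = np.zeros(x.shape, dtype=float)
    # Sum the k*k shifted copies of the padded field, then normalize
    for di in range(k):
        for dj in range(k):
            out += xp[di:di + x.shape[0], dj:dj + x.shape[1]]
    return out / (k * k)

def downsample(x, s=2):
    """Operator D: keep every s-th pixel in each direction."""
    return x[::s, ::s]

def downsample_T(y, shape, s=2):
    """Operator D^T: put LR pixels back on the HR grid, zeros elsewhere."""
    x = np.zeros(shape, dtype=float)
    x[::s, ::s] = y
    return x

def observe(x, k=3, s=2):
    """Noise-free LR observation y = D C x."""
    return downsample(smooth(x, k), s)

def observe_T(y, shape, k=3, s=2):
    """Adjoint H^T y = C^T D^T y (C^T = C for a symmetric kernel with zero padding)."""
    return smooth(downsample_T(y, shape, s), k)

# Effect of C on a discrete 2-D unit pulse: the pulse spreads over a 3 x 3 patch
pulse = np.zeros((6, 6))
pulse[2, 2] = 1.0
print(smooth(pulse))
print(observe(pulse))
```

Writing the operator and its adjoint as functions rather than explicit matrices avoids storing an $n \times m$ array for 2-D fields; the pair can be checked with the usual inner-product test $\langle Hx, y \rangle = \langle x, H^T y \rangle$.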
to be properly estimated and calibrated based on observational and theoretical studies [e.g., Ciach and Krajewski, 1999; Hossain and Anagnostou, 2005, 2006; Krajewski et al., 2011; Maggioni et al., 2012; AghaKouchak et al., 2012].

[22] The choice of the regularization term also plays a very important role on the accuracy of the DS solution. Figure 4a demonstrates a NEXRAD reflectivity snapshot (resolution of 1 × 1 km) over the Texas TRMM satellite ground validation site, while Figure 4b displays the standardized histogram of the discrete Laplacian coefficients (second-order differences) and the fitted exponential of the form $p(x) \propto \exp(-\lambda |x|)$. It is seen that the analyzed rainfall image exhibits (nearly) sparse representation in the derivative space with a large mass around zero and heavier tail than the Gaussian.

Figure 3. (a) A uniform smoothing (low pass) kernel of size $s_c \times s_c$. (b) The discrete (high pass) generalized Laplacian filter of size 3 × 3, whose filter parameter ranges between 0 and 1. The Laplacian coefficients, obtained by filtering the 2-D state with the Laplacian kernel, are approximate measures of the second-order derivative. Throughout this paper, we set this parameter to 0.5, which corresponds to the standard second-order differencing operation.

[23] This well-behaved non-Gaussian structure in the derivative space mainly arises due to the presence of spatial coherent and correlated patterns in the rainfall fields which contain sharp transitions (large gradients) and isolated singularities (high-intensity rain cells). In effect, over the large areas of almost uniform rainfall reflectivity values, a measure of derivative translates those values into a large number of (near) zero coefficients; however, over the less frequent jumps and isolated high-intensity rain cells, derivative coefficients are markedly larger than zero and form the tails. Note that this non-Gaussianity is due to the intrinsic spatial structure of rainfall fields and cannot be resolved by a logarithmic or power law transformation (e.g., Z-R relationship). It is seen that after applying a relevant Z-R relationship on the reflectivity fields, the shape of the rainfall histogram remains non-Gaussian and still can be approximated by the Laplace density (not shown here).

[24] The universality of this statistical structure in the distribution of derivative coefficients has been observed in many rainfall reflectivity fields [Ebtehaj and Foufoula-Georgiou, 2011b], denoting that the choice of the Laplace prior and $\ell_1$-norm regularization is preferred in the rainfall DS problems rather than the choice of the Tikhonov
Figure 4. A rainfall reflectivity field and the distribution of its standardized Laplacian coefficients, $Lx_n = Lx/\mathrm{std}(Lx)$, where $\mathrm{std}(\cdot)$ is the standard deviation. (a) NEXRAD reflectivity snapshot at the TRMM GV-site in Houston, TX (HSTN) on 11/13/1998 (00:02:00 UTC) at scale 1 × 1 km. (b) The histogram of the standardized Laplacian coefficients, with the filter parameter set to 0.5 (Figure 3b) and (c) their corresponding log histogram. Note that the zero coefficients over the nonrainy background have been excluded from the histogram analysis. The solid line in Figure 4b is the least squares fitted exponential of the form $p(x) \propto \exp(-\lambda |x|)$, and the dash-dot line shows a standard normal distribution for comparison. The log histogram in Figure 4c contrasts the heavy-tailed structure of the Laplacian coefficients versus the Gaussian distribution, clearer than the original histogram in Figure 4b.
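The heavy-tailed behavior of the standardized Laplacian coefficients can be illustrated with a small synthetic experiment. The sketch below is not based on the NEXRAD data: the field is a made-up stand-in (blocky "rain cells" plus weak noise), the standard 5-point second-difference stencil is assumed in place of the generalized Laplacian filter of Figure 3b, and the helper names are hypothetical. The excess kurtosis is used as a simple tail indicator (0 for a Gaussian, 3 for a Laplace density).

```python
import numpy as np

def laplacian_coeffs(field):
    """Second-order differences (5-point stencil) on the interior pixels."""
    f = np.asarray(field, dtype=float)
    return (f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
            - 4.0 * f[1:-1, 1:-1])

def standardize(c):
    """Standardized coefficients, Lx_n = Lx / std(Lx)."""
    return c / np.std(c)

def excess_kurtosis(c):
    """Fourth standardized moment minus 3; larger values mean heavier tails."""
    z = (c - c.mean()) / c.std()
    return np.mean(z ** 4) - 3.0

# Synthetic stand-in for a rain field (assumption): isolated cells with sharp edges
rng = np.random.default_rng(0)
field = np.zeros((128, 128))
for _ in range(15):
    i, j = rng.integers(8, 120, size=2)
    field[i - 4:i + 4, j - 4:j + 4] += rng.uniform(0.2, 1.0)
field += 0.01 * rng.standard_normal(field.shape)   # weak background variability

coeffs_n = standardize(laplacian_coeffs(field))
gauss = rng.standard_normal(coeffs_n.size)

print("excess kurtosis, Laplacian coefficients:", excess_kurtosis(coeffs_n))
print("excess kurtosis, Gaussian sample:       ", excess_kurtosis(gauss))
```

Most coefficients are close to zero over the uniform areas, while the few large values at the cell edges form the tails, which is the qualitative picture described for Figure 4.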
regularization. Throughout this paper, we use the Laplacian for $L$ not only for its sparsifying effect on rainfall fields but also because of our empirical evidence about its stabilizing role and computational adaptability for rainfall downscaling and data fusion problems.

[25] In practice, the histogram of the derivatives may exhibit a thicker tail than the Laplace density, requiring a heavier tail probability model, such as the Generalized Gaussian Density (GGD) of the form $p(x) \propto \exp(-\lambda |x|^p)$, where $p < 1$ [see Ebtehaj and Foufoula-Georgiou, 2011b]. However, using such a prior model gives rise to a nonconvex optimization problem in which convergence to the global minimum cannot be easily guaranteed. Therefore, the choice of the $\ell_1$-norm (the Laplace prior) for rainfall downscaling is indeed the closest convex relaxation that can partially fulfill the strict statistical interpretation of the rainfall fields in derivative domains. Following our observations related to the distribution of the rainfall derivatives, here we direct our attention to the Huber penalty function as a smooth approximation of the $\ell_1$ regularization, and cast the rainfall DS as the following constrained variational problem:

$$\hat{x} = \underset{x}{\operatorname{argmin}} \left\{ \tfrac{1}{2}\,\|y - Hx\|_{R^{-1}}^{2} + \lambda\,\|Lx\|_{\mathrm{Hub}} \right\} \quad \text{s.t.} \quad x \succeq 0. \qquad (8)$$

[26] Obviously, the constraint is due to the nonnegativity of the rainfall fields. In this study, we adopted the gradient projection (GP) method [Bertsekas, 1999, p. 228] to solve the above variational problem (see Appendix B).

3.2.2. Rainfall Downscaling Results

[27] The same rainfall snapshot shown in Figure 4 has been used to examine the performance of the proposed regularized DS methodology. Throughout the paper, to make the reported parameters independent of the intensity range, the rainfall reflectivity fields are first scaled into the range between 0 and 1; however, the downscaling results are presented in the true range.

[28] To demonstrate the performance of the proposed regularized DS methodology, the NEXRAD HR observation $x$ was assumed as the true state, while the LR observations $y$ were obtained by smoothing $x$ with an average filter of size $s_c \times s_c$, followed by a down-sampling operator with ratio $s_c$. Given the true state and constructed LR observations, we can quantitatively examine the effectiveness of the presented DS methodology by comparing the downscaled HR fields with the true HR field using some common quality metrics.

[29] Both the Huber and Tikhonov regularization methods were examined to downscale the observations from scales 4 × 4 and 8 × 8 km down to 1 × 1 km (Figure 5). A very small amount of white noise $v$ with standard deviation of 1e-2 (5% of the standard deviation of the reference rainfall field only over the wetted areas) was added to the LR observations (equation (2)), giving rise to a diagonal error covariance matrix. In both of the regularization methods, for downscaling from 4-to-1 and 8-to-1 km in grid spacing, the regularization parameter $\lambda$ was set to 5e-3 and 1e-2, respectively. These values were selected through trial and error; however, there are some formal methods for automatic estimation of this parameter, which are left for future work [e.g., Hansen, 2010, chap. 5]. In our experiments, it turned out that small values of the Huber threshold $\tau$, typically less than 10% of the field maximum range of variability, led to a successful recovery of isolated singularities and local extreme rainfall cells (Figures 6 and 7).

[30] In the studied snapshot, coarse graining of the rainfall reflectivity fields to the scales of 4 × 4 and 8 × 8 km was equivalent to losing almost 20% and 30% of the rainfall energy in the reflectivity domain in terms of the relative root-mean-square error (RMSE), $\mathrm{RMSE} = \|x - \hat{x}\|_2 / \|x\|_2$ (see Table 1). Note that to compute the RMSE of the LR observations, the size of those fields was extended to the size of the true field using nearest-neighbor interpolation, that is, each LR pixel was replaced with $s_c \times s_c$ pixels with the same intensity value. In addition to the
Figure 5. Sample results of the rainfall regularized downscaling (DS). (a) True HR rainfall reflectivity: NEXRAD snapshot at the TRMM GV-site in Houston, TX (HSTN) on 11/13/1998 (00:02:00 UTC) at resolution 1 × 1 km. (b and c) The synthetically generated, 4 × 4 and 8 × 8 km, coarse-scale and noisy observations of the true rainfall reflectivity field. Left column: (d) Tikhonov and (f) Huber regularization results for downscaling from 4 to 1 km ($\tau = 0.02$). Right column: (e) Tikhonov and (g) Huber regularized DS for downscaling from 8 to 1 km ($\tau = 0.04$). Zooming views of the delineated box in Figure 5g are shown in Figure 6.
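The two estimators compared in Figure 5 can be reproduced on a toy 1-D problem. The sketch below is illustrative only and makes several assumptions that are not in the paper: a 1-D signal instead of a 2-D rainfall field, block averaging for the observation operator, a square second-difference matrix with zero boundary values for $L$ (so that $L^T L$ is positive definite), unit observation weights ($R^{-1} = I$), and a fixed-step projected gradient iteration standing in for the gradient projection method of Appendix B. All function names are hypothetical.

```python
import numpy as np

def make_operators(m, s):
    """H: block averaging and down-sampling by s (n x m); L: second differences (m x m)."""
    n = m // s
    H = np.zeros((n, m))
    for i in range(n):
        H[i, i * s:(i + 1) * s] = 1.0 / s
    L = -2.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)
    return H, L

def tikhonov(y, H, L, Rinv, lam):
    """Closed-form Tikhonov solution: (H^T R^-1 H + 2 lam L^T L)^-1 H^T R^-1 y."""
    A = H.T @ Rinv @ H + 2.0 * lam * (L.T @ L)
    return np.linalg.solve(A, H.T @ Rinv @ y)

def huber(u, tau):
    """Huber function: u^2 below the threshold, tau*(2|u| - tau) above it."""
    a = np.abs(u)
    return np.where(a <= tau, u ** 2, tau * (2.0 * a - tau))

def huber_cost(x, y, H, L, Rinv, lam, tau):
    """Data misfit plus Huber penalty on L x."""
    r = y - H @ x
    return 0.5 * r @ Rinv @ r + lam * np.sum(huber(L @ x, tau))

def huber_gp(y, H, L, Rinv, lam, tau, n_iter=2000):
    """Projected gradient iterations for the Huber-regularized problem with x >= 0."""
    # Upper bound on the Lipschitz constant of the gradient sets a safe step size
    lip = np.linalg.norm(H.T @ Rinv @ H, 2) + 2.0 * lam * np.linalg.norm(L, 2) ** 2
    x = np.maximum(H.T @ y, 0.0)                 # simple nonnegative starting point
    for _ in range(n_iter):
        # Derivative of the Huber function is 2 * clip(u, -tau, tau)
        grad = -H.T @ Rinv @ (y - H @ x) + lam * L.T @ (2.0 * np.clip(L @ x, -tau, tau))
        x = np.maximum(x - grad / lip, 0.0)      # gradient step, then projection
    return x

# Toy signal: flat background with two isolated high-intensity blocks
m, s = 64, 4
x_true = np.zeros(m)
x_true[20:28] = 1.0
x_true[40:44] = 0.5
H, L = make_operators(m, s)
rng = np.random.default_rng(1)
y = H @ x_true + 1e-2 * rng.standard_normal(m // s)
Rinv = np.eye(m // s)                            # unit weights (assumption)

x_tik = tikhonov(y, H, L, Rinv, lam=5e-3)
x_hub = huber_gp(y, H, L, Rinv, lam=5e-3, tau=0.02)
print("max recovered value, Tikhonov:", x_tik.max())
print("max recovered value, Huber:   ", x_hub.max())
```

The fixed step $1/\mathrm{lip}$ is a conservative choice that guarantees a nonincreasing cost; a line search would normally converge faster. The parameter values are borrowed from the experiment description only for illustration and are not tuned for this toy problem.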
Figure 6. A zoomed view for qualitatively comparing the Tikhonov (a and c) versus the Huber (b and d) regularization for the downscaling (DS) example in Figure 5. The results indicate a marginally improved performance by the Huber regularization, especially for the smaller scaling ratio. The Huber regularization yields sharper results and is better able to recover high-intensity rainfall cells and the correct range of variability; see Table 1 for a quantitative comparison using a suite of metrics and Figure 7.
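The scalar quality metrics used for the quantitative comparison in this section (relative RMSE, relative MAE, and PSNR) follow directly from their definitions and can be sketched as below. The function names are hypothetical; the SSIM of Wang et al. [2004] is omitted here because it requires a windowed computation that is beyond a short sketch.

```python
import numpy as np

def relative_rmse(x, x_hat):
    """Relative RMSE: ||x - x_hat||_2 / ||x||_2."""
    x, x_hat = np.asarray(x, float).ravel(), np.asarray(x_hat, float).ravel()
    return np.linalg.norm(x - x_hat, 2) / np.linalg.norm(x, 2)

def relative_mae(x, x_hat):
    """Relative MAE: ||x - x_hat||_1 / ||x||_1."""
    x, x_hat = np.asarray(x, float).ravel(), np.asarray(x_hat, float).ravel()
    return np.sum(np.abs(x - x_hat)) / np.sum(np.abs(x))

def psnr(x, x_hat):
    """PSNR in dB: 20 log10( max(x_hat) / std(x - x_hat) )."""
    x, x_hat = np.asarray(x, float).ravel(), np.asarray(x_hat, float).ravel()
    return 20.0 * np.log10(np.max(x_hat) / np.std(x - x_hat))

# Small made-up example: a reference field and a slightly perturbed estimate
x_ref = np.array([[1.0, 2.0], [3.0, 4.0]])
x_est = x_ref + np.array([[0.1, -0.1], [0.1, -0.1]])
print("relative RMSE:", relative_rmse(x_ref, x_est))
print("relative MAE: ", relative_mae(x_ref, x_est))
print("PSNR (dB):    ", psnr(x_ref, x_est))
```

Because the PSNR uses the maximum of the estimate in the numerator, it rewards estimates that recover the peak values in addition to having a small error spread.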
relative RMSE measure, we also used three other metrics: (1) relative mean absolute error (MAE), $\mathrm{MAE} = \|x - \hat{x}\|_1 / \|x\|_1$; (2) a logarithmic measure often called the peak signal-to-noise ratio (PSNR), $\mathrm{PSNR} = 20 \log_{10}\left( \max(\hat{x}) / \mathrm{std}(x - \hat{x}) \right)$, where $\mathrm{std}(\cdot)$ denotes the standard deviation; and (3) the structural similarity index (SSIM) by Wang et al. [2004]. The PSNR (in dB) represents a measure that not only contains RMSE information but also encodes the recovered range. The SSIM varies between −1 and 1, and the upper bound refers to the case where the estimated $\hat{x}$ and reference (true) field $x$ are perfectly matched. The SSIM metric is popular in the image processing community as it takes into account not only the marginal statistics such as the RMSE but also the correlation structure between the estimated and reference field. This metric seems very promising for analyzing the forecast mismatch with observations in hydrometeorological studies, especially when the large-scale systematic errors (e.g., displacement error) might be more dominant than the random errors; see Ebtehaj et al. [2012] for applications of SSIM in rainfall downscaling.

[31] On average, it is seen that almost 25% of the lost relative energy of the rainfall reflectivity fields can be restored via the regularized DS (Table 1). The $\ell_2$-norm regularization led to smoother results, and as the scaling ratio grows, this regularization was almost incapable of recovering the peaks and the correct variability range of the rainfall reflectivity field (Figure 6). Typically, as expected, the Huber-norm regularization results are slightly better than the Tikhonov ones, although not always significantly. For large scaling ratios (i.e., $s_c > 4$), the results of those methods tended to coincide in terms of the selected lump quality metrics such as the RMSE. However, using the Huber regularization, the recovered range was markedly better than that by the Tikhonov regularization, as reflected in the PSNR metric and recovered range. For example, in downscaling from 8-to-1 km via the Tikhonov regularization, the maximum recovered reflectivity values are approximately 41 dBZ, while using the Huber-norm regularization the maximum values are 45 dBZ (Figure 5). Employing the classic Z-R relationship for the NEXRAD products (i.e., $Z = 300R^{1.4}$), one can easily check that the
rain rates associated with the above reflectivity values are approximately 15 and 28 mm/h, respectively. Therefore, although the lump quality metrics are comparable for the two methods in the reflectivity domain, the main advantage of the Huber norm over the $\ell_2$-norm is the recovery of local extreme rain rates (Figure 7). It is clear from the quantile-quantile plots in Figures 7a and 7b that for a small scaling ratio, for example, $s_c = 4$, the Huber regularization can very well reproduce both the tail and the body of the true rainfall distribution. However, the tail of the recovered rainfall distribution falls below the true rainfall distribution

Figure 7. The quantiles of the standard normal density versus the standardized distribution of the recovered rain rates (mm/h), using the $Z = 300R^{1.4}$ relationship, for the true HR field (red cross), the observed LR field (black plus), the downscaled HR fields via the Tikhonov regularization (green circle), and the Huber-norm regularization (blue square), respectively. (a and b) The quantile-quantile plots for the HR fields obtained by downscaling from 4 × 4 to 1 × 1 km and 8 × 8 to 1 × 1 km, respectively. The rainfall quantile values are only for the positive rainy part of the fields and are standardized by subtracting the mean and dividing by the standard deviation. The qq-plots signify that the Huber regularization performs better than the Tikhonov, especially over the tails, which represent the recovery of high-intensity and extreme rainfall values from the LR observations.

Table 1. Results Showing the Effectiveness of the Proposed Regularized DS in Reducing the Estimation Error and Increasing the Accuracy of the Estimated Rainfall Fields^a
Metric^b | Observations Versus True: 4 × 4 km, 8 × 8 km | Tikhonov-DS Versus True: 4 × 4 km, 8 × 8 km | Huber-DS Versus True: 4 × 4 km, 8 × 8 km

[33] Note that the solution of the above problem not only contains information about all of the available observations (fusion) but also, with proper design of the observation operators, allows us to obtain an HR estimate of the state of interest (downscaling). Clearly, the inverse of each covariance matrix in equation (10) encodes the relative contribution or weight of each observation $y_i$ in the cost function. In other words, if the elements of the covariance matrix of a particular observation vector are large compared to those of the other observation vectors, the contribution of that observation to the obtained solution is naturally less significant.

[34] For notational convenience, the above system of equations can be augmented as follows:

$$\begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} H_1 \\ \vdots \\ H_N \end{bmatrix} x + \begin{bmatrix} v_1 \\ \vdots \\ v_N \end{bmatrix} \;\Rightarrow\; y = Hx + v, \tag{11}$$

where the concatenated error vector $v$ has the following block diagonal covariance matrix,

$$R = \mathbb{E}\left[vv^T\right] = \begin{bmatrix} R_1 & & 0 \\ & \ddots & \\ 0 & & R_N \end{bmatrix}. \tag{12}$$

[35] Therefore, the DF problem can be recast as the classic problem of estimating the true state from the augmented observation model $y = Hx + v$. Thus, setting the gradient of the cost function in equation (10) to zero yields the following linear system:
$$H^T R^{-1} H \hat{x} = H^T R^{-1} y. \tag{13}$$

[36] This problem is overdetermined with a unique solution; however, the Hessian $H^T R^{-1} H$ is likely to be very ill-conditioned. This ill-conditioning typically gives rise to an unstable solution with large estimation error [e.g., Elad and Feuer, 1997; Hansen, 2010]. Similar to the DS problem, one possible remedy for stabilizing the solution is regularization. Recalling the formulation discussed in the previous section, a general regularized form of the rainfall DF problem can be written as

$$\hat{x} = \operatorname*{argmin}_x \left\{ \frac{1}{2}\|y - Hx\|^2_{R^{-1}} + \lambda\,\psi_L(x) \right\} \quad \text{s.t. } x \succeq 0, \tag{14}$$

where the convex regularization function $\psi_L(x)$ can take different penalty norms, such as the Tikhonov $\|Lx\|_2^2$, the $\ell_1$-norm $\|Lx\|_1$, or the Huber norm $\|Lx\|_{\mathrm{Hub}}$. As is evident, similar to the DS problem, the solution of equation (10) is equivalent to the frequentist ML estimator of the HR field, while equation (14) is the Bayesian MAP estimator. For further explanations and statistical interpretations, please see Appendix A.

4.2. Application in Rainfall Data Fusion and Results

[37] To quantitatively evaluate the effectiveness of the proposed regularized DF methodology for rainfall data, we constructed two synthetic LR and noisy observations from the original HR NEXRAD reflectivity snapshot. To resemble different sensing protocols and specifications, we chose different smoothing and down-sampling operations to construct each of the synthetic observation fields. The first observation field $y_1$ was produced at resolution 6 × 6 km using a simple averaging filter of size 6 × 6, followed by a down-sampling ratio of $s_c = 6$. Analogously, the second field $y_2$ was generated at scale 12 × 12 km using a Gaussian smoothing kernel of size 12 × 12 with a standard deviation of 4 km, followed by a down-sampling ratio of $s_c = 12$. To resemble the measurement random error, white Gaussian errors with standard deviations of 1e-2 and 2e-2 were also added, respectively, which are equivalent to 5% and 10% of the standard deviation of the reference rainfall field only over the wetted areas. Roughly speaking, this selection of the error magnitudes implies that the degree of confidence (relative weight) on the observations at 6 × 6 km is twice that of the observations at 12 × 12 km. Here we restrict our consideration to the Huber-norm regularization because of its consistency with the underlying rainfall statistics and its better performance in recovering the heavy-tailed structure of rainfall (Figure 7). To solve the DF problem, we have used the same settings for the gradient projection (GP) method as explained in Appendix B.

[38] The solution of the ill-conditioned WLS formulation, or the ML estimator in equation (10), is blocky, out of range, and severely affected by the amplified inverted noise (Figure 8c). On the other hand, the regularized DF can properly restore a fine-scale and coherent estimate of the rainfall field. The results show that more than 30% of the uncaptured subgrid energy of the examined rainfall reflectivity field can be restored by the proposed methodology (Table 2). As is evident, improvements of the selected fidelity measures in the DF problem are more pronounced compared to the results of the DS experiment (see Table 1). This naturally arises because more observations are available in the DF problem than in the DS one, and thus the solution is better constrained. In terms of the selected lump metrics, analogous to the DS problem, we observed that the Huber-norm regularization is marginally better than the Tikhonov regularization; those results are not reported here. However, as expected, in terms of recovery of the heavy-tailed structure of the rainfall, the Huber-norm regularization captures the lost extreme values much better than the Tikhonov regularization (see Figure 9). It is clear from Figure 9 that the Huber-norm regularization captures the local extreme rainfall intensity values very well, while the Tikhonov regularization falls short and can only partially recover those extreme intensities.

5. Regularized Variational Data Assimilation

5.1. Problem Formulation

[39] Compared to the previously explained problems of downscaling and data fusion, the data assimilation problem is more involved in the sense that we also need to incorporate the evolution of a dynamical system in the estimation process. Despite the increased complexity, DA shares the same principles with the explained formulations of the DS and DF problems, from the estimation point of view. Here we briefly explain the classic linear three-dimensional variational (3D-VAR) data assimilation scheme and extend its formulation to a regularized format. Sample results of the
Figure 8. Data fusion and downscaling of multisensor remotely sensed rainfall reflectivity fields using the Huber regularization. (a and b) Reconstructed LR and noisy rainfall observations at scale 6 and 12 km in grid spacing. (c) The results of the WLS solution in equation (10) and (d) the solution of the regularized DF using the Huber norm with $\lambda$ = 1e-3 and a Huber threshold of 1e-2.
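
The weighting role of the error covariances in the unregularized fusion problem can be illustrated with a deliberately simplified case. The following is a minimal sketch, assuming identity observation operators and uncorrelated errors with one variance per sensor; neither assumption holds for the smoothing and down-sampling operators of the experiment above, and the function name `fuse` and the sample numbers are hypothetical. Under these assumptions the normal equations of the weighted least squares problem decouple pixel by pixel into an inverse-variance weighted mean.

```python
def fuse(observations, variances):
    """Inverse-variance weighted fusion of co-located observation vectors.

    Assumes H_i = I and R_i = variance_i * I, so that the normal equations
    H^T R^-1 H x = H^T R^-1 y reduce to a weighted mean at every pixel.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    n = len(observations[0])
    return [
        sum(w * y[j] for w, y in zip(weights, observations)) / total
        for j in range(n)
    ]


# Two hypothetical sensors observing the same two pixels; the second sensor
# has twice the error standard deviation, hence four times the variance.
y1 = [1.0, 2.0]
y2 = [2.0, 4.0]
x_hat = fuse([y1, y2], variances=[1.0, 4.0])
print(x_hat)  # close to y1, because y1 carries the larger weight
```

In this toy setting the fused value sits much nearer to the more precise observation, which is the behavior the inverse covariance weighting is meant to produce.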
Figure 10. (a) The true initial condition $x_0$ and the results of the heat equation at t = 5 and t = 100 (T) with E = 1 (L²/T). (b) The reconstructed background state obtained by adding a white noise with $\sigma_w$ = 0.05 to the true initial state and (c) the LR and noisy observations with $\sigma_v$ = 0.03, respectively. (d) The results of the classic 3D-VAR and the regularized version using the explained Tikhonov (T3D-VAR) and the Huber (H3D-VAR) regularization methods (see equation (17)). (e and f) Magnified parts of the graphs in Figure 10d over the shown zooming windows.
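
The diffusive evolution of a top-hat profile of the kind shown in Figure 10a can be sketched in a few lines. This is a minimal illustration only: the explicit finite-difference scheme, the periodic boundary, the grid size, and the value of `r` are all assumptions made here, and the names `heat_step` and `evolve` are hypothetical; the numerical solver actually used for the figure is not described in this section.

```python
def heat_step(u, r):
    """One explicit finite-difference step of the 1-D heat equation.

    The grid is treated as periodic. r = E * dt / ds**2 is the diffusion
    number and must not exceed 0.5 for the explicit scheme to be stable.
    """
    m = len(u)
    return [
        u[i] + r * (u[(i + 1) % m] - 2.0 * u[i] + u[i - 1])
        for i in range(m)
    ]


def evolve(u0, r, n_steps):
    """Apply heat_step n_steps times, starting from a copy of u0."""
    u = list(u0)
    for _ in range(n_steps):
        u = heat_step(u, r)
    return u


m = 64
top_hat = [1.0 if m // 4 <= i < 3 * m // 4 else 0.0 for i in range(m)]
later = evolve(top_hat, r=0.25, n_steps=200)
print(max(later), min(later))
```

With `r` at or below 0.5 each update is a weighted average of neighboring values, so the sharp jumps are progressively smoothed while the total amount of the scalar quantity on the periodic grid is conserved.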
$$H = \frac{1}{4}\begin{bmatrix} 1\,1\,1\,1 & 0\,0\,0\,0 & \cdots & 0\,0\,0\,0 \\ 0\,0\,0\,0 & 1\,1\,1\,1 & \cdots & 0\,0\,0\,0 \\ \vdots & \vdots & \ddots & \vdots \\ 0\,0\,0\,0 & 0\,0\,0\,0 & \cdots & 1\,1\,1\,1 \end{bmatrix} \in \mathbb{R}^{n \times m}, \tag{22}$$

and impose a white Gaussian error with $\sigma_v$ = 0.03, equivalent to 10% of the standard deviation of the true signal.

[54] The top-hat initial condition is selected to emphasize the role of regularization, especially regularization resulting from linear penalization (i.e., the Huber or $\ell_1$-norm). Clearly, the first-order derivative of the above initial condition is very sparse. In other words, the first-order derivative is zero everywhere on its domain except at the location of the two jumps, resembling a heavy-tailed and sparse statistical distribution. This underlying structure prompts us to use a regularization norm with linear penalization and a first-order differencing operator for $L$ in equation (17), as follows:

$$L = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -1 & 1 \end{bmatrix} \in \mathbb{R}^{(m-1) \times m}. \tag{23}$$

[55] Figure 10 shows the inputs of the assimilation experiment and the results of the analysis cycle, using the classic versus the regularized 3D-VAR estimators. In this example, it is clear that the classic solution is subject to overfitting, while it slightly damps the noise. Indeed, the 3D-VAR is unable to effectively damp the high-frequency error components and recover the underlying true state. This overfitting may arise because the 3D-VAR cost function is a redundant WLS estimator and contains more information (both observations and background) than needed for a proper estimation of the true state. On the other hand, in the regularized assimilation methods, not only the error term but also a cost associated with the regularity of the underlying state is minimized. The Tikhonov regularization (T3D-VAR), i.e., $\psi_L(x) = \|Lx\|_2^2$, led to a smoother result compared to the classic one with slightly improved error statistics (Table 3). However, the result of the Huber regularization (H3D-VAR), i.e., $\psi_L(x) = \|Lx\|_{\mathrm{Hub}}$, is the best. The rapidly varying noisy components are effectively damped in this regularization, while the sharp jump discontinuities are preserved better than in the T3D-VAR. The quantitative metrics in Table 3 indicate that in the analysis cycle, the RMSE and MAE metrics are improved dramatically, up to 85% in the H3D-VAR, compared to other assimilation schemes.

[56] As previously explained, there is no unique and universally accepted methodology for automated selection of the regularization parameters, namely, the regularizer $\lambda$ and the threshold of the Huber norm. Here, to select the best parameters in the above assimilation examples, we performed a few trial and error experiments. In other words, over a feasible range of parameter values, we computed the analysis states and obtained the RMSE
metric by comparing them with the (known) true initial condition $x_0$ (Figure 11). Note that the true initial condition is not available in practice; however, here we used it to obtain the optimal values of the regularization parameters in the RMSE sense for comparison purposes and for demonstrating the importance of a proper regularization. In the T3D-VAR, as expected, larger values of the regularization parameter ($\lambda_T$) typically damp rapidly varying error components of the noisy background and observations; however, they may give rise to an overly smooth solution with larger bias and RMSE (Figures 10e and 10f). Here, for the T3D-VAR experiment, we used the value $\lambda_T$ = 0.05 associated with the minimum RMSE (Figure 11a). In the H3D-VAR, in addition to the regularizer $\lambda_H$, we also need to choose the optimal threshold value of the Huber norm. A contour plot of the RMSE values for different choices of $\lambda_H$ and the threshold is shown in Figure 11b. By inspection, we roughly chose $\lambda_H$ = 35 and a threshold of 1.5e-3 for the H3D-VAR assimilation experiment presented in Figure 10.

[57] The main purpose of the DA process is, indeed, to increase the quality of the forecast. Given the analysis state at initial time, we can forecast the profile of the scalar quantity, $x(s, t)$, at any future time step through the heat equation. One important property of the heat equation is its diffusivity. In other words, noisy components and rapidly varying perturbations in the initial analysis are naturally damped but become more correlated as the profile evolves in time. Thus, rapidly varying uncorrelated error components become low-varying and correlated features whose detection and removal is naturally more difficult than in the case of uncorrelated ones. Figure 12a shows the forecast profile at t = 10 (T). The results indicate the importance of proper regularization on the quality of the forecast in the simple heat equation. The forecasts based on the classic 3D-VAR and the T3D-VAR almost coincide, while the T3D-VAR is marginally better. This behavior arises because neither of those methods could properly eliminate the noisy features in the analysis cycle; hence, low-varying error components appear in the forecast profile. However, the quality metrics in Table 3 indicate that using H3D-VAR, the RMSE and MAE of the forecast are improved by more than 50% compared to the other methods.

Table 3. The Root-Mean-Square Error (RMSE) and the Mean Absolute Error (MAE) for the Studied Classic and Regularized 3D-VAR in the Analysis Cycle (A) and Forecast Step (F)

Cycle   RMSE: 3D-VAR   T3D-VAR   H3D-VAR   MAE: 3D-VAR   T3D-VAR   H3D-VAR
A       0.0475         0.0397    0.0067    0.0376        0.0317    0.0043
F       0.0090         0.0088    0.0043    0.0071        0.0070    0.0033

Figure 11. (a) Root-mean-square error (RMSE) of the implemented T3D-VAR as a function of the regularizer $\lambda_T$. (b) RMSE contour surface for the H3D-VAR experiment with different choices of the regularizer $\lambda_H$ and the threshold value of the Huber norm. Clearly, depending on the choice of the regularization method, the magnitude of the regularizer might be markedly different.

Figure 12. (a) True forecast state obtained by temporal evolution of the top-hat initial condition under the heat equation at t = 10 (T) (Figure 10a). (b and c) Magnified windows showing the forecast quality using classic and regularized 3D-VAR assimilation methods. It can be seen that, due to ineffective error removal by the classic 3D-VAR and T3D-VAR at the analysis cycle, large-scale correlated errors are propagated in the forecast profiles, while this problem is less substantial in the result of the H3D-VAR (see Table 3).

6. Conclusions

[58] In this paper, we presented a new direction in approaching hydrometeorological estimation problems by taking into account important intrinsic properties of the underlying state of interest, such as the presence of sharp jumps, isolated singularities (i.e., local extremes), and statistical sparsity in the derivative space. We started by explaining the concept of regularization and discussed the common elements of the hydrometeorological problems of DS, DF, and DA as discrete linear inverse problems. We argued about the importance of proper regularization, which not only makes hydrometeorological inverse problems sufficiently well posed but also imposes the desired regularity and statistical property on the solution. Regularization methods were theoretically linked to the underlying statistical structure of the states and it was shown how information about the probability density of the state, or its derivative, can be used for proper selection of the regularization method. Specifically, we emphasized three types of regularization, namely, the Tikhonov, $\ell_1$-norm, and Huber regularization methods. We argued that these methods are statistically equivalent to the maximum a posteriori (MAP) estimator while, respectively, assuming the Gaussian, Laplace, and Gibbs prior density for the state of interest in a derivative domain. It was argued that piecewise continuity of the state and the presence of frequent jumps are often translated into heavy-tailed distributions in the derivative space
that favor the use of $\ell_1$-norm or Huber-norm regularization methods.

[59] The effectiveness of the regularized DS and DF problems was tested via analysis of remotely sensed precipitation fields, and the superiority of the regularization with linear penalization was clearly demonstrated. The performance of the regularized DA was also studied via assimilating noisy observations into the evolution of the heat equation, which has fundamental applications in the study and data assimilation of land-surface heat and mass fluxes. We showed that adding a Huber regularization term in the variational assimilation methods outperforms the classic 3D-VAR method, especially for the case where the initial condition exhibits a sparse distribution in the derivative space (e.g., first-order derivative of the top-hat initial condition).

[60] The presented frameworks can be potentially applied to other hydrometeorological problems, such as soil moisture downscaling, fusion, and data assimilation. Clearly, proper selection of the regularization method requires careful statistical analysis of the underlying state of interest. Moreover, the problem of rainfall or soil moisture retrieval from satellite microwave radiance can be considered as a nonlinear inverse problem. This nonlinear inversion may be cast in the presented context, provided that the nonlinear kernel can be (locally) linearized with sufficient accuracy. Application of regularization in data assimilation is in its infancy (e.g., see Freitag et al. [2012] for a recent study) and is expected to play a significant role over the next decades, especially in the context of ensemble methodologies for non-Gaussian and highly nonlinear dynamic systems.

Appendix A: Statistical Interpretation

[61] In this appendix, we discuss the statistical interpretation of the presented downscaling, data fusion, and data assimilation problems. We argue that the classic weighted least squares formulations can be interpreted as the frequentist maximum likelihood (ML) estimators, while the regularized formulations can be interpreted as the Bayesian maximum a posteriori (MAP) estimators. We also spell out the connection between the chosen regularization and the prior distribution of the state (or its derivative), which can guide proper selection of the regularization term in practical applications.

A1. Regularized Variational Downscaling and Data Fusion

[62] From the frequentist statistical point of view, it is easy to show that the WLS solution of equation (3) is equivalent to the maximum likelihood (ML) estimator

$$\hat{x}_{ML} = \operatorname*{argmax}_x \; p(y|x), \tag{A1}$$

given that the conditional density, $p(y|x) \propto \exp\left(-\tfrac{1}{2}(y - Hx)^T R^{-1}(y - Hx)\right)$, is Gaussian. Specifically, taking $-\log(\cdot)$, one can find the minimizer of the negative log-likelihood function $-\log\{p(y|x)\}$ as follows:
$$\hat{x}_{ML} = \operatorname*{argmin}_x \left\{ \frac{1}{2}(y - Hx)^T R^{-1}(y - Hx) \right\} = \operatorname*{argmin}_x \left\{ \frac{1}{2}\|y - Hx\|^2_{R^{-1}} \right\}, \tag{A2}$$

which is identical to the WLS solution of problem (3).

[63] It is important to note that in the ML estimator, $x$ is considered to be a deterministic variable (fixed), while $y$ has a random nature. On the other hand, in the Bayesian perspective, a regularized solution of equations (4) or (6) is equivalent to the maximum a posteriori (MAP) estimator

$$\hat{x}_{MAP} = \operatorname*{argmax}_x \; p(x|y), \tag{A3}$$

been argued that the 4D-VAR, and thus as a special case the 3D-VAR cost function, can be interpreted via the Bayesian MAP estimator [Johnson et al., 2005; Freitag et al., 2010; Nichols, 2010]. For notational convenience, here we only explain the statistical interpretation of the 3D-VAR and its regularized version, which can be easily generalized for the case of the 4D-VAR problem.

[67] As discussed earlier, the ML estimator is basically a frequentist view to estimate the most likely value of an unknown deterministic variable $x$ from (indirect) observations $y$ of random nature. The ML estimator intuitively requires finding the state that maximizes the likelihood function as
B, that is, $p(x_0) \sim \mathcal{N}(x_0^b, B)$. More formally, this assumption implies that the deterministic background is the central (mean) forecast and is related to the random true state via $x_0 = x_0^b + w$, where $w \sim \mathcal{N}(0, B)$. Therefore, using Bayes theorem it immediately follows that the 3D-VAR is the MAP estimator, $x_0^a = \operatorname*{argmax}_{x_0} p(x_0|y)$, assuming a Gaussian prior for the true state of interest.

[72] In conclusion, if we follow the frequentist approach to interpret the classic 3D-VAR in equation (15), the regularized 3D-VAR in equation (17) can be interpreted as the MAP estimator, where the prior density is characterized by the regularization term. On the other hand, taking the MAP interpretation for the classic 3D-VAR, the regularized version might be understood as the MAP estimator, which also accounts for an extra and independent prior on the distribution of the state under the $L$ transformation.

Appendix B: Gradient Projection Method for the Huber Regularization

[73] Here we present the gradient projection (GP) method, using the Huber regularization, only for the downscaling (DS) problem, which can be easily generalized to the data fusion (DF) and data assimilation (DA) cases. In the case of the DS problem, the cost function and gradient of the Huber regularization with respect to the elements of the downscaled field are

$$J(x) = \frac{1}{2}\|y - Hx\|^2_{R^{-1}} + \lambda\|Lx\|_{\mathrm{Hub}}, \tag{B1}$$

$$\nabla J(x) = -H^T R^{-1}(y - Hx) + \lambda L^T \rho'_T(Lx), \tag{B2}$$

iteratively as

$$x_{k+1} = \left[x_k - \kappa_k \nabla J(x_k)\right]_+. \tag{B7}$$

[76] Thus, if the descent at step $k$ is feasible (i.e., $x_k - \kappa_k \nabla J(x_k) \succeq 0$), the GP iteration becomes an ordinary unconstrained steepest descent method; otherwise, the result is mapped back onto the feasible set by the projection operator in equation (B6).

[77] In our study, the stepsize ($\kappa_k$) was selected using the Armijo rule, or the so-called backtracking line search, which is a convergent and very effective stepsize rule and depends on two constants: $0 < \alpha < 0.5$ and $0 < \varsigma < 1$. In this method, the stepsize is assumed to be $\kappa_k = \varsigma^{m_k}$, where $m_k$ is the smallest nonnegative integer for which

$$J\left(x_k - \kappa_k \nabla J(x_k)\right) \le J(x_k) - \alpha\,\kappa_k\,\nabla J(x_k)^T \nabla J(x_k). \tag{B8}$$

[78] In our DS examples, the above backtracking parameters are set to $\alpha = 0.2$ and $\varsigma = 0.5$ (see Boyd and Vandenberghe [2004, p. 464] for more explanation). In our coding, the iterations terminate either if $\|x_k - x_{k-1}\|_2 / \|x_{k-1}\|_2 \le 10^{-5}$ or the number of iterations exceeds 200.

[79] For the above-explained gradient projection algorithm and the employed parameters, the computational cost of the proposed framework is modest for a normal desktop machine at the present time. In particular, on a Windows operating system with an Intel(R)-i7 central processing unit (2.80 GHz clock rate), the process time of the presented downscaling and data fusion experiments was approximately 120 s.
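
The iteration described in this appendix can be sketched for a small one-dimensional problem. This is a minimal illustration under stated assumptions, not the authors' code: the observation operator is taken to be a block average, the error covariance is a single variance `sigma**2`, the Huber function uses a common textbook scaling that may differ from the paper's definition, and all function and parameter names are hypothetical. One further deviation is made deliberately: the sufficient-decrease test is evaluated at the projected trial point (a common variant of the Armijo rule for projected methods) so that the cost cannot increase after projection. The backtracking constants (0.2 and 0.5) and the stopping rule (relative change of 1e-5 or 200 iterations) follow the values quoted above.

```python
import math


def block_average(x, s):
    """H: mean of non-overlapping blocks of length s."""
    return [sum(x[i:i + s]) / s for i in range(0, len(x), s)]


def block_average_adjoint(r, s):
    """H^T: spread each low-resolution value evenly over its block."""
    return [ri / s for ri in r for _ in range(s)]


def diff(x):
    """L: first-order differences, (Lx)_i = x[i+1] - x[i]."""
    return [x[i + 1] - x[i] for i in range(len(x) - 1)]


def diff_adjoint(z):
    """L^T for the operator in diff()."""
    out = [0.0] * (len(z) + 1)
    for i, zi in enumerate(z):
        out[i] -= zi
        out[i + 1] += zi
    return out


def huber(z, thr):
    """Assumed textbook Huber function: quadratic near zero, linear beyond thr."""
    a = abs(z)
    return 0.5 * z * z if a <= thr else thr * (a - 0.5 * thr)


def huber_grad(z, thr):
    """Derivative of huber() with respect to z."""
    return z if abs(z) <= thr else math.copysign(thr, z)


def cost(x, y, s, sigma, lam, thr):
    """Weighted data misfit plus Huber penalty on first differences."""
    res = [yi - hi for yi, hi in zip(y, block_average(x, s))]
    return (0.5 * sum(r * r for r in res) / sigma ** 2
            + lam * sum(huber(z, thr) for z in diff(x)))


def grad(x, y, s, sigma, lam, thr):
    """Gradient of cost() with respect to x."""
    res = [yi - hi for yi, hi in zip(y, block_average(x, s))]
    g_data = [-v / sigma ** 2 for v in block_average_adjoint(res, s)]
    g_reg = diff_adjoint([huber_grad(z, thr) for z in diff(x)])
    return [gd + lam * gr for gd, gr in zip(g_data, g_reg)]


def initial_guess(y, s):
    """Feasible start: replicate each (clipped) low-resolution value."""
    return [max(yi, 0.0) for yi in y for _ in range(s)]


def gradient_projection(y, s, sigma, lam, thr,
                        alpha=0.2, shrink=0.5, tol=1e-5, max_iter=200):
    x = initial_guess(y, s)
    for _ in range(max_iter):
        g = grad(x, y, s, sigma, lam, thr)
        j_old = cost(x, y, s, sigma, lam, thr)
        step, accepted = 1.0, None
        for _ in range(60):  # backtracking: step = shrink**m, m = 0, 1, ...
            # Gradient step followed by projection onto x >= 0.
            trial = [max(xi - step * gi, 0.0) for xi, gi in zip(x, g)]
            decrease = alpha * sum(
                gi * (xi - ti) for gi, xi, ti in zip(g, x, trial))
            if cost(trial, y, s, sigma, lam, thr) <= j_old - decrease:
                accepted = trial
                break
            step *= shrink
        if accepted is None:  # no acceptable step found; stop
            break
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(accepted, x)))
        scale = math.sqrt(sum(b * b for b in x))
        x = accepted
        if change <= tol * max(scale, 1e-12):
            break
    return x


# Hypothetical low-resolution observations of a noisy top-hat profile.
y_lr = [0.0, 0.05, 1.0, 0.95, 0.0, 0.02]
x_hr = gradient_projection(y_lr, s=4, sigma=0.1, lam=5.0, thr=0.05)
print(len(x_hr), min(x_hr))
```

The operators are written as functions rather than matrices so the sketch stays dependency-free; for two-dimensional fields the same structure applies with the corresponding smoothing, down-sampling, and differencing operators and their adjoints.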
Castro, C. J., A. S. Pielke Roger, and G. Leoncini (2005), Dynamical downscaling: Assessment of value retained and added using the Regional Atmospheric Modeling System RAMS, J. Geophys. Res., 110, D05108, doi:10.1029/2004JD004721.
Chen, S. S., D. L. Donoho, and M. A. Saunders (1998), Atomic decomposition by basis pursuit, SIAM J. Sci. Comput., 20, 33–61.
Chen, S., D. Donoho, and M. Saunders (2001), Atomic decomposition by basis pursuit, SIAM Rev., 43(1), 129–159.
Ciach, G. J., and W. F. Krajewski (1999), On the estimation of radar rainfall error variance, Adv. Water Resour., 22(6), 585–595.
Cooley, J. W., and J. W. Tukey (1965), An algorithm for the machine calculation of complex Fourier series, Math. Comput., 19(90), 297–301.
Courtier, P., and O. Talagrand (1990), Variational assimilation of meteorological observations with the direct and adjoint shallow-water equations, Tellus, Ser. A, 42(5), 531–549.
Courtier, P., J.-N. Thepaut, and A. Hollingsworth (1994), A strategy for operational implementation of 4D-VAR, using an incremental approach, Q. J. R. Meteorol. Soc., 120(519), 1367–1387, doi:10.1002/qj.49712051912.
Daley, R. (1993), Atmospheric Data Analysis, 472 pp., Cambridge Univ. Press, Cambridge, U.K.
Deidda, R. (2000), Rainfall downscaling in a space-time multifractal framework, Water Resour. Res., 36(7), 1779–1794.
Drusch, M., and P. Viterbo (2007), Assimilation of screen-level variables in ECMWF’s integrated forecast system: A study on the impact on the forecast quality and analyzed soil moisture, Mon. Weather Rev., 135(2), 300–314, doi:10.1175/MWR3309.1.
Ebtehaj, A. M., and E. Foufoula-Georgiou (2011a), Adaptive fusion of multisensor precipitation using Gaussian-scale mixtures in the wavelet domain, J. Geophys. Res., 116, D22110, doi:10.1029/2011JD016219.
Ebtehaj, A. M., and E. Foufoula-Georgiou (2011b), Statistics of precipitation reflectivity images and cascade of Gaussian-scale mixtures in the wavelet domain: A formalism for reproducing extremes and coherent multiscale structures, J. Geophys. Res., 116, D14110, doi:10.1029/2010JD015177.
Ebtehaj, A. M., E. Foufoula-Georgiou, and G. Lerman (2012), Sparse regularization for precipitation downscaling, J. Geophys. Res., 116, D22110, doi:10.1029/2011JD017057.
Elad, M. (2010), Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, 376 pp., Springer, New York.
Elad, M., and A. Feuer (1997), Restoration of a single superresolution image from several blurred, noisy, and undersampled measured images, IEEE Trans. Image Process., 6(12), 1646–1658, doi:10.1109/83.650118.
Entekhabi, D., H. Nakamura, and E. Njoku (1994), Solving the inverse problem for soil moisture and temperature profiles by sequential assimilation of multifrequency remotely sensed observations, IEEE Trans. Geosci. Remote Sens., 32(2), 438–448, doi:10.1109/36.295058.
Freitag, M. A., N. K. Nichols, and C. J. Budd (2010), L1-regularisation for ill-posed problems in variational data assimilation, Proc. Appl. Math. Mech., 10(1), 665–668, doi:10.1002/pamm.201010324.
Freitag, M. A., N. K. Nichols, and C. J. Budd (2012), Resolution of sharp fronts in the presence of model error in variational data assimilation, Q. J. R. Meteorol. Soc., 139, 742–757, doi:10.1002/qj.2002.
Gaspari, G., and S. E. Cohn (1999), Construction of correlation functions in two and three dimensions, Q. J. R. Meteorol. Soc., 125(554), 723–757, doi:10.1002/qj.49712555417.
Geman, S., and D. Geman (1984), Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell., 6(6), 721–741.
Golub, G., P. Hansen, and D. O’Leary (1999), Tikhonov regularization and total least squares, SIAM J. Matrix Anal. Appl., 21(1), 185–194.
Gorenburg, I. P., D. McLaughlin, and D. Entekhabi (2001), Scale-recursive assimilation of precipitation data, Adv. Water Resour., 24(9–10), 941–953, doi:10.1016/S0309-1708(01)00033-1.
Gupta, V., and E. Waymire (1993), A statistical analysis of mesoscale rainfall as a random cascade, J. Appl. Meteorol., 32, 251–251.
Hansen, P. (1998), Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, vol. 4, Soc. for Ind. Math., Philadelphia, Pa.
Hansen, P. (2010), Discrete Inverse Problems: Insight and Algorithms, vol. 7, Soc. for Ind. and Appl. Math., Philadelphia, Pa.
Hong, Y., K. Hsu, S. Sorooshian, and X. Gao (2004), Precipitation estimation from remotely sensed imagery using an artificial neural network cloud classification system, J. Appl. Meteorol., 43(12), 1834–1853.
Hossain, F., and E. N. Anagnostou (2005), Numerical investigation of the impact of uncertainties in satellite rainfall estimation and land surface model parameters on simulation of soil moisture, Adv. Water Resour., 28(12), 1336–1350.
Hossain, F., and E. N. Anagnostou (2006), A two-dimensional satellite rainfall error model, IEEE Trans. Geosci. Remote Sens., 44(6), 1511–1522.
Huber, P. (1964), Robust estimation of a location parameter, Ann. Math. Stat., 35(1), 73–101.
Huber, P. (1981), Robust Statistics, vol. 1, John Wiley, New York.
Huffman, G., R. Adler, B. Rudolf, U. Schneider, and P. Keehn (1995), Global precipitation estimates based on a technique for combining satellite-based estimates, rain gauge analysis, and NWP model precipitation information, J. Clim., 8(5), 1284–1295.
Huffman, G., R. Adler, M. Morrissey, D. Bolvin, S. Curtis, R. Joyce, B. McGavock, and J. Susskind (2001), Global precipitation at one-degree daily resolution from multisatellite observations, J. Hydrometeorol., 2(1), 36–50.
Huffman, G., D. Bolvin, E. Nelkin, D. Wolff, R. Adler, G. Gu, Y. Hong, K. Bowman, and E. Stocker (2007), The TRMM multisatellite precipitation analysis (TMPA): Quasi-global, multiyear, combined-sensor precipitation estimates at fine scales, J. Hydrometeorol., 8(1), 38–55.
Johnson, C., N. K. Nichols, and B. J. Hoskins (2005), Very large inverse problems in atmosphere and ocean modelling, Int. J. Numer. Methods Fluids, 47(8–9), 759–771, doi:10.1002/fld.869.
Kalnay, E. (2003), Atmospheric Modeling, Data Assimilation, and Predictability, 341 pp., Cambridge Univ. Press, New York.
Kim, G., and A. Barros (2002), Downscaling of remotely sensed soil moisture with a modified fractal interpolation method using contraction mapping and ancillary data, Remote Sens. Environ., 83(3), 400–413.
Krajewski, W. F., B. Vignal, B. C. Seo, and G. Villarini (2011), Statistical model of the range-dependent error in radar-rainfall estimates due to the vertical profile of reflectivity, J. Hydrol., 402, 306–316, doi:10.1016/j.jhydrol.2011.03.024.
Kumar, P. (1999), A multiple scale state-space model for characterizing subgrid scale variability of near-surface soil moisture, IEEE Trans. Geosci. Remote Sens., 37(1), 182–197, doi:10.1109/36.739153.
Kumar, P., and E. Foufoula-Georgiou (1993), A multicomponent decomposition of spatial rainfall fields. 2. Self-similarity in fluctuations, Water Resour. Res., 29(8), 2533–2544.
Kummerow, C. D., S. Ringerud, J. Crook, D. Randel, and W. Berg (2010), An observationally generated a priori database for microwave rainfall retrievals, J. Atmos. Oceanic Technol., 28(2), 113–130, doi:10.1175/2010JTECHA1468.1.
Ledoit, O., and M. Wolf (2004), A well-conditioned estimator for large-dimensional covariance matrices, J. Multivariate Anal., 88(2), 365–411.
Levy, B. C. (2008), Principles of Signal Detection and Parameter Estimation, 1st ed., 639 pp., Springer, New York, doi:10.1007/978-0-387-76544-0.
Lewicki, M., and T. Sejnowski (2000), Learning overcomplete representations, Neural Comput., 12(2), 337–365.
Liang, X., E. F. Wood, and D. P. Lettenmaier (1999), Modeling ground heat flux in land surface parameterization schemes, J. Geophys. Res., 104(D8), 9581–9600.
Lorenc, A. (1988), Optimal nonlinear objective analysis, Q. J. R. Meteorol. Soc., 114(479), 205–240.
Lorenc, A. C. (1986), Analysis methods for numerical weather prediction, Q. J. R. Meteorol. Soc., 112(474), 1177–1194, doi:10.1002/qj.49711247414.
Lovejoy, S., and B. Mandelbrot (1985), Fractal properties of rain, and a fractal model, Tellus, Ser. A, 37(3), 209–232.
Lovejoy, S., and D. Schertzer (1990), Multifractals, universality classes and satellite and radar, J. Geophys. Res., 95(D3), 2021–2034.
Maggioni, V., R. H. Reichle, and E. N. Anagnostou (2012), The impact of rainfall error characterization on the estimation of soil moisture fields in a land data assimilation system, J. Hydrometeorol., 13(3), 1107–1118, doi:10.1175/JHM-D-11-0115.1.
Margulis, S. A., and D. Entekhabi (2003), Variational assimilation of radiometric surface temperature and reference-level micrometeorology into a model of the atmospheric boundary layer and land surface, Mon. Weather Rev., 131(7), 1272–1288, doi:10.1175/1520-0493(2003)131<1272:VAORST>2.0.CO;2.
Margulis, S. A., D. McLaughlin, D. Entekhabi, and S. Dunne (2002), Land data assimilation and estimation of soil moisture using measurements from the Southern Great Plains 1997 Field Experiment, Water Resour. Res., 38(12), 1299, doi:10.1029/2001WR001114.
Masunaga, H., and C. Kummerow (2005), Combined radar and radiometer analysis of precipitation profiles for a parametric retrieval algorithm, J. Atmos. Oceanic Technol., 22(7), 909–929.
Merlin, O., A. Chehbouni, Y. Kerr, and D. Goodrich (2006), A downscaling method for distributing surface soil moisture within a microwave pixel: Application to the Monsoon’90 data, Remote Sens. Environ., 101(3), 379–389.
Nichols, N. K. (2010), Mathematical concepts of data assimilation, in Data Assimilation: Making Sense of Observations, Part 1, pp. 13–39, Springer, Berlin.
Parrish, D. F., and J. C. Derber (1992), The National Meteorological Center’s spectral statistical-interpolation analysis system, Mon. Weather Rev., 120(8), 1747–1763, doi:10.1175/1520-0493(1992)120<1747:TNMCSS>2.0.CO;2.
Perica, S., and E. Foufoula-Georgiou (1996), Model for multiscale disaggregation of spatial rainfall based on coupling meteorological and scaling, J. Geophys. Res., 101(D21), 26,347–26,361.
Peter-Lidard, C. D., M. S. Zion, and E. F. Wood (1997), A soil-vegetation-atmosphere transfer scheme for modeling spatially variable water and energy balance processes, J. Geophys. Res., 102(D4), 4303–4324.
Rebora, N., L. Ferraris, J. Von Hardenberg, and A. Provenzale (2005), Stochastic downscaling of LAM predictions: An example in the Mediterranean area, Adv. Geosci., 2, 181–185.
Rebora, N., L. Ferraris, J. Von Hardenberg, and A. Provenzale (2006), Rainfall downscaling and flood forecasting: A case study in the Mediterranean area, Nat. Hazards Earth Syst. Sci., 6(4), 611–619.
Reichle, R., D. Entekhabi, and D. McLaughlin (2001a), Downscaling of radio brightness measurements for soil moisture estimation: A four-dimensional variational data assimilation approach, Water Resour. Res., 37(9), 2353–2364.
Reichle, R., D. McLaughlin, and D. Entekhabi (2001b), Variational data assimilation of microwave radio brightness observations for land surface hydrology applications, IEEE Trans. Geosci. Remote Sens., 39(8), 1708–1718, doi:10.1109/36.942549.
Sasaki, Y. (1958), An objective analysis based on variational method, J. Meteorol. Soc. Jpn., 36, 77–88.
Schultz, R., and R. Stevenson (1994), A Bayesian approach to image expansion for improved definition, IEEE Trans. Image Process., 3(3), 233–242.
Siccardi, F., G. Boni, L. Ferraris, and R. Rudari (2005), A hydrometeorological approach for probabilistic flood forecast, J. Geophys. Res., 110, D05101, doi:10.1029/2004JD005314.
Sorooshian, S., K. Hsu, G. Xiaogang, H. Gupta, B. Imam, and D. Braithwaite (2000), Evaluation of PERSIANN system satellite-based estimates of tropical rainfall, Bull. Am. Meteorol. Soc., 81(9), 2035–2046.
Talagrand, O., and P. Courtier (1987), Variational assimilation of meteorological observations with the adjoint vorticity equation. I: Theory, Q. J. R. Meteorol. Soc., 113(478), 1311–1328.
Tibshirani, R. (1996), Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B, 58(1), 267–288.
Tikhonov, A., V. Arsenin, and F. John (1977), Solutions of Ill-Posed Problems, Winston, Washington, D. C.
Tustison, B., E. Foufoula-Georgiou, and D. Harris (2003), Scale-recursive estimation for multisensor Quantitative Precipitation Forecast verification: A preliminary assessment, J. Geophys. Res., 107(D8), 8377, doi:10.1029/2001JD001073.
Van de Vyver, H., and E. Roulin (2009), Scale-recursive estimation for merging precipitation data from radar and microwave cross-track scanners, J. Geophys. Res., 114, D08104, doi:10.1029/2008JD010709.
Veneziano, D., R. L. Bras, and J. D. Niemann (1996), Nonlinearity and self-similarity of rainfall in time and a stochastic model, J. Geophys. Res., 101(D21), 26,371–26,392.
Wang, S., X. Liang, and Z. Nan (2011), How much improvement can precipitation data fusion achieve with a multiscale Kalman Smoother-based framework?, Water Resour. Res., 47, W00H12, doi:10.1029/2010WR009953.
Wang, Z., A. Bovik, H. Sheikh, and E. Simoncelli (2004), Image quality assessment: From error visibility to structural similarity, IEEE Trans. Image Process., 13(4), 600–612.
Wilby, R., T. Wigley, D. Conway, P. Jones, B. J. M. Hewitson, and D. S. Wilks (1998a), Statistical downscaling of general circulation model output: A comparison of methods, Water Resour. Res., 34(11), 2995–3008, doi:10.1029/98WR02577.
Wilby, R., H. Hassan, and K. Hanaki (1998b), Statistical downscaling of hydrometeorological variables using general circulation model output, J. Hydrol., 205(1), 1–19.
Zupanski, D., S. Q. Zhang, M. Zupanski, A. Y. Hou, and S. H. Cheung (2010), A prototype WRF-based ensemble data assimilation system for dynamically downscaling satellite precipitation observations, J. Hydrometeorol., 12(1), 118–134, doi:10.1175/2010JHM1271.1.
Zupanski, M. (1993), Regional four-dimensional variational data assimilation in a quasi-operational forecasting environment, Mon. Weather Rev., 121(8), 2396–2408.