J Geod (2012) 86:477–497 DOI 10.



Separation of global time-variable gravity signals into maximally independent components

E. Forootan · J. Kusche

Received: 4 May 2011 / Accepted: 9 November 2011 / Published online: 22 November 2011 © Springer-Verlag 2011

Abstract The Gravity Recovery and Climate Experiment (GRACE) products provide valuable information about total water storage variations over the whole globe. Since GRACE detects mass variations integrated over vertical columns, it is desirable to separate its total water storage anomalies into their original sources. Among the statistical approaches, the principal component analysis (PCA) method and its extensions have been frequently proposed to decompose the GRACE products into space and time components. However, these methods only search for decorrelated components that, on the one hand, are not always interpretable and, on the other hand, often contain a superposition of independent source signals. In contrast, independent component analysis (ICA) represents a technique that separates components based on assumed statistical independence using higher-order statistical information. If one assumes that independent physical processes generate statistically independent signal components added up in the GRACE observations, separating them by ICA is a reliable strategy to identify these processes. In this paper, the performance of the conventional PCA, its rotated extension and ICA is investigated when applied to the GRACE-derived total water storage variations. These analyses have been tested both on a synthetic example and on the real GRACE level-2 monthly solutions derived from GeoForschungsZentrum Potsdam (GFZ RL04) and Bonn University (ITG2010). Within the synthetic example, we can show how imposing statistical independence in the framework of ICA improves the extraction of the original signals from a GRACE-type superposition. We are therefore confident that, for the real case as well, the ICA algorithm, without making prior assumptions about the long-term behaviour or on the frequencies contained in the signal, improves over the performance of PCA and its rotated extension in the separation of periodical and long-term components.

Keywords Signal separation · PCA · ICA · GRACE products

E. Forootan (B) · J. Kusche, Institute of Geodesy and Geoinformation, Bonn University, Bonn, Germany e-mail:

1 Introduction

The launch of the Gravity Recovery and Climate Experiment (GRACE) mission in March 2002 opened a new era in measuring the Earth's gravity field. Since then, the GRACE-derived spherical harmonic coefficients have provided valuable information about the global gravity field from space at scales of a few hundred kilometres and larger (Tapley et al. 2004a). The temporal variations of the gravity field can be used to determine the redistribution of mass within the Earth system with an accuracy corresponding to 1 cm water column in monthly snapshots (Schmidt et al. 2008). However, the gravitational potential changes measured by GRACE represent mass change integrated over vertical columns, which is caused by different phenomena within the Earth's interior or on its surface and atmosphere: variability of land water storage and atmospheric moisture, non-steric sea level change, ice melting and even glacio-isostatic adjustment. Whereas it is common to model the gravity changes due to atmospheric masses and reduce their effect in GRACE analysis, for other phenomena, modeling is difficult and considered a major source of uncertainty. Moreover, global water variations represent an indicator of climate system changes and therefore exhibit non-linear and complex interactions with many inherent time scales (Voeroesmarty et al. 2000), rendering simple time series approaches inefficient for discriminating between sources of mass change. In summary, extracting physically meaningful information on individual processes from GRACE poses a challenging signal separation problem, on which we focus in this paper.

Various methods have been developed to identify patterns of variability from high-dimensional data sets. Most of these methods try to reduce the dimensionality of the system by retaining a smaller set of components that represent the most prominent patterns of variability. Among these methods, eigenspace techniques, including the Empirical Orthogonal Function (EOF) method [also called principal component analysis (PCA), Lorenz 1956] and its extensions such as Rotated EOF (REOF), Complex EOF (CEOF), Multichannel Singular Spectrum Analysis (MSSA) and others have been widely used over the past decade (see, e.g. von Storch and Zwiers 1999). For brevity, we will refer to these methods collectively as PCA analysis.

Recently, the application of eigenspace methods for noise reduction and signal separation in GRACE applications has gained some popularity. For instance, Chambers (2006) implemented PCA to estimate the leading modes of seasonal variability of altimetry-derived steric sea level using GRACE observations. In another study, Schrama et al. (2007) used the PCA method to identify the significant part of the surface mass signal from gridded GRACE-derived water variations and GPS height variations and thus to suppress what is identified as noise. In Chambers and Willis (2008), the PCA method was used as a tool for comparing the dominant part of GRACE-derived Ocean Bottom Pressure (OBP) with the OBP derived from ocean models. In such cases, a combination of leading PCA modes was interpreted as signal and those modes associated with low-amplitude variations were assumed to be noise. PCA has, however, also been used in several studies to extract geophysical patterns from the observed data sets, cf. e.g. de Viron et al. (2006), who correlate individual GRACE-derived principal components with the Southern Oscillation Index (SOI).
In a regional study, Rieser et al. (2010) used PCA to derive the main components of spatial and temporal water storage variability for the Australian continent and compared each of the components with rainfall anomalies individually. Rangelova and Sideris (2008) assigned the geoid rise over northern Canada to the patterns derived from the PCA and REOF methods. In altimetric applications, several studies (e.g. Hendricks et al. 1996; Fenoglio-Marc 2001; Lombard et al. 2005) have implemented PCA to derive the trend and dominant patterns on both global and regional scales. In these applications, each of the PCA components was treated individually or used as a base function for further analysis.

Several studies indicate that the objective of PCA, i.e. to maximize the variance explained by each component in succession, might cluster different physical modes within a single extracted mathematical mode and result in artificial features. In what follows, this problem is called the mixing problem (see, e.g. Richman 1986; Hyvaerinen 1999). Depending on the problem, the ability of PCA to isolate individual modes of climate variation may be limited (Jolliffe 2003). To improve the performance of PCA and also to take advantage of both the full spatial and temporal information of the data sets, Rangelova et al. (2009) applied MSSA to GRACE-derived mass variability in North America. Their investigation showed that using a lag-covariance matrix within the MSSA approach improves on the conventional least-squares fit of a trend and periodic variability. In order to deal with propagating sea level signals, Cromwell (2006) used CEOF to improve the description of sea level variability. Similar studies have also been done on Sea Surface Temperature (SST) variability using CEOF analysis (see, e.g. Guan and Nigam 2009; Latif and Barnett 1994).

From a statistical point of view, all the mentioned methods use only the second-order moments contained in the auto-covariance or correlation matrices for the decomposition procedure. Therefore, they disregard a large part of the information provided in higher-order moments of the probability distribution function (p.d.f.) of the observed signal (Jolliffe 1986). Generally, there is no more information to be explored whenever the p.d.f. of the observed variables is Gaussian (Hyvaerinen 1999), but in GRACE analysis, more often than not, the p.d.f. is non-Gaussian, as we will show later. Covariance or correlation will not be sufficient as a measure of statistical dependence between signals found in non-Gaussian data (Waymire and Gupta 1981). This problem has been addressed in several hydrological and climatological studies because hydrological parameters associated with physical process models contain a significant level of non-Gaussianity (see, e.g. Westra et al. 2007; Beven 2001).
In addition, several authors have shown that PCA-derived patterns do not necessarily allow physical interpretation (Dommenget and Latif 2002; Jolliffe 2003) and that EOFs strongly reflect the shape of the data domain, owing to the orthogonality and uncorrelatedness assumptions built into the procedure. Therefore, derived modes of variation might not be realistic and should be viewed with caution (Richman 1986). This is clearly relevant if one is interested in identifying modes of variability in GRACE data, e.g. for some area of limited extent like a catchment or ocean basin (note that for GRACE one can even resort to PCA in the space of spherical harmonics, before gridding). To overcome some of these limitations, alternatives to PCA have been developed: REOF techniques, mostly employing orthogonal rotations, have been used in climate science (e.g. Legates 1991; Richman 1986) to obtain simpler structures, meaning that the REOF patterns are driven towards the extreme values with respect to the chosen normalization.




In order to incorporate more information from the p.d.f. underlying the data, Cardoso and Souloumiac (1993) and Hyvaerinen and Oja (2000) suggested involving higher-order statistical moments in PCA, leading to what is otherwise known as Independent Component Analysis (ICA). Recently, ICA has been considered as an alternative to PCA to be used as an exploratory tool for climate data analysis. Indeed, Frappart et al. (2010a,b) combine Gaussian-filtered GRACE solutions from three different analysis centres in their ICA implementation, while assuming that each one of these contains independent information. Their approach improves over the Gaussian filtering of GRACE for several regions. In contrast to Frappart et al. (2010a,b), our approach does not rely on using multiple GRACE solutions as input. Therefore, it is applicable to any single GRACE product. It also does not require pre-filtering to amplify the non-Gaussianity in the three data sets.

In the current study, following Comon (1994a) and Aires et al. (2002), we will interpret ICA as a rotated PCA approach with the aim of separating the GRACE Total Water Storage (TWS) signals into their original independent sources. Independent physical processes will generate statistically independent source signals that are superimposed in the GRACE observations. Therefore, our hypothesis is that decomposing the GRACE observations into maximally statistically independent components will lead to base functions that are physically at least more representative than others based on orthogonality only. These base functions might be better suited to applications that look at each separated component individually, such as those aiming at separating GRACE and other data into signals from different compartments of the Earth system (e.g. Schmeer et al. 2008; Kusche et al. 2010; Rietbroek et al. 2011). They might also be helpful for those applications where signals from the pre-GRACE era or for a gap period between GRACE and a follow-on mission will be reconstructed. However, ICA sacrifices either uncorrelatedness or orthogonality. Therefore, there will always be geophysical applications where PCA decomposition appears more adequate, and vice versa.

For comparison, our investigation includes the PCA, REOF and ICA methods for decomposing the GRACE-derived TWS variations. In detail, we will (1) apply conventional PCA first to simulated GRACE-derived mass variations and also to real GRACE TWS models, (2) apply REOF with the VARIMAX optimization functional (Richman 1986) to the same data, (3) apply ICA (Comon 1994a; Cardoso and Souloumiac 1993) implemented as PCA rotation, while exploiting higher-order moments of the data and (4) discuss the corresponding results for simulation and real data.

To study the separation problem in a controlled environment, we constructed a synthetic example with the main ingredients being modeled hydrological signals for South America and Africa and added artificial GRACE noise. In this case we know the solution of the decomposition problem by definition. Within this simple simulation, we found that ICA can separate distinct modes of mass variability even if they are synchronized in space and time. In contrast, although PCA and REOF do restore the original signals equally well with few modes, they fail to separate them into the original true sources. We then proceed to implement the three methods on the GRACE-derived TWS products.

The remaining part of the manuscript is organized as follows: Sect. 2 is devoted to the description of the data. Section 3 briefly reviews the concepts of PCA, REOF and ICA based on the rotation of principal components and discusses their applicability to deterministic signals. The performance of these approaches in the simulation is examined in Sect. 4. In Sect. 5, results are illustrated for the analysis of monthly GRACE TWS from the GeoForschungsZentrum (GFZ) and Bonn University (ITG2010) monthly solutions. Finally, Sect. 6 concludes the paper by summarizing the main findings and providing an outlook.

2 Data

The GRACE mission was launched on 17 March 2002 as a joint project by the American National Aeronautics and Space Administration (NASA) and the German Aerospace Center (DLR). The aim of the mission is to monitor temporal and spatial variations of the Earth's gravity field on a global scale (Tapley et al. 2004b). Level-2 harmonic coefficients are derived from the continuous monitoring of the mutual distance, absolute positions and velocities of the GRACE twin satellites (Tapley et al. 2005). Three official centres, the Center for Space Research (CSR) and the Jet Propulsion Laboratory (JPL) in the USA and the GeoForschungsZentrum (GFZ) in Germany, are responsible for providing the GRACE monthly solutions. Yet, GRACE data are also processed by other groups, e.g. Bonn University in Germany, GSFC/NASA in the USA, GRGS in France and DUT in the Netherlands.

2.1 GFZ RL04 GRACE data

For this study, we used 93 monthly spherical harmonic gravity models from August 2002 to October 2010 computed by GFZ Potsdam (Flechtner 2007). These models are derived as fully normalized spherical harmonic coefficients of the geopotential computed to degree and order 120, and have been augmented by the degree-1 term from Rietbroek et al. (2009), in order to include the variation of the Earth's centre of mass with respect to a crust-fixed reference system. It should be mentioned here that our degree-1 term, although derived consistently with GRACE processing standards, adds other space-geodetic data to the analysis. We found, however, that this is of negligible consequence for our analysis of separability. Before converting the gravitational coefficients to mass variations we also removed the problematic months in 2004 due to the GRACE orbital resonance. The mean gravity solution between August 2002 and October 2010 has been removed from the individual monthly solutions; geopotential change has been converted to surface mass change expressed in equivalent water height using a well-known relation (e.g. in Wahr et al. 1998). Several authors have shown that non-isotropic filters (e.g. Swenson and Wahr 2006; Kusche 2007) account for the GRACE stripes more favourably compared with the traditional isotropic Gaussian filter (Jekeli 1981), see, e.g. Werth et al. (2009). Therefore, the monthly solutions are smoothed here using the DDK2 filter (Kusche et al. 2009). Finally, the mass change harmonics have been evaluated on a global 1° × 1° grid of TWS.

2.2 ITG-GRACE2010 data

In order to validate our findings with a second data set, we incorporated all 84 available monthly spherical harmonic models from our own group at Bonn University for the period September 2002 to August 2009, complete to degree 120 (Mayer-Guerr et al. 2010). These models, although based on the same GRACE L1 GPS and K-band ranging data as the GFZ solutions, were derived using a different scheme of reducing geophysical background models and employing a different functional model for creating the observation equations. For each solution, the corresponding full variance-covariance matrix is also provided individually. In essence, the ITG2010 products are as independent from the GFZ solutions as a reasonable GRACE solution can be. It will be interesting to see whether the differing length of the TWS data sets might also affect the robustness of the PCA, VARIMAX and ICA methods. We applied the same post-processing as described above for deriving maps of TWS.
3 Methods

3.1 Concept of PCA and REOF methods

PCA is based on an eigenvalue decomposition of the auto-covariance matrix of a centred data set. Although the technique is described in other studies (e.g. Schrama et al. 2007), we spend some time describing it here because, on the one hand, it is the point of departure for REOF and ICA and, on the other hand, confusion quite often arises over different ordering schemes or normalization conventions. Consider the centred data vector $x_i$, containing $n$ time epochs for the $i$-th grid element, or

$$x_i = \begin{pmatrix} x_{1;i} \\ \vdots \\ x_{n;i} \end{pmatrix}, \qquad i = 1 \ldots p. \qquad (1)$$

All $p$ data vectors are collected in the form of an $n \times p$ data matrix $X$. It should be noted that the data matrix could as well be assembled by stacking $n$ $p$-dimensional row vectors, each row describing all TWS grid values for a particular time epoch. For GRACE-derived TWS evaluated on a 1° × 1° global grid for 93 consecutive months, the data matrix contains 93 rows and 64,800 columns. In principle, PCA expands $X$ in terms of a new set of spatially orthogonal vectors (EOFs) associated with temporally uncorrelated time series known as principal components (PCs) (Jolliffe 1986). It is important to mention that the PCs can be considered as orthogonal as well since their (conventional) inner product is zero. The PCA decomposition is written as

$$X = P E^T, \qquad (2)$$

where $E$ contains the eigenvectors of $X$ normalized to unit length in its columns (i.e. $E E^T = I$), arranged with respect to the order of eigenvalues, and $P$ contains the PCs. Or, with the singular value decomposition of the data matrix,

$$X = \bar{P} \Lambda E^T, \qquad (3)$$

where $\bar{P}$ contains normalized PCs, i.e. $\bar{P} \bar{P}^T = I$, and $\Lambda$ is diagonal and holds the singular values of the data matrix ordered according to magnitude. What makes PCA attractive in the analysis of measured time-variable gravity (or other geophysical fields) is that it allows a large amount of variance to be concentrated in relatively few patterns and modes; for GRACE it thus implicitly serves as a filtering method by discarding those modes that contain mostly noise. A decision on how many modes are actually required to reconstruct the data sufficiently may be based upon statistical criteria, e.g. Preisendorfer and Barnett (1977) and Preisendorfer (1988), or may be reached by considering independent data (Schrama et al. 2007). Let $k < n$ be the number of retained modes (those associated with the largest singular values); the reconstruction of the data matrix is $X \approx X_k = \bar{P}_k \Lambda_k E_k^T$, with the $n \times k$, $k \times k$, and $p \times k$ matrices $\bar{P}_k$, $\Lambda_k$, and $E_k$, and $E_k^T E_k = I$, $\bar{P}_k^T \bar{P}_k = I$.

Rotated EOF analysis (REOF) is based on the observation that one may include an orthogonal rotation of both the EOFs and the PCs in Eq. (2) without changing the left-hand side. Let $R$ denote an orthogonal rotation matrix, i.e. $R R^T = I$ (the actual choice of the rotation matrix is made by solving an optimization problem, and to keep the computational effort limited one applies REOF only to a subset of the leading components). Assuming that after an initial PCA one decides that $k$ data modes shall be retained, then $X_k$ can be cast into

$$X_k = P_k R_k R_k^T E_k^T = B_k U_k^T. \qquad (4)$$
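The decomposition, the rank-k reconstruction and the rotation invariance described above can be checked numerically. The following is a minimal NumPy sketch on a small random matrix, not the authors' implementation; the toy dimensions, the random data and the variable names (`Pbar`, `s`, `Pk`, `Ek`, `Bk`, `Uk`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumed): n epochs, p grid points, k retained modes
n, p, k = 24, 60, 3
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)              # centre every grid-point time series

# SVD of the data matrix; singular values come out in descending order
Pbar, s, Et = np.linalg.svd(X, full_matrices=False)
E = Et.T                         # EOFs in columns
P = Pbar * s                     # un-normalized PCs, so that X = P @ E.T

# Rank-k reconstruction from the leading modes
Pk, Ek = P[:, :k], E[:, :k]
Xk = Pk @ Ek.T

# Any orthogonal k x k matrix R leaves the reconstruction unchanged
R, _ = np.linalg.qr(rng.standard_normal((k, k)))
Bk = Pk @ R                      # rotated PCs
Uk = Ek @ R                      # rotated EOFs (still orthonormal)

# Fraction of total variance carried by each retained mode
explained = s[:k] ** 2 / np.sum(s ** 2)
```

In this sketch `Bk @ Uk.T` reproduces `Xk` for any orthogonal `R`, which is why an additional criterion (VARIMAX or an independence measure) is needed to select one particular rotation.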



In this representation $U_k = E_k R_k$ contains the $k$ rotated and still orthogonal EOFs, and $B_k = P_k R_k = X U_k$ represents the corresponding expansion coefficients of the data (rotated PCs, RPCs). Note that the RPCs defined in this way lose the property of being uncorrelated. In general, rotation aims at finding a new basis such that either the spatial patterns or the temporal expansion coefficients appear as simple as possible (Richman 1986; Mills 1995), with either more localized temporal or spatial structures. There exist, however, several analytical criteria for simplicity, cf. Browne (2001). In this paper, we restricted our investigations to orthogonal rotations following from the VARIMAX criterion (Kaiser 1958). Aside from this, selecting normalized or non-normalized base functions prior to rotation will also provide different results. A discussion about the effect of normalization and orthogonality in REOF is provided in Mestas-Nunez (2000). It should be noted that ICA can be considered as REOF with a particular criterion for defining the rotation.

Let $k$ be chosen appropriately. Then the VARIMAX method consists in finding an orthogonal rotation matrix $R_k$ such that the norm of its columns equals one, and that the rotated EOFs $U_k = E_k R_k$ maximize the following simplicity criterion (Kaiser 1958):

$$f(R_k) = \sum_{j=1}^{k} \left[ \sum_{i=1}^{p} u_{ij}^4 - \frac{1}{p} \left( \sum_{i=1}^{p} u_{ij}^2 \right)^2 \right], \qquad (5)$$

where $u_{ij}$ are the elements of matrix $U_k$. The reconstruction of the data matrix is given by Eq. (4), with the mentioned consequence of the RPCs being correlated now. Since $V(x^2) = E(x^4) - E(x^2)^2$, the term in square brackets of Eq. (5) represents in fact an approximation to the variance of the squared elements of the rotated EOF. In other words, for GRACE analysis, application of VARIMAX REOF comes down to choosing the rotation such that it maximizes the spatial variability of the total power (integrated squared TWS) of the REOFs. This variability will be small if the TWS RMS of the EOFs is nearly equally distributed (in space), and it will be large if for each pattern a few regions dominate with large RMS (driven to 1 since we assume the EOFs to be normalized). Loosely speaking, VARIMAX REOF maximizes the contrast of the (orthogonal) GRACE base functions. The procedure to derive simplified RPCs is similar to the above, with the difference that one selects the first $k$ PCs ($P_k$) to be rotated and used in the criterion of Eq. (5) instead of the EOFs. Spatial patterns corresponding to the optimal RPCs are derived as $Z_k = X^T U_k$, which are now correlated and arranged in the rows of matrix $Z$.

3.2 Concept of ICA

As seen in the context of the VARIMAX approach, the separation of a measured geophysical field into patterns can be based on considering higher-order statistics (i.e. statistical moments beyond the variances and covariances). This idea is further exploited in the framework of Independent Component Analysis (ICA), where non-Gaussianity is commonly expressed through higher-order cumulants (usually fourth-order, e.g. Cardoso 1999) rather than moments. Since for a Gaussian p.d.f. all higher-order cumulants are zero, one in fact makes use of the non-Gaussianity (if any) of the field (see, e.g. Cardoso 1992; Hannachi et al. 2009). The fundamental idea of ICA is to minimize the statistical dependency of the patterns or of the associated temporal components. In the language of information theory, one seeks to identify independent sources.

Consider a single random variable $x$ (we do not distinguish in notation between random variable $X$ and realization $x$). Whereas its statistical moments $\mu_n = E(x^n)$ can be written as the $n$-th derivative of the moment-generating function

$$M(y) = E(e^{yx}) = \sum_{n=0}^{\infty} \mu_n \frac{y^n}{n!}$$

at the origin (Koch 1988), the cumulants (or cumulative moments) $\kappa_n$ appear similarly as the Taylor coefficients of the logarithmized moment function (Ferreira et al. 1997),

$$K(y) = \log M(y) = \sum_{n=0}^{\infty} \kappa_n \frac{y^n}{n!}.$$

In particular, for a centred, symmetrically distributed random variable $x$ (i.e. $\mu_j = 0$ for $j$ odd), the second cumulant equals the variance, $\sigma^2 = \kappa_2 = E(x^2)$, and the fourth cumulant equals the kurtosis, $\kappa = \kappa_4 = E(x^4) - 3 E(x^2)^2$. In the multivariate case, with centred, symmetrically distributed $n$-dimensional random vectors $x_i$, the second-order multivariate cumulant corresponds to the covariance matrix $C(x_i, x_j) = E(x_i x_j)$ and the fourth-order cumulant corresponds to

$$C(x_i, x_j, x_k, x_l) = E(x_i x_j x_k x_l) - E(x_i x_j) E(x_k x_l) - E(x_i x_k) E(x_j x_l) - E(x_i x_l) E(x_j x_k). \qquad (6)$$

In particular, under the assumption of statistical independence of the $x_i$, their joint p.d.f. decouples into a product of individual p.d.f.s, and we have $C(x_i, x_j, x_k, x_l) = \kappa(x_i)\, \delta_{ijkl}$, where $\delta_{ijkl}$ is Kronecker's function (Cardoso 1999). ICA generally aims at finding a linear transformation of the original data whose fourth-order cumulant attains this simple form.
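To make the role of the fourth-order cumulant concrete, the sketch below estimates it by sample averaging for three unit-variance series: Gaussian noise, uniform noise and a sine wave. It is an illustration under assumed sample sizes and seeds, not part of the original analysis; the function name `cross_cumulant` is hypothetical. The reference values in the comments are the known population values for these distributions.

```python
import numpy as np

def cross_cumulant(a, b, c, d):
    """Sample fourth-order cross-cumulant of four series (centred internally)."""
    a, b, c, d = (np.asarray(v, dtype=float) - np.mean(v) for v in (a, b, c, d))
    E = np.mean
    return (E(a * b * c * d)
            - E(a * b) * E(c * d)
            - E(a * c) * E(b * d)
            - E(a * d) * E(b * c))

def kurtosis(x):
    """Fourth cumulant E(x^4) - 3 E(x^2)^2 as the special case a = b = c = d."""
    return cross_cumulant(x, x, x, x)

rng = np.random.default_rng(1)
N = 200_000                                          # assumed sample size
gauss = rng.standard_normal(N)                       # population value: 0
uniform = rng.uniform(-np.sqrt(3), np.sqrt(3), N)    # population value: -1.2
t = np.arange(N) / N
sine = np.sqrt(2.0) * np.sin(2.0 * np.pi * 50 * t)   # time-average value: -1.5

k_gauss, k_uniform, k_sine = kurtosis(gauss), kurtosis(uniform), kurtosis(sine)

# For independent series the mixed cumulant is close to zero
c_mixed = cross_cumulant(gauss, gauss, uniform, uniform)
```

The Gaussian series gives a value near zero, so it carries no usable fourth-order information, whereas the uniform and sinusoidal series are clearly non-Gaussian in this measure.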



A fourth-order cumulant is a tensorial quantity. Unsurprisingly, it is handier to work with matrices, and therefore one introduces cumulant matrices $Q(M)$ (e.g. Cardoso 1999) as a 2D contraction of a four-dimensional cumulant tensor with an (arbitrary) $n \times n$ matrix $M = (m_{ij})$, or $Q(M) = (q_{ij})$ with

$$q_{ij} = \sum_{k=1}^{n} \sum_{l=1}^{n} C(x_i, x_j, x_k, x_l)\, m_{kl}. \qquad (7)$$

Under the above-mentioned assumptions on the random vector, one can show (Cardoso 1999) that

$$Q(M) = E\left[ (x^T M x)(x x^T) \right] - C\, \mathrm{trace}(MC) - C (M + M^T) C, \qquad (8)$$

with $C$ being the covariance matrix of $x$. It has been mentioned that all statistical information of a Gaussian variable is summarized in the first and second moments and that higher-order cumulants are all zero. This makes the independence concept equivalent to orthogonality (uncorrelatedness) for Gaussian signals. As a consequence, when maximal independence of patterns is sought by EOF rotation, this criterion will be applicable only when at most one of the involved processes is Gaussian.

In this study, ICA of GRACE-derived TWS is realized through a two-step procedure: first we use PCA for identifying the leading orthogonal patterns that explain the variability in the data in descending order. Then, in the second step, these patterns are rotated towards independence. This means that we replace the VARIMAX criterion in finding $R_k$ by a criterion based upon Eq. (8) that maximizes the independence between the patterns obtained in the first step, where we measure independence by the fourth-order cross-cumulants between patterns. As with VARIMAX rotation, there are two possibilities to derive the independent components and patterns, in which either temporal or spatial components will be rotated. In the language of statistics, these two options allow us to interpret the original data as a mixture of either independent temporal chains or spatial sequences. When applied to GRACE TWS, both ways have their own advantages, which will be described in the results section.

For computing $R_k$, one thus has to formulate and optimize a cost function as a measure of non-Gaussianity. Several criteria along with their corresponding algorithms are available to perform ICA; a comparison with respect to different applications can be found, e.g. in Karvanen and Koivunen (2002); Zavala-Fernaendez et al. (2006) and Naik et al. (2007). Here, similar to the JADE algorithm (Cardoso and Souloumiac 1993), the diagonality of fourth-order cross-cumulants is considered to find a rotation matrix. We made this choice because of the numerical efficiency of the algorithm. Furthermore, the described procedure is easily implemented using JADE software ( guidesepsou.html) with minor modifications. After applying PCA in the first step, we estimate a maximal set of cumulant matrices, Eq. (8), for either the $E_k$ (rotating EOFs) or the $P_k$ (rotating PCs). A maximal set is a set of matrices $M_j$ such that they span the linear space of all $n \times n$ matrices, and the total number of its entries equals $n^4$, the number of fourth-order cross-cumulants. Then $R_k$ is found as the minimizer of the squared off-diagonal cumulant entries

$$f(R_k) = \sum_{i \neq j}^{k} \sum_{m=1}^{n^2} f_{ij}^2\!\left( R_k^T Q(M_m) R_k \right),$$

where $f_{ij}(\cdot)$ denotes the $(i,j)$ entry of its matrix argument.

This optimization is solved using plane rotations under unitary constraints of $R_k$ (for more detail see Comon 1994a,b and a simple implementation based on maximum kurtosis in Cardoso 1999). Finally, rotated EOFs or PCs (the independent components) are computed, and the corresponding PCs or EOFs are found by projection of the data.

3.3 PCA and ICA in the presence of deterministic signals

PCA, REOF and ICA aim at decomposing signals into uncorrelated or independent components. Since these concepts are rooted in the theory of stochastic signals, it is appropriate to discuss their applicability to deterministic signals such as are often used in the GRACE community to describe the major temporal constituents of observed or modelled TWS. A trend or an annual sine wave is deterministic if its defining slope or phase or amplitude are considered as deterministic (otherwise it is a random non-Gaussian signal). PCA requires the knowledge of second statistical moments, whereas for ICA (as we describe it here) the knowledge of fourth-order moments or cumulants is required. A first problem arises since for deterministic signals, strictly speaking, neither uncorrelatedness/independence nor second/fourth moments are defined, since there exists no ensemble on whose p.d.f. they could be constructed. In practice, however, no distinction between stochastic and deterministic signals is made since one invokes the ergodicity hypothesis and computes moments not by ensemble averaging but by temporal or spatial averaging. This has been tacitly assumed in practically all published works on PCA in GRACE analysis, and we followed this line in Sects. 3.1 and 3.2 as well. With these moments of analogous character (Kirimoto et al. 2011) one may then proceed with PCA and ICA. For example, Kirimoto et al. (2011) prove mathematically that the ICA method succeeds in separating a finite mixture of sine-type signals, making use of the fact that sine-type signals are uncorrelated (i.e. their second-order cross-moments vanish). It is not difficult to see that their proof applies as well if one of the signals is a (centred) trend. Since ICA as discussed in this paper guarantees uncorrelatedness by pre-whitening through PCA, the condition of Kirimoto et al. (2011) would be automatically fulfilled. We believe that it would be possible to extend their mathematical proof to a larger class of interesting deterministic signals, but this would clearly be beyond the scope of the present paper.

However, while this discussion might appear somewhat theoretical, it points to a computational problem inherent to both PCA and ICA. In practice, the computation of moments by temporal averaging is hampered by the finite length of the time series (and possibly by sampling issues as well). In summary, while no theoretical reasons would prevent us from applying PCA and ICA to deterministic signals, or to a mixture of stochastic and deterministic signals, one always has to be aware of the fact that moments and cumulants will be computed only approximately. For ICA this means that the pre-whitening achieved by PCA will only hold approximately, the consequences of which certainly require further investigation.
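The two-step idea (PCA whitening followed by a rotation towards independence) and the finite-length caveat can be illustrated on a two-channel mixture of a centred trend and an annual sine. This is a simplified sketch, not the JADE-type joint diagonalization used in the paper: for two components, a single plane rotation is found by a grid search that maximizes the sum of squared kurtoses. The record length, the mixing matrix `A` and all function names are assumptions made for illustration.

```python
import numpy as np

def fourth_cumulant(Y):
    """Time-averaged fourth-order cumulant of each zero-mean column."""
    return np.mean(Y ** 4, axis=0) - 3.0 * np.mean(Y ** 2, axis=0) ** 2

def rotate(Z, theta):
    """Apply a plane rotation by angle theta to two-column data."""
    c, s = np.cos(theta), np.sin(theta)
    return Z @ np.array([[c, -s], [s, c]])

n = 84                                    # 7 years of monthly epochs (assumed)
t = np.arange(n)
trend = t - t.mean()                      # centred linear trend
annual = np.sin(2.0 * np.pi * t / 12.0)   # annual sine wave
S = np.column_stack([trend / trend.std(), annual / annual.std()])

# With a finite record the time-averaged correlation is not exactly zero
rho = np.mean(S[:, 0] * S[:, 1])

# Two observed channels, each a superposition of both sources (assumed mixing)
A = np.array([[1.0, 0.6], [0.4, 1.0]])
X = S @ A.T

# Step 1: PCA whitening (normalized PCs scaled to unit sample variance)
U, sv, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
Z = U * np.sqrt(n)

# Step 2: rotation maximizing non-Gaussianity, searched over one quarter turn
thetas = np.linspace(0.0, np.pi / 2.0, 901)
contrast = [np.sum(fourth_cumulant(rotate(Z, th)) ** 2) for th in thetas]
Y = rotate(Z, thetas[int(np.argmax(contrast))])

def best_match(C):
    """Largest absolute correlation of each true source with any component."""
    return (np.abs(C.T @ S) / n).max(axis=0)

pca_match = best_match(Z)                 # PCs still mix trend and cycle
ica_match = best_match(Y)                 # rotated components align with sources
```

In this toy case the whitened PCs remain mixtures of trend and annual cycle, while the rotated components correlate closely with the individual sources. The recovery is not exact because `rho` is nonzero for a 7-year record, which is the approximate pre-whitening noted above.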

4 Simulation

We designed a synthetic simulation of GRACE-type TWS time series, where the true spatial and temporal solution to the decomposition problem is known. The goal of the simulation is to investigate the performance of the PCA, ICA and REOF methods in a controlled situation in which the observed signals are derived from a superposition process and corrupted by artificial GRACE-type correlated noise. To define reasonable spatial and temporal patterns for the simulation, we used 1° × 1° monthly GLDAS/Noah hydrological models (Rodell et al. 2004) covering the period January 2003 to August 2009 (comparable with ITG2010 products). Since the long-term trend and annual cycles are dominant in the hydrological signals (see, e.g. Tapley et al. 2005; Wouters and Schrama 2007), we suppose that the known temporal component simply consists of a linear trend and a 365.25-day annual cycle. For the true signals, we then extract a superposition of a linear trend and annual cycle for South America and the annual cycle only for Africa from the model using a least-squares fit. This set-up should render the separation task even easier for the PCA method since the spatial and temporal differences are exaggerated in comparison with realistic hydrological data sets. The set-up of the simulation is shown in Fig. 1.

Fig. 1 The simulation covers the period between January 2003 and July 2009: Top The introduced spatial and temporal patterns. One can reconstruct the simulated data set by multiplying the spatial anomalies by their corresponding temporal evolutions. Bottom Colored noise, simulated using the covariance matrix of ITG2010 for the same period January 2003 to July 2009

To simulate realistic spatially correlated noise, we used the covariance matrix of the ITG2010 monthly solutions (Mayer-Guerr et al. 2010). Using Cholesky decomposition we split the covariance matrices into their upper triangular and their conjugate transpose matrices (e.g. Golub and van Loan 1996, section 4.2). Then, multiplying each of the upper triangular matrices with a column of the unit random matrix, we generated the GRACE-type realizations of monthly errors. These


484 Fig. 2 Decomposition of the simulated patterns. The gure compares the performance of PCA, VARIMAX REOF and ICA methods. (First row) the derived patterns from implementing PCA, (second row) results of VARIMAX REOF with rotating PCs, (third row) VARIMAX REOF with rotating EOFs, (fourth row) temporal ICA, (fth row) spatial ICA

E. Forootan, J. Kusche

errors are spatially correlated but temporally uncorrelated. To reduce the amplitude of the noise to be comparable with the real (ltered) GRACE data, we smoothed each of monthly noise realizations with a Gaussian lter with 500 km radius. Then maps of simulated noise for South America and Africa are added to the monthly snapshots. Typical noise patterns are shown in Fig. 1 (bottom). The results of the PCA decomposition are shown in the rst row of Fig. 2. They clearly illustrate that the PCA method fails to separate the true signals, because the rst eigenvector is oriented in the direction of maximum variance. The rst component contains both linear and annual signals and the amplitude of the signal is over tted for the African part since the spatial pattern has the same amplitude as it was simulated, but the corresponding temporal component shows more variability. Continuing the decomposition procedure, the following eigenvectors turn out orthogonal to the rst one adding remaining signal to South America and creating an opposite anomaly over Africa. Consequently, in both of the PCA components the linear trend and annual signals prevail. The VARIMAX criterion orthogonally rotates the PCAderived components to obtain a simpler structure that in this

case yields another mixture of the true signals. Results of VARIMAX rotation with either rotating PCs or EOFs are respectively illustrated in the second and third row of the Fig. 2. Rotating the EOFs showed a better performance than rotating PCs. It should be mention here, changing the number of components to be rotated within the REOF method did not enhance the separation purpose for this case. In the 4th row of the gure the results of the temporal ICA are shown in which the rst pattern on the left is related to the annual signal over South America and Africa, and the second pattern (on the right) shows the spatial distribution of the linear trend over Africa. As the gure illustrates the spatial patterns are truly recovered. Their corresponding temporal evolution is illustrated in the second and fourth columns of Fig. 2, showing that the annual cycle is separated from the linear trend. The linear component (IC2) is still contaminated by a low-amplitude annual signal. This happens because of the small length of the observed time series which does not permit the computation of the fourth-order cumulants correctly. We have scrutinized this statement using a longer simulation (120 months of GLDAS), and the problem indeed vanished.
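The noise-generation step of the simulation set-up can be sketched as follows. This is a minimal illustration under stated assumptions: the covariance matrix is a small hypothetical example (exponential decay between five grid cells) rather than the ITG2010 covariance, the random numbers are taken to be standard normal, and the Gaussian smoothing step is omitted. Note that NumPy returns the lower triangular factor, whose transpose is the upper factor referred to in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy error covariance for 5 grid cells (hypothetical, exponential decay with distance).
positions = np.arange(5, dtype=float)
cov = np.exp(-np.abs(positions[:, None] - positions[None, :]) / 2.0)

# NumPy returns the lower factor with cov = lower @ lower.T;
# the upper triangular factor is simply lower.T.
lower = np.linalg.cholesky(cov)

# Each column of `white` is one month of unit-variance, uncorrelated noise.
n_months = 20000   # deliberately large so that the sample covariance can be checked
white = rng.standard_normal((cov.shape[0], n_months))
noise = lower @ white   # spatially correlated, temporally uncorrelated

sample_cov = noise @ noise.T / n_months
print(np.round(sample_cov, 2))
```

With many realizations the sample covariance of the generated noise approaches the prescribed covariance, while successive months remain uncorrelated because each column uses independent random numbers.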




The performance of the spatial ICA is also assessed here, and the results are shown in the last row of Fig. 2. At least with our simple simulation set-up, spatial ICA was able to completely separate the true patterns. Within the simulation, we showed that of the three methods only ICA correctly separates all components. Keeping in mind that a simulation cannot claim the general validity of a mathematical proof, this result nevertheless encourages us to study this method on the real TWS data set from GRACE.
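The difference between decorrelation and a rotation towards independence can be reproduced with a small toy experiment. The sketch below is not the cumulant-based algorithm used in this paper: it assumes only two sources, a hypothetical mixing matrix, and a simple grid search over one rotation angle that maximizes the summed squared excess kurtosis of the rotated PCs. All names and numbers are illustrative.

```python
import numpy as np


def excess_kurtosis(x: np.ndarray) -> float:
    """Time-averaged excess kurtosis E(x^4)/E(x^2)^2 - 3 of a centred series."""
    x = x - x.mean()
    return float(np.mean(x**4) / np.mean(x**2) ** 2 - 3.0)


def rotate(y: np.ndarray, angle: float) -> np.ndarray:
    """Apply a plane rotation to a two-row matrix of components."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]]) @ y


# Two known temporal sources over 120 months (toy set-up, not the GLDAS patterns).
t = np.arange(120, dtype=float)
sources = np.vstack([t - t.mean(), np.cos(2.0 * np.pi * t / 12.0)])
sources /= sources.std(axis=1, keepdims=True)

# Hypothetical mixing: three "grid cells", each seeing both sources.
mixing = np.array([[1.0, 0.8], [0.6, -0.5], [0.9, 0.3]])
data = mixing @ sources
data -= data.mean(axis=1, keepdims=True)

# PCA by SVD; the scaled right singular vectors are unit-variance, uncorrelated PCs.
_, _, vt = np.linalg.svd(data, full_matrices=False)
pcs = np.sqrt(t.size) * vt[:2]

# Rotation step: search the angle that maximizes the summed squared kurtosis.
angles = np.linspace(0.0, np.pi / 2.0, 901)
contrast = [sum(excess_kurtosis(row) ** 2 for row in rotate(pcs, a)) for a in angles]
ics = rotate(pcs, angles[int(np.argmax(contrast))])


def best_match(components: np.ndarray) -> np.ndarray:
    """Largest absolute correlation of each true source with any component."""
    corr = np.corrcoef(np.vstack([sources, components]))[:2, 2:]
    return np.abs(corr).max(axis=1)


print("PCA:", np.round(best_match(pcs), 3))
print("ICA:", np.round(best_match(ics), 3))
```

In this set-up the PCs should remain mixtures of trend and annual cycle, whereas the rotated components should align closely with the two sources, up to sign and ordering.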

5 Numerical results

5.1 Processing of GFZ GRACE solutions

TWS data were computed as described in Sect. 2.1 using the products of GFZ Potsdam. Before implementing the separation methods, we arranged the TWS maps in a matrix in the same way as in the simulation. Then the data are centred. To check the data for non-Gaussianity, we first computed the kurtosis of the time series using $E(x^4)/E(x^2)^2 - 3$, where $E$ is the expectation operator. The computed kurtosis values are summarized in Fig. 3 (top), showing that 52.6% of the time series exhibit an absolute value of kurtosis of more than 0.5, which can be considered as indicating a non-Gaussian distribution. The non-Gaussianity of the time series has also been tested alternatively, using statistical tests such as the Pearson chi-square (PCH), Lilliefors, Jarque-Bera (JB), Shapiro-Wilk (SW) and Anderson-Darling (AD) tests (for a summary of these tests see, e.g. Thode 2002). The results of these tests, when referred to the 68% significance level, confirmed that more than 50% of the time series exhibit a non-Gaussian kurtosis.

To understand whether removing pre-defined frequencies will drive the data towards Gaussianity, we corrected the time series for a linear trend and a 365.25-day annual cycle (as, e.g. in Chambers 2006); then we recomputed the kurtosis map, which is shown in Fig. 3 (bottom). The results indicate that the corrected time series are still significantly non-Gaussian. This can be clearly seen for regions like Greenland, Western Antarctica, Mackenzie, Western Canada and Northwest Australia, where signals are sub-Gaussian. In this context, one should mention that even dominant cycles detected by the GRACE satellites are not necessarily explained by a fundamental annual cycle and its overtones (Schmidt et al. 2008). Furthermore, the hydrological long-term patterns are not necessarily linear in time (see, e.g. Ogawa 2010; Velicogna 2009). Therefore, for the decomposition procedure we chose not to correct the time series for any such pre-defined model, whose choice would inevitably affect the interpretation of the residual signals.
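The kurtosis check and the removal of a trend and annual cycle described above can be sketched as follows. The data are random stand-ins rather than GRACE TWS, the function names are hypothetical, and a monthly sampling of 30.4375 days is assumed; the 0.5 threshold is the one quoted in the text.

```python
import numpy as np


def excess_kurtosis(series: np.ndarray) -> np.ndarray:
    """E(x^4)/E(x^2)^2 - 3 along the time axis (last axis) of centred series."""
    x = series - series.mean(axis=-1, keepdims=True)
    return np.mean(x**4, axis=-1) / np.mean(x**2, axis=-1) ** 2 - 3.0


def remove_trend_and_annual(series: np.ndarray, t_days: np.ndarray) -> np.ndarray:
    """Least squares removal of an offset, a linear trend and a 365.25-day cycle."""
    w = 2.0 * np.pi / 365.25
    design = np.column_stack([np.ones_like(t_days), t_days,
                              np.cos(w * t_days), np.sin(w * t_days)])
    coeffs, *_ = np.linalg.lstsq(design, series.T, rcond=None)
    return series - (design @ coeffs).T


# Stand-in data: 500 grid points x 93 months of uniform noise (NOT GRACE data).
rng = np.random.default_rng(1)
t_days = 30.4375 * np.arange(93)
tws = rng.uniform(-1.0, 1.0, size=(500, 93))

kurt = excess_kurtosis(tws)
share = 100.0 * np.mean(np.abs(kurt) > 0.5)
print(f"{share:.1f}% of the series have |kurtosis| > 0.5")

# Kurtosis map after removing the pre-defined trend and annual model.
residual_kurt = excess_kurtosis(remove_trend_and_annual(tws, t_days))
```

Applied to a real data matrix, `kurt` and `residual_kurt` would correspond to the two kinds of maps described in the text, before and after removing the pre-defined model.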

Fig. 3 Maps of kurtosis derived from evaluating $E(x^4)/E(x^2)^2 - 3$ on (top) the GFZ's DDK2-filtered TWS and (bottom) the same time series after removing a linear trend and a 365.25-day annual signal. In the bottom figure, the color bar is changed to better demonstrate the non-Gaussianity details. Note how residual variability (e.g. for Greenland, west of Antarctica, west of Canada, Mackenzie and northwest of Australia) maps into sub-Gaussian kurtosis

5.1.1 PCA/EOF results

From the centred data matrix, the sample auto-covariance matrix was computed. The auto-covariance matrix is diagonalized by eigenvalue decomposition using Eq. (3) (the PCA method). The spectrum of eigenvalues with their variance percentages is shown in Fig. 4. As the figure shows, the first two eigenvalues are very close to each other and considerably larger than the rest of the eigenvalues. However, according to the rule of thumb of North et al. (1982), the first four components, containing around 70% of the variation in TWS, are well separated from the rest of the eigenvalues. These components contribute 28.2, 25.9, 9.7 and 6.1% of the total variation, respectively. In Fig. 5 we show two additional EOFs and PCs because they seem to contain inter-annual signals. However, their eigenvalues are close to each other and are considerably smaller than the first four eigenvalues. The fractions of the total variance explained by these two components are 3.2 and 2.6%.
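The eigenvalue spectrum and the separation check can be sketched as follows. The sketch uses random stand-in data, hypothetical function names, and one common reading of the rule of thumb of North et al. (1982), in which the sampling error of an eigenvalue is about the eigenvalue times sqrt(2/N). Taking N as the number of months is an assumption that ignores temporal correlation.

```python
import numpy as np


def pca_spectrum(data: np.ndarray):
    """Eigenvalues of the sample covariance of a (time x grid) matrix,
    their variance percentages and rule-of-thumb sampling errors."""
    n_time = data.shape[0]
    centred = data - data.mean(axis=0)
    cov = centred.T @ centred / n_time
    eigvals = np.linalg.eigvalsh(cov)[::-1]            # descending order
    eigvals = np.clip(eigvals, 0.0, None)              # remove round-off negatives
    percent = 100.0 * eigvals / eigvals.sum()
    sampling_error = eigvals * np.sqrt(2.0 / n_time)   # N assumed equal to n_time
    return eigvals, percent, sampling_error


def well_separated(eigvals: np.ndarray, sampling_error: np.ndarray) -> np.ndarray:
    """True where the gap to the next eigenvalue exceeds the sampling error."""
    return (eigvals[:-1] - eigvals[1:]) > sampling_error[:-1]


# Stand-in data (93 months x 40 grid points of random numbers, NOT GRACE data).
rng = np.random.default_rng(2)
data = rng.standard_normal((93, 40)) * np.linspace(3.0, 0.5, 40)
eigvals, percent, err = pca_spectrum(data)
print(np.round(percent[:6], 1))
print(well_separated(eigvals, err)[:6])
```

Eigenvalues whose spacing is smaller than the sampling error form an effectively degenerate group, which is why closely spaced pairs such as the first two modes deserve care when interpreted individually.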




Fig. 4 Eigenvalue spectrum derived from implementing the PCA method on the GFZ data set

In this contribution, all decomposition results are presented in such a way that the temporal patterns are normalized by their standard deviation, rendering them unit-less. Correspondingly, the spatial patterns are scaled by these standard deviations, thus representing anomaly maps in millimetres. One can reconstruct each mode of TWS variability by multiplying the spatial patterns with their corresponding temporal components.

As the PCs in Fig. 5 illustrate, a long-term trend modulated by low-amplitude annual and semi-annual signals is captured in the first component (PC1). PC2, PC3 and PC4 contain an annual signal; however, PC2 and PC3 are contaminated with semi-annual signals and a frequency close to the S2 tidal aliasing period of 161 days. PC5 and PC6 mainly contain a semi-annual signal along with some long-wave cycles. The PCA method by definition cannot capture a periodic signal in one single mode (e.g. see Xu and von Storch 1990), in particular when the eigenvalues are close to each other as in the present case. Chambers (2006) has also reported this problem in the study of the annual oceanic mass variations. A closer look at the spatial patterns of EOF2, EOF3 and EOF4 confirms that, e.g. the mass anomaly over the Amazon is repeated in several modes. EOF5 and EOF6 show mainly a superposition of annual and semi-annual patterns.

5.1.2 VARIMAX REOF results

To assess the performance of the REOF methods for the decomposition of GRACE TWS data, we implement the rotation both for the VARIMAX and ICA criteria and in the two ways described in Sects. 3.1 and 3.2. In both cases, the leading four EOFs and PCs are first selected to be rotated, and afterwards the following two components are rotated separately. This procedure was chosen to decrease the sensitivity of the rotation with respect to the number of components (Jolliffe 1989). The results of rotating the EOFs and PCs are shown in Figs. 6 and 7, respectively. Applying the VARIMAX criterion (Eq. (5)) to the EOFs (for the original EOFs and PCs see Fig. 5), the first REOF is again related to the long-term trend, which is now less contaminated by an annual signal compared with PC1 in Fig. 5; the second, third and fourth components express a mixture of annual and semi-annual signals. The last two components have changed considerably, now capturing more trend while being contaminated with a semi-annual signal. In summary, REOFs computed by VARIMAX differ from conventional EOFs, but they do not appear to be more suited for interpretation. Implementing VARIMAX on the PCs did not significantly improve the results; almost all the derived RPCs are contaminated with an annual signal (see Fig. 7). Therefore, this method will not be discussed further here.

5.1.3 Temporal and spatial ICA results

The PCA time series still exhibit a significant level of non-Gaussianity, as we have validated by computing the kurtosis. Our analyses also indicate that the PCA components are not independent: this becomes evident when we use these components (PCs or EOFs) to construct the fourth-order cumulant tensor of Eq. (8). Remember that when the introduced components are independent, the matrix Q(M) should be diagonal or close to diagonal (Cardoso and Souloumiac 1993). However, our results show that Q(M) is clearly not diagonal. Therefore, a rotation towards independence is constructed with the ICA method, following both the spatial and temporal implementation. The rotation matrix is found from Eq. (9) and is applied to the components using Eq. (4).

The results of the temporal ICA are shown in Fig. 8. The spatial patterns are generally more localized compared with the PCA results, and the temporal components appear to be less clustered regarding different time scales. This is seen, to some extent, in the spatial pattern of IC1 (see Fig. 8), where the spatial component captures the most dominant trends that have been detected over the polar regions such as Greenland, Alaska and Antarctica. IC1 thus represents a considerable ice-mass loss during the period of study (cf. van den Broeke et al. 2009; Velicogna and Wahr 2005; Arendt et al. 2009).
Some decrease is also detected, for example in the north-west of Australia. In contrast, signals of moderate mass increase have been detected over the northern part of South America, and crustal uplift in the northern part of the Canadian and Scandinavian shields (Rangelova and Sideris 2008; Timmen et al. 2006). Unlike PC1, IC1 does not show a significant annual behaviour (see Fig. 5 for comparison). It should be mentioned here that for this study we did not apply an a priori post glacial rebound (PGR) correction for those mentioned areas (e.g. van der Wal et al. 2008).

The annual cycle is extracted in IC2 and IC3 with different phases. The cross-correlation of IC2 and IC3 indicates that these two components are uncorrelated at time lag zero, but the maximum correlation of 0.93 is reached for a 3-month lag. The annual signal is clearly evident in the tropical regions, such as South America, Africa, South Asia and northern Australia. Yet, there are also weaker annual signals in mid- and higher-latitude regions of the northern hemisphere, such as Alaska, Siberia and northern and central Europe, that might be related to variations in snow and ice cover. IC4 and IC5 mainly capture the semi-annual and 161-day cycles. Spatial patterns corresponding to IC5 and IC6 represent anomalies over the high-latitude and tropical regions. Finally, the pattern of IC6 is more complex than the first five components and we do not attempt to interpret it here.

The results of the spatial ICA (selecting EOFs to find the optimal rotation matrix) are shown in Fig. 9. As with the temporal ICA, this method appears to perform well in the sense that the long-term and the annual signals are well separated into three different components. IC4 and IC5 contain the semi-annual cycle. However, they also capture some annual variability, with their spatial concentration being different from IC2 and IC3. As with the temporal ICA, we do not attempt to interpret IC6.

Fig. 5 Implementing the PCA method over 93 months of global TWS computed from GRACE products provided by the GFZ centre, filtered by the DDK2 filter of Kusche et al. (2009). The components are ordered according to the size of their corresponding eigenvalues. One can reconstruct each mode of TWS variability by multiplying the spatial patterns with their corresponding temporal components

Fig. 6 Implementing VARIMAX REOF on the GFZ products. The rotation is done using the first four EOFs (EOF1 to EOF4) and then the last two components (EOF5 and EOF6) of Fig. 5. The components are ordered with respect to the magnitude of total variance they represent
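The lag relation reported for IC2 and IC3 (uncorrelated at lag zero, maximum correlation at a 3-month lag) is what one expects for two annual signals a quarter cycle apart. A minimal sketch with synthetic stand-ins for the two components follows; the signals and the helper function are hypothetical and only illustrate the computation.

```python
import numpy as np


def lagged_correlation(x: np.ndarray, y: np.ndarray, lag: int) -> float:
    """Correlation between x(t) and y(t + lag) for a non-negative lag in samples."""
    if lag == 0:
        return float(np.corrcoef(x, y)[0, 1])
    return float(np.corrcoef(x[:-lag], y[lag:])[0, 1])


# Two synthetic annual signals a quarter cycle apart (stand-ins for IC2 and IC3).
months = np.arange(93)
first = np.cos(2.0 * np.pi * months / 12.0)
second = np.sin(2.0 * np.pi * months / 12.0)   # equals `first` delayed by 3 months

lags = range(0, 7)
corr = [lagged_correlation(first, second, k) for k in lags]
best_lag = int(np.argmax(corr))
print(f"lag 0: {corr[0]:+.2f}, best lag: {best_lag} months")
```

For noise-free sinusoids the maximum lagged correlation is essentially one; a value below one, such as the 0.93 found for the GRACE-derived components, is consistent with components that are not purely annual.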


Fig. 7 Implementing the VARIMAX rotation using PC1 to PC6 of Fig. 5. The rotation is done first by selecting the first four PCs and then the remaining two components. The results are ordered similarly to the results of the PCA method


5.2 Processing of ITG2010

Data preprocessing was applied to the ITG2010 series exactly as it was performed for the GFZ solutions, including the non-Gaussianity tests. The computed kurtosis indicates that 54.2% of the grid points exhibit a significant non-Gaussian behaviour. This is confirmed by applying normality tests. In the light of the results obtained so far, for analysing the ITG2010 solutions we restricted ourselves to the PCA and ICA methods.

5.2.1 PCA results

The results of implementing PCA on the ITG TWS data are briefly summarized in Fig. 10. According to North's test, we find the first four components to be well separated from the remaining ones. However, as with the GFZ solutions, the first six components are selected to be rotated within the ICA. They explain about 82% of the total variability, contributing 34.2, 25.8, 9.7, 6.7, 3.5 and 2.1%, respectively. As Fig. 10 indicates, implementing PCA results again in a mixing behaviour, even stronger than with the GFZ solutions. Spectral analysis reveals that PC1, PC2, PC3 and PC4 mainly contain the annual signal. PC1 and PC2 contain the long-term variation, and their corresponding spatial patterns (EOF1 and EOF2) show a clear repeating behaviour. PC3 and PC4 exhibit a strong nearly 1/162-day frequency and a less powerful semi-annual wave besides the annual cycle. PC5 contains mainly annual and semi-annual signals and PC6 mostly a semi-annual signal.

Fig. 8 The results of implementing temporal ICA on GFZ data by selecting the first six PCs of Fig. 5. The rotation is done by first selecting the first four PCs and then the remaining two components. The derived independent components are ordered with respect to the magnitude of total variance they represent (similar to the results of PCA)
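A spectral check of this kind can be sketched with a simple periodogram. The example below is only an assumption about how such an analysis might be set up: it uses a synthetic, evenly sampled monthly series with an annual and a weaker semi-annual cycle, not the actual PCs, and the function name is hypothetical.

```python
import numpy as np


def periodogram(series: np.ndarray, dt_months: float = 1.0):
    """Return periods (in months) and spectral power of an evenly sampled series."""
    x = series - series.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=dt_months)
    return 1.0 / freqs[1:], power[1:]     # skip the zero frequency


# Synthetic PC: annual cycle plus a weaker semi-annual cycle over 8 years.
months = np.arange(96)
pc = np.cos(2 * np.pi * months / 12.0) + 0.4 * np.sin(2 * np.pi * months / 6.0)

periods, power = periodogram(pc)
order = np.argsort(power)[::-1]
print("dominant periods (months):", periods[order[:2]])
```

Periods that do not coincide with a Fourier frequency of the record, such as a cycle of about 161 days in a series of a few years, leak into neighbouring bins; a least squares fit of sinusoids at prescribed periods may then be the more suitable tool.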


Fig. 9 Spatial ICA analysis of the products of the GFZ centre using first EOF1 to EOF4 and then EOF5 and EOF6 of Fig. 5 for rotation. The independent components are ordered with respect to the amplitude of variations they represent


5.2.2 Temporal and spatial ICA results

Results of the temporal ICA applied to the ITG TWS are shown in Fig. 11, illustrating that the decomposition is comparable with the results for the GFZ products (see also Figs. 8, 9). Given that the ITG series (covering until August 2009) is shorter than the GFZ series (covering until October 2010), ICA apparently shows a robust behaviour.

Concerning the derived patterns, IC2 and IC3 represent the annual variations, IC1 the dominant trend, and IC4 and IC5 the semi-annual and 1/162-day frequencies. The amplitude of the 1/162-day cycle in IC4 is stronger than that of the semi-annual cycle. This situation is reversed for IC5. The trend (IC1) and annual patterns (IC2 and IC3) are comparable to the GFZ results. The spatial pattern of IC4 divides the Earth into zonal bands, which is typical for the C20 effect. The amplitude of the 1/162-day cycle appears somewhat larger than that computed in the GFZ result (e.g. for the Gulf of Thailand and the Arabian Sea), something which has been observed before. IC6 shows a long-term variation along with a semi-annual cycle over regions such as Lake Victoria, which corresponds to mass loss until the first month of 2007 and mass gain from January 2007 until August 2009. The opposite behaviour is detected over, e.g. the Caspian Sea.

Results of the spatial ICA are summarized in Fig. 12. Like the temporal ICA, this method appears to be able to separate the spatial signals. It is worth mentioning here that for both the GFZ and ITG solutions, the trend and annual signals are separated better than the other signals, possibly owing to the dominant effect of those components.

Fig. 10 PCA analysis of the product of Bonn University (ITG2010), filtered by the DDK2 filter of Kusche et al. (2009). According to the computed eigenvalues, the first four components are well separated from the other components. The results are ordered with respect to the size of their corresponding eigenvalues


Fig. 11 Temporal ICA of the ITG2010 products using the PCs of Fig. 10. The rotation is done in two steps similar to the ICA of the GFZ products, and the derived independent components are ordered with respect to the variance they represent


6 Discussion and outlooks

The PCA method has often been used to extract modes of variability from GRACE and model-based TWS time series. In this contribution, the performance of PCA has first been investigated for the decomposition of GRACE TWS signals. Then, the VARIMAX rotation was considered in order to overcome some possible drawbacks of PCA related to orthogonality and to the interpretation of the obtained patterns. Finally, we considered ICA, which like VARIMAX has been interpreted as a rotation method starting from the PCA components. ICA not only decorrelates the data set, it goes one step further by finding components that are mutually independent. All three methods have been applied in a simulation with known source signals and to real GRACE TWS. Our major findings can be summarized as follows:


Fig. 12 Spatial ICA decompositions implemented on the ITG2010 products using the EOFs of Fig. 10. The two-step rotation and ordering are done similarly to the temporal case


The geometrical properties of PCA can be very useful, since the covariance matrix of any subset of retained PCs is always diagonal. PCA also captures the dominant part of the variance in the data set when the components are ordered by descending magnitude of the singular values. From an interpretation point of view, however, the PCA method can be misleading when the components are treated individually, since in concentrating the variance exhibited in the data set it combines many actually separate signals into its retained components. This was evident when we implemented PCA on the synthetic example and on both the ITG2010 and GFZ solutions, where most of the PCs apparently contained a mixture of signals (see Figs. 2, 5 and 10).

In theory, the PCA results may be enhanced to some extent by VARIMAX rotation. However, for the separation



