
Advanced data assimilation methods: EKF and EnKF

Alghero, Lecture 4

Hong Li and Eugenia Kalnay
University of Maryland
Optimal Interpolation for a scalar

Information available in OI:

$T_b = T_t + \varepsilon_b$  (background or forecast);  $\overline{\varepsilon_b} = 0$, $\overline{\varepsilon_b \varepsilon_b} = \sigma_b^2$
$T_o = T_t + \varepsilon_o$  (new observations);  $\overline{\varepsilon_o} = 0$, $\overline{\varepsilon_o \varepsilon_o} = \sigma_o^2$

analysis = forecast + optimal weight * observational increment:

$T_a = T_b + w(T_o - T_b)$

Optimal weight:  $w = \dfrac{\sigma_b^2}{\sigma_b^2 + \sigma_o^2}$

Analysis error:  $\sigma_a^2 = (\sigma_b^{-2} + \sigma_o^{-2})^{-1} = (1 - w)\,\sigma_b^2$
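As a minimal sketch, the scalar OI update can be written directly in NumPy; the background and observation values and variances below are made-up numbers for illustration:

```python
import numpy as np

# Illustrative numbers (not from the lecture): background T_b with error
# variance var_b, one new observation T_o with error variance var_o.
T_b, var_b = 10.0, 1.0
T_o, var_o = 12.0, 1.0

w = var_b / (var_b + var_o)      # optimal weight
T_a = T_b + w * (T_o - T_b)      # analysis = forecast + w * observational increment
var_a = (1.0 - w) * var_b        # analysis error variance

# The two equivalent forms of the analysis error variance agree:
assert np.isclose(var_a, 1.0 / (1.0 / var_b + 1.0 / var_o))
print(T_a, var_a)                # 11.0 0.5
```

With equal variances the analysis falls halfway between background and observation, and the analysis error variance is half the background one.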


Recall the basic formulation in OI

• OI for a scalar:
  $T_a = T_b + w(T_o - T_b)$
  The optimal weight that minimizes the analysis error is
  $w = \dfrac{\sigma_b^2}{\sigma_b^2 + \sigma_o^2}$

• OI for a vector (true atmosphere advanced from t = 0 to t = 6 h):
  $x_i^f = M x_{i-1}^a$
  $x_i^a = x_i^b + W(y_i^o - H x_i^b)$
  $W = B H^T [H B H^T + R]^{-1}$
  $P^a = [I - WH]B$,  $B = \overline{\delta x_b \, \delta x_b^T}$,  $\delta x_b = x_b - x_t$

• B is statistically pre-estimated, and constant with time in its practical implementations. Is this a good approximation?
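For the vector case, the same equations can be exercised on a toy problem; the sizes, H, B and R below are all arbitrary choices for illustration, with B pre-estimated and held fixed as in OI:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 4, 2                                    # toy state and observation sizes
H = np.zeros((p, n)); H[0, 0] = H[1, 2] = 1.0  # observe components 0 and 2 only
B = 0.5 * np.eye(n) + 0.1                      # pre-estimated, time-constant B
R = 0.25 * np.eye(p)                           # observation error covariance

x_b = rng.standard_normal(n)                   # background state
y_o = rng.standard_normal(p)                   # observations

W = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # optimal weight matrix
x_a = x_b + W @ (y_o - H @ x_b)                # analysis
P_a = (np.eye(n) - W @ H) @ B                  # analysis error covariance
```

Note that although only components 0 and 2 are observed, components 1 and 3 are also corrected, through the off-diagonal terms of B.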
OI and Kalman Filter for a scalar

• OI for a scalar:
  $T_b(t_{i+1}) = M(T_a(t_i))$;  assume $\sigma_b^2 = (1+a)\sigma_a^2 = \dfrac{1}{1-w}\sigma_a^2 = \text{const.}$
  $w = \dfrac{\sigma_b^2}{\sigma_b^2 + \sigma_o^2}$;  $\sigma_a^2 = (1-w)\sigma_b^2$
  $T_a = T_b + w(T_o - T_b)$

• Kalman Filter for a scalar:
  $T_b(t_{i+1}) = M(T_a(t_i))$;  $\sigma_b^2(t) = (L\sigma_a)(L\sigma_a)^T$;  $L = dM/dT$
  $w = \dfrac{\sigma_b^2}{\sigma_b^2 + \sigma_o^2}$;  $\sigma_a^2 = (1-w)\sigma_b^2$
  $T_a = T_b + w(T_o - T_b)$

• Now the background error variance is forecast using the tangent linear model L and its adjoint $L^T$.
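The only difference between the two is how the background error variance is obtained. A minimal sketch with an assumed linear model M(T) = aT (so L = dM/dT = a; all numbers invented):

```python
import numpy as np

a, var_o = 0.9, 1.0          # assumed model coefficient and obs error variance
L = a                        # tangent linear model L = dM/dT for M(T) = a*T

def oi_analysis(T_b, var_b, T_o):
    """Scalar analysis step: identical in OI and in the Kalman filter."""
    w = var_b / (var_b + var_o)
    return T_b + w * (T_o - T_b), (1.0 - w) * var_b

T_a, var_a = 8.0, 0.5
for T_o in [7.5, 8.2, 7.9]:  # a few assimilation cycles with invented obs
    T_b = a * T_a            # forecast step
    var_b = L * var_a * L    # KF: background variance forecast with L (and L^T)
    T_a, var_a = oi_analysis(T_b, var_b, T_o)
```

In OI the `var_b` line would instead assign a fixed constant; in the Kalman filter it evolves with the dynamics from cycle to cycle.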
“Errors of the day” computed with the Lorenz 3-variable model: compare with the rms (constant) error

$\sigma_z^2 = \overline{(z_b - z_t)^2} = 0.08$

Not only the amplitude, but also the structure of B is constant in 3D-Var/OI.
This is important because analysis increments occur only within the subspace spanned by B.
“Errors of the day” obtained in the NCEP reanalysis (figs. 5.6.1 and 5.6.2 in the book)

Note that the mean error went down from 10 m to 8 m because of the improved observing system, but the “errors of the day” (on a synoptic time scale) are still large.
In 3D-Var/OI, not only the amplitude but also the structure of B is fixed with time.
Flow independent error covariance

If we observe only Washington, D.C., we can estimate corrections for Richmond and Philadelphia through the error correlations (the off-diagonal terms in B):

$B = \begin{pmatrix} \sigma_1^2 & \mathrm{cov}_{1,2} & \cdots & \mathrm{cov}_{1,n} \\ \mathrm{cov}_{1,2} & \sigma_2^2 & \cdots & \mathrm{cov}_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}_{1,n} & \mathrm{cov}_{2,n} & \cdots & \sigma_n^2 \end{pmatrix}$

In OI (or 3D-Var), the scalar error correlation between two points in the same horizontal surface is assumed homogeneous and isotropic (p. 162 in the book).
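A homogeneous, isotropic B of this kind can be sketched on a 1-D line of grid points; the Gaussian shape and the length scale below are illustrative assumptions, not the operational choice:

```python
import numpy as np

n = 5
sigma_b = 1.0              # same error std at every point (homogeneous)
L_corr = 2.0               # assumed correlation length scale

x = np.arange(n, dtype=float)                       # grid-point coordinates
dist = np.abs(x[:, None] - x[None, :])              # pairwise distances
B = sigma_b**2 * np.exp(-0.5 * (dist / L_corr)**2)  # isotropic Gaussian correlation

# An observation at point 0 corrects its neighbours through the
# off-diagonal terms of B (column 0), independently of the flow.
print(np.round(B[:, 0], 3))
```

The correlation depends only on the distance between the two points, which is exactly what makes the increments flow independent.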
Typical 3D-Var error covariance

$B = \begin{pmatrix} \sigma_1^2 & \mathrm{cov}_{1,2} & \cdots & \mathrm{cov}_{1,n} \\ \mathrm{cov}_{1,2} & \sigma_2^2 & \cdots & \mathrm{cov}_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}_{1,n} & \mathrm{cov}_{2,n} & \cdots & \sigma_n^2 \end{pmatrix}$

In OI (or 3D-Var), the error correlation between two mass points in the same horizontal surface is assumed homogeneous and isotropic (p. 162 in the book).
Suppose we have a 6-hr forecast (background) and new observations

The 3D-Var analysis doesn’t know about the errors of the day.
Observations: ~10^5-10^7 d.o.f. (covariance R).  Background: ~10^6-10^8 d.o.f. (covariance B).

With Ensemble Kalman Filtering we get perturbations pointing in the directions of the “errors of the day”

Observations: ~10^5-10^7 d.o.f.  Background: ~10^6-10^8 d.o.f.
3D-Var analysis: doesn’t know about the errors of the day.  Errors of the day: they lie on a low-dimensional attractor.
Ensemble Kalman Filtering is efficient because matrix operations are performed in the low-dimensional space of the ensemble perturbations

Ensemble Kalman Filter analysis: correction computed in the low-dimensional ensemble space.
Observations: ~10^5-10^7 d.o.f.  Background: ~10^6-10^8 d.o.f.
3D-Var analysis: doesn’t know about the errors of the day.  Errors of the day: they lie on a low-dimensional attractor.

After the EnKF computes the analysis and the analysis error covariance A, the new ensemble initial perturbations $\delta a_i$ are computed so that

$\sum_{i=1}^{k+1} \delta a_i \, \delta a_i^T = A$

These perturbations represent the analysis error covariance and are used as initial perturbations for the next ensemble forecast.
Observations: ~10^5-10^7 d.o.f.  Background: ~10^6-10^8 d.o.f.  Errors of the day: they lie on the low-dimensional attractor.
Flow-dependence - a simple example (Miyoshi, 2004)

There is a cold front in our area… What happens in this case?

The flow-independent (isotropic) correction is not appropriate; the flow-dependent correction does reflect the structure of the front.
Extended Kalman Filter (EKF)

• Forecast step:
  $x_i^b = M x_{i-1}^a$
  $P_i^b = L_{i-1} P_{i-1}^a L_{i-1}^T + Q$   (*)
  (error evolution: $\varepsilon_i^b = L_{i-1}\varepsilon_{i-1}^a + \varepsilon_{i-1}^m$)

• Analysis step:
  $x_i^a = x_i^b + K_i(y_i^o - H x_i^b)$
  $K_i = P_i^b H^T [H P_i^b H^T + R]^{-1}$   (**)
  $P_i^a = [I - K_i H] P_i^b = [(P_i^b)^{-1} + H^T R^{-1} H]^{-1}$   (n x n)

• Using the flow-dependent $P_i^b$, the analysis is expected to improve significantly.

• However, it is computationally hugely expensive: $P_i^b$ and $L_i$ are n x n matrices with n ~ 10^7, so computing equation (*) directly is impossible.
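For a small n, the EKF equations (*) and (**) can be written out directly; the toy rotation model below is an assumption chosen so that L is exact:

```python
import numpy as np

n, theta = 3, 0.1
# Toy linear model: a rotation in the (0,1) plane, so L equals M exactly.
L = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
Q = 0.01 * np.eye(n)             # model error covariance (assumed)
H = np.array([[1.0, 0.0, 0.0]])  # observe the first component only
R = np.array([[0.1]])

x_a, P_a = np.ones(n), np.eye(n)
y_o = np.array([1.2])            # invented observation

# Forecast step (*): state and covariance
x_b = L @ x_a
P_b = L @ P_a @ L.T + Q

# Analysis step (**)
K = P_b @ H.T @ np.linalg.inv(H @ P_b @ H.T + R)
x_a = x_b + K @ (y_o - H @ x_b)
P_a = (np.eye(n) - K @ H) @ P_b
```

With n ~ 10^7, the n x n products in the forecast step (*) are exactly what makes the full EKF unaffordable.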
Ensemble Kalman Filter (EnKF)

$P_i^b = L_{i-1} P_{i-1}^a L_{i-1}^T + Q$   (*)

• Although the dimension of $P_i^b$ is huge, $\mathrm{rank}(P_i^b) \ll n$ (dominated by the errors of the day).

• Use an ensemble to estimate it:
  $P_i^b \approx \dfrac{1}{m}\sum_{k=1}^{m}(x_k^f - x_t)(x_k^f - x_t)^T$,  ideally $m \to \infty$
  $P_i^b \approx \dfrac{1}{K-1}\sum_{k=1}^{K}(x_k^f - \bar{x}^f)(x_k^f - \bar{x}^f)^T = \dfrac{1}{K-1} X^b X^{bT}$
  with K ensemble members, $K \ll n$.

Physically, the “errors of the day” are the instabilities of the background flow. Strong instabilities have a few dominant shapes (perturbations lie in a low-dimensional subspace).

• It makes sense to assume that large errors are in similarly low-dimensional spaces that can be represented by a low-order EnKF.

• Problem left: how to update the ensemble? I.e., how to get $x_i^a$ for each ensemble member?
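The sample estimate of $P^b$ and its low rank can be checked in a few lines (sizes are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 100, 10                     # state size n, K ensemble members, K << n
ens = rng.standard_normal((n, K))  # forecast ensemble, one member per column

Xb = ens - ens.mean(axis=1, keepdims=True)  # perturbation matrix X^b (n x K)
P_b = Xb @ Xb.T / (K - 1)                   # sample estimate of P^b

# rank(P_b) <= K - 1 << n: the estimate lives entirely in the subspace
# spanned by the ensemble perturbations.
print(np.linalg.matrix_rank(P_b))
```

This is why all matrix operations can be carried out in the K-dimensional ensemble space rather than in the n-dimensional model space.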
Ensemble Update: two approaches

1. Perturbed Observations method: an “ensemble of data assimilations”

  $x_i^b(k) = M x_{i-1}^a(k)$
  $P_i^b \approx \dfrac{1}{K-1}\sum_{k=1}^{K}(x_k^b - \bar{x}^b)(x_k^b - \bar{x}^b)^T$
  $K_i = P_i^b H^T [H P_i^b H^T + R]^{-1}$
  $y_i^o(k) = y_i^o + \text{noise}$
  $x_i^a(k) = x_i^b(k) + K_i(y_i^o(k) - H x_i^b(k))$

• It has been proven that an observational ensemble is required (e.g., Burgers et al. 1998); otherwise $P_i^a = [I - K_i H] P_i^b$ is not satisfied.

• Random perturbations are added to the observations to obtain the observations for each independent cycle.

• However, perturbing the observations introduces a source of sampling errors (Whitaker and Hamill, 2002).
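A sketch of one perturbed-observations analysis cycle (linear H, Gaussian noise drawn from R; all sizes and values are invented):

```python
import numpy as np

rng = np.random.default_rng(2)
n, K, p = 40, 20, 10
H = np.eye(p, n)                     # observe the first p components
r = 0.5
R = r * np.eye(p)                    # observation error covariance

ens_b = rng.standard_normal((n, K))  # background ensemble x_i^b(k)
y_o = rng.standard_normal(p)         # observation vector

Xb = ens_b - ens_b.mean(axis=1, keepdims=True)
P_b = Xb @ Xb.T / (K - 1)
K_gain = P_b @ H.T @ np.linalg.inv(H @ P_b @ H.T + R)

# Each member assimilates its own perturbed observations y_o + noise,
# with the noise drawn from R (the observational ensemble).
ens_a = np.empty_like(ens_b)
for k in range(K):
    y_k = y_o + rng.normal(scale=np.sqrt(r), size=p)
    ens_a[:, k] = ens_b[:, k] + K_gain @ (y_k - H @ ens_b[:, k])
```

Without the added noise, the analysis ensemble spread would be too small to be consistent with $P_i^a = [I - K_i H] P_i^b$.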
Ensemble Update: two approaches

2. Ensemble square root filter (EnSRF)

  $\bar{x}_i^b = M \bar{x}_{i-1}^a$
  $P_i^b \approx \dfrac{1}{K-1}\sum_{k=1}^{K}(x_k^b - \bar{x}^b)(x_k^b - \bar{x}^b)^T$
  $K_i = P_i^b H^T [H P_i^b H^T + R]^{-1}$
  $\bar{x}_i^a = \bar{x}_i^b + K_i(y_i^o - H\bar{x}_i^b)$

• Observations are assimilated to update only the ensemble mean.

• Assume the analysis ensemble perturbations can be formed by transforming the forecast ensemble perturbations through a transform matrix:
  $X_i^a = T_i X_i^b$,  $x_i^a = \bar{x}_i^a + X^a$
  $\dfrac{1}{K-1} X^a X^{aT} = P_i^a = [I - K_i H] P_i^b = [I - K_i H]\dfrac{1}{K-1} X^b X^{bT}$
Several choices of the transform matrix

• EnSRF (Andrews 1968; Whitaker and Hamill, 2002), applied one observation at a time:
  $X^a = (I - \tilde{K}H)X^b$,  $\tilde{K} = \alpha K$

• EAKF (Anderson 2001):
  $X^a = A X^b$

• ETKF (Bishop et al. 2001):
  $X^a = X^b T$

• LETKF (Hunt, 2005): based on the ETKF, but performs the analysis simultaneously in a local volume surrounding each grid point
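As a small numerical check (toy sizes, random perturbations), an ETKF-style symmetric square-root transform reproduces exactly the analysis covariance $[I - K_i H]P_i^b$ that the square-root condition requires of $X^a$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, K, p = 8, 5, 3
H = np.eye(p, n)
R = 0.5 * np.eye(p)

Xb = rng.standard_normal((n, K))
Xb -= Xb.mean(axis=1, keepdims=True)          # forecast perturbations

P_b = Xb @ Xb.T / (K - 1)
K_gain = P_b @ H.T @ np.linalg.inv(H @ P_b @ H.T + R)
P_a = (np.eye(n) - K_gain @ H) @ P_b          # target analysis covariance

# ETKF: build the transform in the K-dimensional ensemble space.
Yb = H @ Xb
Pa_tilde = np.linalg.inv((K - 1) * np.eye(K) + Yb.T @ np.linalg.inv(R) @ Yb)
evals, evecs = np.linalg.eigh((K - 1) * Pa_tilde)
T = evecs @ np.diag(np.sqrt(evals)) @ evecs.T  # symmetric square root
Xa = Xb @ T

assert np.allclose(Xa @ Xa.T / (K - 1), P_a)   # covariance requirement met
```

All the heavy linear algebra here is K x K, not n x n, which is the point of working in ensemble space.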
An example of the analysis corrections from 3D-Var (Kalnay, 2004)
An example of the analysis corrections from EnKF (Kalnay, 2004)
Summary of LETKF steps

1) Global 6-hr ensemble forecast starting from the analysis ensemble
2) Choose the observations used for each grid point
3) Compute the matrix of forecast perturbations in ensemble space, $X^b$
4) Compute the matrix of forecast perturbations in observation space, $Y^b$
5) Compute $\tilde{P}^a$ in ensemble space and its symmetric square root
6) Compute $w^a$, the k-vector of perturbation weights
7) Compute the local grid-point analysis and analysis perturbations
8) Gather the new global analysis ensemble; go to 1

[Schematic: GCM ensemble forecast -> observation operator -> ensemble forecast at obs. locations -> LETKF combines these with the observations -> ensemble analysis]
LETKF algorithm summary (Hunt, 2005)

Forecast step (done globally): advance the ensemble 6 hours; global model size n
  $x_n^{b(i)} = M(x_{n-1}^{a(i)})$,  $i = 1, \ldots, k$

Analysis step (done locally): local model dimension m, locally used observations s
  $X^b = X - \bar{x}^b$  (m x k);  $Y^b = H(X) - H(\bar{x}^b) \approx H X^b$  (s x k)
  $\tilde{P}^a = \left[(k-1)I + (HX^b)^T R^{-1} HX^b\right]^{-1}$   $\tilde{P}^a$ in ensemble space (k x k)
  $P^a = X^a X^{aT} = X^b \tilde{P}^a X^{bT}$   in model space (m x m)
  $X^a = X^b (\tilde{P}^a)^{1/2}$   ensemble analysis perturbations in model space (m x k)
  $\bar{w}_n^a = \tilde{P}^a Y^{bT} R^{-1}(y_n^o - \bar{y}_n^b)$   analysis increments in ensemble space (k x 1)
  $\bar{x}_n^a = \bar{x}_n^b + X^b \bar{w}_n^a$   analysis (mean) in model space (m x 1)
  $x_n^a = \bar{x}_n^a + X^a$   analysis ensemble in model space (m x k)

We finally gather the analysis and analysis perturbations from each grid point, construct the new global analysis ensemble (n x k), and go to the next forecast step.
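The local analysis step can be sketched for one local volume (toy sizes, a linear H; following Hunt et al., the perturbation weights here use the symmetric square root of $(k-1)\tilde{P}^a$, a convention that differs from the slide's $(\tilde{P}^a)^{1/2}$ only by a constant factor):

```python
import numpy as np

rng = np.random.default_rng(4)
m, k, s = 6, 4, 3                    # local state m, ensemble size k, local obs s
H = np.eye(s, m)                     # toy linear observation operator
R = 0.25 * np.eye(s)

X = rng.standard_normal((m, k))      # local background ensemble
y_o = rng.standard_normal(s)         # local observations (invented)

x_b = X.mean(axis=1)
Xb = X - x_b[:, None]                                   # (m x k)
Yb = H @ X - (H @ x_b)[:, None]                         # (s x k), ~ H X^b

Rinv = np.linalg.inv(R)
Pa_tilde = np.linalg.inv((k - 1) * np.eye(k) + Yb.T @ Rinv @ Yb)  # (k x k)
w_a = Pa_tilde @ Yb.T @ Rinv @ (y_o - H @ x_b)          # weight vector (k,)

evals, evecs = np.linalg.eigh((k - 1) * Pa_tilde)
W_a = evecs @ np.diag(np.sqrt(evals)) @ evecs.T         # symmetric square root

x_a = x_b + Xb @ w_a                 # analysis mean in model space
Xa = Xb @ W_a                        # analysis perturbations in model space
ens_a = x_a[:, None] + Xa            # local analysis ensemble (m x k)
```

In the full LETKF this computation is repeated independently for every grid point with its own local observations, and the results are gathered into the global analysis ensemble.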
References posted at www.atmos.umd.edu/~ekalnay (Google: “chaos weather umd”, publications)

Ott, Hunt, Szunyogh, Zimin, Kostelich, Corazza, Kalnay, Patil, and Yorke, 2004: Local Ensemble Kalman Filtering. Tellus, 56A, 415-428.

Kalnay, 2003: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, 341 pp. (3rd printing)

Hunt, Kalnay, Kostelich, Ott, Szunyogh, Patil, Yorke, and Zimin, 2004: Four-dimensional ensemble Kalman filtering. Tellus, 56A, 273-277.

Szunyogh, Kostelich, Gyarmati, Hunt, Ott, Zimin, Kalnay, Patil, and Yorke, 2005: Assessing a local ensemble Kalman filter: Perfect model experiments with the NCEP global model. Tellus, 57A, 528-545.

Hunt, Kostelich, and Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. Physica D.

Whitaker and Hamill, 2002: Ensemble data assimilation without perturbed observations. Monthly Weather Review, 130, 1913-1924.
EnKF vs 4D-Var

• EnKF is simple and model-independent, while 4D-Var requires the development and maintenance of the adjoint model (model-dependent).

• 4D-Var can assimilate asynchronous observations, while the EnKF assimilates observations at the synoptic time.

• Using the weights $w^a$ at any time, 4D-LETKF can assimilate asynchronous observations and move them forward or backward to the analysis time.

• Disadvantage of EnKF: the low dimensionality of the ensemble introduces sampling errors in the estimation of $P^b$. “Covariance localization” can solve this problem.