
Worksheet 1 – MATH20802

Korbinian Strimmer

Week 2

Formulas:
Kullback-Leibler (KL) divergence, KL information, relative entropy:

D_{\text{KL}}(F, G) := E_F\left[\log \frac{f(x)}{g(x)}\right]

where F is the true distribution and G the approximation, and f and g are the corresponding densities or
probability mass functions.
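For intuition, a minimal numerical sketch (an illustrative aside, not part of the worksheet) of the KL divergence for two discrete distributions; the probability vectors p and q are made-up example values:

import numpy as np

def kl_divergence(p, q):
    # D_KL(P, Q) = sum_i p_i * log(p_i / q_i) for discrete P, Q (all entries > 0)
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

p = np.array([0.5, 0.3, 0.2])  # "true" distribution P (example values)
q = np.array([0.4, 0.4, 0.2])  # approximating distribution Q
print(kl_divergence(p, q))     # ≈ 0.0253; note D_KL(F, G) != D_KL(G, F) in general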
Jensen’s inequality:
E(h(X)) ≥ h(E(X))
for a convex function h(x). A function is convex if h''(x) ≥ 0. Note: if h(x) is convex, then −h(x) is
concave.
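A quick simulation check of Jensen's inequality with the convex function h(t) = t² (an illustrative aside; the exponential distribution is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)  # X with mean 2 (arbitrary choice)
h = lambda t: t**2                            # convex, since h''(t) = 2 >= 0

print(np.mean(h(x)), h(np.mean(x)))  # E(h(X)) ≈ 8 >= h(E(X)) ≈ 4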
Taylor series of f(x) around a fixed point x_0:

f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \ldots
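To see the quality of the second order approximation numerically, a short sketch with the illustrative choice f(x) = log(x) around x_0 = 1:

import numpy as np

f   = np.log                 # example function (illustrative choice)
fp  = lambda x: 1.0 / x      # f'(x)
fpp = lambda x: -1.0 / x**2  # f''(x)

x0, x = 1.0, 1.1
taylor2 = f(x0) + fp(x0) * (x - x0) + 0.5 * fpp(x0) * (x - x0)**2
print(taylor2, f(x))  # 0.095 vs ≈ 0.0953: close for x near x0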

Pearson χ² divergence between two discrete distributions P and Q:

D_{\text{Pearson}}(P, Q) = \sum_{i=1}^{d} \frac{(p_i - q_i)^2}{q_i}

where p_i and q_i are the corresponding probabilities. The Pearson χ² divergence is part of the family of
so-called f-divergences, cf. https://en.wikipedia.org/wiki/F-divergence . The KL divergence is also an
f-divergence.
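The connection to the KL divergence (anticipating Questions 4 and 5 below) can be previewed numerically: for P close to Q, the Pearson divergence is approximately twice the KL divergence. A minimal sketch with made-up probabilities:

import numpy as np

def pearson_divergence(p, q):
    # sum_i (p_i - q_i)^2 / q_i
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum((p - q)**2 / q)

def kl_divergence(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

p = np.array([0.26, 0.24, 0.25, 0.25])  # P close to Q (example values)
q = np.array([0.25, 0.25, 0.25, 0.25])
print(pearson_divergence(p, q))   # 0.0008
print(2 * kl_divergence(p, q))    # ≈ 0.0008: nearly equal when P ≈ Q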
Pearson and Neyman test statistics:
Assume two categorical distributions E and O, termed the “expected distribution” and the “observed
distribution”, over d categories. By E_1, \ldots, E_d we denote the corresponding counts for E and by
O_1, \ldots, O_d the counts for O. The total number of counts is \sum_{i=1}^{d} E_i = \sum_{i=1}^{d} O_i = n. The corresponding
frequencies are denoted by e_i = E_i/n and o_i = O_i/n. Then

X^2_{\text{Pearson}} = \sum_{i=1}^{d} \frac{(O_i - E_i)^2}{E_i}

and

X^2_{\text{Neyman}} = \sum_{i=1}^{d} \frac{(O_i - E_i)^2}{O_i}

are test statistics to compare the corresponding distributions E and O. Both statistics are asymptotically
chi-squared distributed, \chi^2_{d-1}, with d − 1 degrees of freedom.
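A short sketch computing both statistics and their asymptotic p-values (illustrative counts; assumes SciPy is available for the χ² tail probability):

import numpy as np
from scipy import stats

O = np.array([18, 30, 52])  # observed counts (made-up data)
E = np.array([25, 25, 50])  # expected counts, same total n = 100

x2_pearson = np.sum((O - E)**2 / E)  # 3.04
x2_neyman  = np.sum((O - E)**2 / O)  # ≈ 3.63

df = len(O) - 1  # d - 1 degrees of freedom
print(x2_pearson, stats.chi2.sf(x2_pearson, df))
print(x2_neyman,  stats.chi2.sf(x2_neyman,  df))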

Questions:
1. Show that the KL divergence cannot be negative. Hint: Use Jensen’s inequality.
2. Show that the KL divergence is invariant under affine coordinate transformations y = a + bx.
3. Can this result be generalised to arbitrary coordinate transformations y = h(x), with h(x) an
invertible function? What is the implication?
4. Find the second order Taylor series expansion of p \log \frac{p}{q}. Specifically, first keep p fixed, and then
keep q fixed.
5. Use this second order Taylor expansion to approximate the KL divergence between P and Q in
terms of the Pearson χ² divergence.
6. Show that approximating 2n D_{\text{KL}}(O, E) and 2n D_{\text{KL}}(E, O) with fixed expected counts yields the
Pearson test statistic, and that approximating both with fixed observed counts yields the Neyman test
statistic. (A numerical sanity check is sketched below the questions.)
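As a sanity check on Questions 5 and 6, a numerical sketch (illustrative counts, not from the worksheet) comparing 2n D_KL with the two test statistics:

import numpy as np

O = np.array([18, 30, 52])  # observed counts (made-up data)
E = np.array([25, 25, 50])  # expected counts, same total
n = O.sum()
o, e = O / n, E / n         # observed and expected frequencies

g_oe = 2 * n * np.sum(o * np.log(o / e))  # 2n D_KL(O, E) ≈ 3.19
g_eo = 2 * n * np.sum(e * np.log(e / o))  # 2n D_KL(E, O) ≈ 3.39

x2_pearson = np.sum((O - E)**2 / E)  # 3.04
x2_neyman  = np.sum((O - E)**2 / O)  # ≈ 3.63

# all four values are close, as Question 6 suggests they should be
print(g_oe, g_eo, x2_pearson, x2_neyman)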
