Professional Documents
Culture Documents
Machine Learning Exponential Family: Dmitrij Schlesinger
Machine Learning Exponential Family: Dmitrij Schlesinger
Exponential Family
Dmitrij Schlesinger
WS2013/2014, 22.11.2013
General form
h i
p(x; θ) = h(x) exp hη(θ), T (x)i − A(θ)
with
– x is a random variable
– θ is a parameter
– η(θ) is a natural parameter, vector (often η(θ) = θ)
– T (x) is a sufficient statistic
– A(θ) is the log-partition function
Maximum Likelihood:
Xh i
hφ(xl , y l ), wi − ln Z(w) → min
w
l
Gradient:
∂ 1 X ∂ ln Z(w)
= φ(xl , y l ) −
∂w |L| l ∂w
Gradient:
∂ 1 X 1 X ∂ ln Z(w, xl )
= φ(xl , y l ) −
∂w |L| l |L| l ∂w
Expectation:
αl (y) = p(y|xl ; w) ∀l, y
Maximization:
XX
αl (y) ln p(x, y; w) → max
w
l y
αl (y)hφ(xl , y), wi −
XX XX
= αl (y) ln Z(w) =
l y l y
∂ ln L
= Edata [φ] − Emodel [φ]
∂w