Input Convex Neural Networks
Introduction
Inference in ICNNs
$$\operatorname*{argmin}_{y} \; f(x, y; \theta)$$
A fully input convex neural network (FICNN) defines $f$ through the recurrence

$$z_{i+1} = g_i\left(W_i^{(z)} z_i + W_i^{(y)} y + b_i\right), \qquad f(y; \theta) = z_k,$$

where $z_i$ denotes the layer activations (with $z_0, W_0^{(z)} \equiv 0$), $\theta = \left\{ W_{0:k-1}^{(y)},\, W_{1:k-1}^{(z)},\, b_{0:k-1} \right\}$ are the parameters, and $g_i$ are non-linear activation functions.
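As a concrete illustration, below is a minimal sketch of this recurrence in PyTorch; the class name, layer sizes, and the choice of softplus for $g_i$ are assumptions for illustration, not the authors' implementation. The non-negativity constraint on the $W^{(z)}$ weights, needed for convexity, is addressed below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FICNN(nn.Module):
    """Sketch of a fully input convex network f(y; theta).

    z_{i+1} = g_i(W_i^{(z)} z_i + W_i^{(y)} y + b_i), with z_0, W_0^{(z)} == 0.
    Softplus is one convex, non-decreasing choice for g_i; sizes are illustrative.
    """

    def __init__(self, y_dim, hidden=(64, 64)):
        super().__init__()
        dims = list(hidden) + [1]  # the last layer produces the scalar f(y)
        # W^{(y)} maps the raw input y into every layer (the bias b_i lives here).
        self.Wy = nn.ModuleList([nn.Linear(y_dim, d) for d in dims])
        # W^{(z)} maps the previous activations; W_0^{(z)} is absent (== 0).
        self.Wz = nn.ModuleList(
            [nn.Linear(dims[i - 1], dims[i], bias=False) for i in range(1, len(dims))]
        )

    def forward(self, y):
        z = F.softplus(self.Wy[0](y))              # z_1 = g_0(W_0^{(y)} y + b_0)
        for Wy, Wz in zip(self.Wy[1:-1], self.Wz[:-1]):
            z = F.softplus(Wz(z) + Wy(y))          # z_{i+1} = g_i(W_i^{(z)} z_i + W_i^{(y)} y + b_i)
        return self.Wz[-1](z) + self.Wy[-1](y)     # f(y; theta) = z_k
```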
Fully Convex Neural Networks

The function $f$ is convex in $y$ provided that all $W_{1:k-1}^{(z)}$ are non-negative, and all functions $g_i$ are convex and non-decreasing.
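During training this condition has to be maintained explicitly. One simple way to do so (a sketch, not necessarily the training procedure used in the paper) is to project the $W^{(z)}$ matrices onto the non-negative orthant after every gradient step:

```python
import torch

def project_convexity(model):
    """Clamp every W^{(z)} matrix of the FICNN sketch above to be >= 0."""
    with torch.no_grad():
        for layer in model.Wz:
            layer.weight.clamp_(min=0.0)

# typical use inside a training loop:
#   loss.backward(); optimizer.step(); project_convexity(model)
```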
Partially Convex Neural Networks
$$\begin{aligned}
u_{i+1} &= \tilde{g}_i\left(\tilde{W}_i u_i + \tilde{b}_i\right) \\
z_{i+1} &= g_i\Big( W_i^{(z)} \big( z_i \circ \big[ W_i^{(zu)} u_i + b_i^{(z)} \big]_+ \big) + W_i^{(y)} \big( y \circ \big( W_i^{(yu)} u_i + b_i^{(y)} \big) \big) + W_i^{(u)} u_i + b_i \Big) \\
f(x, y; \theta) &= z_k, \qquad u_0 = x
\end{aligned}$$
$u_i \in \mathbb{R}^{n_i}$ and $z_i \in \mathbb{R}^{m_i}$ denote the hidden units for the “x-path” and “y-path”, where $y \in \mathbb{R}^p$, and where $\circ$ denotes the Hadamard product, the elementwise product between two vectors. The crucial element here is that, unlike the FICNN, we only need the $W^{(z)}$ terms to be non-negative, and we can introduce arbitrary products between the $u_i$ hidden units and the $z_i$ hidden units.
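To make the layer update concrete, here is a sketch of a single PICNN layer in the same PyTorch style; the module name, sizes, and the choice of ReLU for $g_i$ and $\tilde{g}_i$ are assumptions. Only the $W^{(z)}$ weights would need to be clamped non-negative (as in the FICNN sketch above); the $u$-path and gating weights are unconstrained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PICNNLayer(nn.Module):
    """Sketch of one partially input convex (PICNN) layer update.

    u_{i+1} = g~_i(W~_i u_i + b~_i)
    z_{i+1} = g_i( W^{(z)}(z_i o [W^{(zu)} u_i + b^{(z)}]_+)
                 + W^{(y)}(y  o (W^{(yu)} u_i + b^{(y)}))
                 + W^{(u)} u_i + b_i )
    """

    def __init__(self, u_in, u_out, z_in, z_out, y_dim):
        super().__init__()
        self.W_u_tilde = nn.Linear(u_in, u_out)           # W~_i, b~_i
        self.Wz  = nn.Linear(z_in, z_out, bias=False)     # W^{(z)}: clamp >= 0 for convexity
        self.Wzu = nn.Linear(u_in, z_in)                  # W^{(zu)}, b^{(z)}
        self.Wy  = nn.Linear(y_dim, z_out, bias=False)    # W^{(y)}
        self.Wyu = nn.Linear(u_in, y_dim)                 # W^{(yu)}, b^{(y)}
        self.Wu  = nn.Linear(u_in, z_out)                 # W^{(u)}, b_i

    def forward(self, u, z, y):
        u_next = F.relu(self.W_u_tilde(u))                # u_{i+1}
        z_next = F.relu(
            self.Wz(z * F.relu(self.Wzu(u)))              # z_i o [W^{(zu)} u_i + b^{(z)}]_+
            + self.Wy(y * self.Wyu(u))                    # y o (W^{(yu)} u_i + b^{(y)})
            + self.Wu(u)                                  # W^{(u)} u_i + b_i
        )
        return u_next, z_next
```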
Inference then requires solving

$$\operatorname*{minimize}_{y \in \mathcal{Y}} \; f(x, y; \theta),$$

which is a convex optimization problem whenever $\mathcal{Y}$ is a convex set.
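Because $f$ is convex in $y$, any convex solver applies; the paper develops more specialized inference procedures, but as a simple, hedged illustration, the sketch below minimizes $f$ over a box-shaped $\mathcal{Y}$ with projected gradient descent (the solver choice and the box constraint are assumptions, and the sketches above supply $f$).

```python
import torch

def infer_y(f, y_init, lo=-1.0, hi=1.0, steps=200, lr=0.05):
    """Minimize a convex f(y) over the box Y = [lo, hi]^p by projected gradient descent.

    `f` could be, e.g., `lambda y: ficnn(y)` or `lambda y: picnn_forward(x, y)`
    built from the (hypothetical) sketches above.
    """
    y = y_init.clone().requires_grad_(True)
    opt = torch.optim.SGD([y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        f(y).sum().backward()        # gradient of f with respect to y only
        opt.step()
        with torch.no_grad():
            y.clamp_(lo, hi)         # project back onto Y
    return y.detach()
```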