
Input Convex Neural Networks

April 19, 2023


Table of Contents

Introduction

Convex Neural Network Architecture


Fully Convex Neural Networks (FICNN)
Partially Convex Neural Networks (PICNN)

Inference in ICNNs
Introduction

Input convex neural networks (ICNNs) are scalar-valued neural networks f(x, y; θ), where x and y denote the inputs to the function and θ denotes the parameters.
The network is built in such a way that it is convex in (a subset of) the inputs y.
f needs only be convex in y, or in a subset of y, and may still be non-convex in the remaining inputs x. Because of this, training these networks is still a non-convex problem; the convexity, however, helps at inference time.
Introduction

The benefit of ICNNs is that it is possible to optimize the scalar-valued function f(x, y; θ) over the convex inputs y (or a subset of y), given the other input x and possibly some fixed elements of y.
That is, it is possible to solve the problem

$$\underset{y}{\arg\min}\, f(x, y; \theta)$$

globally and efficiently as the problem is convex.


Fully Convex Neural Networks

 
$$z_{i+1} = g_i\left(W_i^{(z)} z_i + W_i^{(y)} y + b_i\right), \qquad f(y; \theta) = z_k$$

where $z_i$ denotes the layer activations (with $z_0, W_0^{(z)} \equiv 0$), $\theta = \left\{ W_{0:k-1}^{(y)}, W_{1:k-1}^{(z)}, b_{0:k-1} \right\}$ are the parameters, and $g_i$ are non-linear activation functions.
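As a concrete illustration, here is a minimal PyTorch sketch of this recursion. The class name, layer widths, and the choice of ReLU activations are assumptions made for this example, not part of the original formulation; enforcing the non-negativity of the $W^{(z)}$ weights is shown after the convexity condition on the next slide.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FICNN(nn.Module):
    """Sketch of a fully input convex network f(y; theta).

    z_{i+1} = g_i(W_i^(z) z_i + W_i^(y) y + b_i),  f(y; theta) = z_k,
    with z_0, W_0^(z) == 0, so the first layer sees only y.
    """

    def __init__(self, dim_y, hidden=64, num_layers=3):
        super().__init__()
        # W^(y): maps the input y into every layer; no sign constraint needed.
        # The biases b_i are carried by these layers.
        self.Wy = nn.ModuleList(
            [nn.Linear(dim_y, hidden) for _ in range(num_layers - 1)]
            + [nn.Linear(dim_y, 1)]
        )
        # W^(z): propagates the previous activations; must stay non-negative
        # for convexity in y (see the next slide).
        self.Wz = nn.ModuleList(
            [nn.Linear(hidden, hidden, bias=False) for _ in range(num_layers - 2)]
            + [nn.Linear(hidden, 1, bias=False)]
        )

    def forward(self, y):
        # First layer: z_1 = g_0(W_0^(y) y + b_0), since W_0^(z) == 0.
        z = F.relu(self.Wy[0](y))
        for Wy_i, Wz_i in zip(self.Wy[1:-1], self.Wz[:-1]):
            # g_i is convex and non-decreasing (ReLU here).
            z = F.relu(Wz_i(z) + Wy_i(y))
        # Last layer is affine and produces the scalar f(y; theta) = z_k.
        return self.Wz[-1](z) + self.Wy[-1](y)
```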
Fully Convex Neural Networks

The function f is convex in y provided that all $W_{1:k-1}^{(z)}$ are non-negative, and all functions $g_i$ are convex and non-decreasing.
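In practice the non-negativity condition is typically enforced during training, for example by projecting the $W^{(z)}$ weights back onto the non-negative orthant after every optimizer step. The sketch below assumes the hypothetical FICNN module above; the clamping approach, the random data, and the optimizer settings are all illustrative (reparametrizing the weights, e.g. through a softplus, is an equally valid alternative).

```python
import torch

def clamp_wz_nonnegative(model):
    """Project every W^(z) weight matrix onto the non-negative orthant."""
    with torch.no_grad():
        for layer in model.Wz:
            layer.weight.clamp_(min=0.0)

# Toy (non-convex) training loop: the projection runs after every update,
# so f stays convex in y throughout training.
model = FICNN(dim_y=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
y = torch.randn(128, 2)
target = torch.randn(128, 1)

for _ in range(100):
    loss = torch.mean((model(y) - target) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()
    clamp_wz_nonnegative(model)  # keep all W^(z) >= 0
```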
Partially Convex Neural Networks

 
$$u_{i+1} = \tilde{g}_i\left(\tilde{W}_i u_i + \tilde{b}_i\right)$$

$$z_{i+1} = g_i\left(W_i^{(z)}\left(z_i \circ \left[W_i^{(zu)} u_i + b_i^{(z)}\right]_+\right) + W_i^{(y)}\left(y \circ \left(W_i^{(yu)} u_i + b_i^{(y)}\right)\right) + W_i^{(u)} u_i + b_i\right)$$

$$f(x, y; \theta) = z_k, \qquad u_0 = x$$
Partially Convex Neural Networks

$u_i \in \mathbb{R}^{n_i}$ and $z_i \in \mathbb{R}^{m_i}$ denote the hidden units for the "x-path" and "y-path", where $y \in \mathbb{R}^p$, and where $\circ$ denotes the Hadamard product, the elementwise product between two vectors. The crucial element here is that, unlike the FICNN, we only need the $W^{(z)}$ terms to be non-negative, and we can introduce arbitrary products between the $u_i$ hidden units and the $z_i$ hidden units.
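To make the update concrete, here is a sketch of a single PICNN block in the same PyTorch style; the module name, the dimensions, and the use of ReLU for $\tilde{g}_i$, $g_i$, and the $[\cdot]_+$ clipping are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PICNNBlock(nn.Module):
    """One PICNN layer: maps (u_i, z_i, y) to (u_{i+1}, z_{i+1}).

    Only the W^(z) weights must be non-negative for convexity in y; every
    weight touching the x-path u_i may be unconstrained.
    """

    def __init__(self, dim_u, dim_z, dim_y, dim_u_next, dim_z_next):
        super().__init__()
        self.Wu_tilde = nn.Linear(dim_u, dim_u_next)        # x-path update
        self.Wzu = nn.Linear(dim_u, dim_z)                  # gate on z_i (bias b^(z))
        self.Wz = nn.Linear(dim_z, dim_z_next, bias=False)  # must stay >= 0
        self.Wyu = nn.Linear(dim_u, dim_y)                  # gate on y (bias b^(y))
        self.Wy = nn.Linear(dim_y, dim_z_next, bias=False)
        self.Wu = nn.Linear(dim_u, dim_z_next)              # W^(u) u_i + b_i

    def forward(self, u, z, y):
        u_next = F.relu(self.Wu_tilde(u))                   # u_{i+1}
        z_next = F.relu(
            self.Wz(z * F.relu(self.Wzu(u)))                # W^(z)(z_i o [W^(zu)u_i + b^(z)]_+)
            + self.Wy(y * self.Wyu(u))                      # W^(y)(y o (W^(yu)u_i + b^(y)))
            + self.Wu(u)                                    # W^(u)u_i + b_i
        )
        return u_next, z_next
```

Stacking k such blocks, starting from u_0 = x (and, as in the FICNN, dropping the z-dependent term in the first block) and ending with a single output unit, yields the scalar f(x, y; θ) = z_k.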
Partially Convex Neural Networks

A PICNN with k layers can represent any FICNN with k layers and any purely feedforward network with k layers.
Inference in ICNNs

$$\underset{y \in \mathcal{Y}}{\text{minimize}}\; f(x, y; \theta)$$
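Because f is convex in y, this problem can be solved globally with standard convex-optimization methods. Below is a sketch using simple projected gradient descent with PyTorch autograd; the function name, the box constraint standing in for Y, and the step-size/iteration settings are illustrative assumptions rather than the method proposed in any particular paper.

```python
import torch

def minimize_over_y(f_of_y, y_init, lo=-1.0, hi=1.0, steps=500, lr=0.05):
    """Projected gradient descent on a convex objective y -> f_of_y(y).

    Y is taken to be the box [lo, hi]^p purely for illustration; because the
    objective is convex in y, the iterates converge to a global minimizer.
    """
    y = y_init.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([y], lr=lr)
    for _ in range(steps):
        loss = f_of_y(y).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            y.clamp_(lo, hi)  # project back onto the feasible set Y
    return y.detach()

# Example with the FICNN sketch above (f there depends only on y);
# for a PICNN one would fix x and pass lambda y: picnn(x, y).
# y_star = minimize_over_y(model, torch.zeros(1, 2))
```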
