Restricted Boltzmann Machines
The Boltzmann machine considered so far was fully observed. We now add
hidden units as well.
A classic architecture called the restricted Boltzmann machine assumes
a bipartite graph over the visible units and hidden units:
A bipartite graph (or bigraph) is a graph whose vertices can be divided
into two disjoint and independent sets U and V, that is, every edge
connects a vertex in U to one in V.
A complete bipartite graph or biclique is a special kind of bipartite
graph where every vertex of the first set is connected to every vertex of
the second set.
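To make the two definitions concrete, here is a minimal sketch in Python that builds the edge set of a complete bipartite graph and checks the bipartite property. The function names and the vertex labels `v1`, `h1`, ... are hypothetical, chosen only for illustration.

```python
def complete_bipartite_edges(m, n):
    """Return the two vertex sets and all edges of a biclique with m + n vertices."""
    visible = [f"v{i}" for i in range(1, m + 1)]   # first vertex set (illustrative labels)
    hidden = [f"h{j}" for j in range(1, n + 1)]    # second vertex set (illustrative labels)
    edges = [(v, h) for v in visible for h in hidden]
    return visible, hidden, edges


def is_bipartite(edges, U, V):
    """Check that U and V are disjoint and every edge joins a vertex in U to one in V."""
    U, V = set(U), set(V)
    return U.isdisjoint(V) and all(
        (x in U and y in V) or (x in V and y in U) for x, y in edges
    )


visible, hidden, edges = complete_bipartite_edges(3, 2)
```

With 3 vertices in the first set and 2 in the second, the biclique has 3 × 2 = 6 edges. Adding an edge inside one set, such as `("v1", "v2")`, breaks the bipartite property.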
The hidden units learn more abstract features of the data.
An RBM has binary-valued hidden and visible units, and consists of a matrix of weights $w$ of
size $m \times n$, where $m$ is the number of visible units and $n$ is the number of hidden units.
Each weight element $w_{i,j}$ of the matrix is associated with the connection between the
visible (input) unit $v_i$ and the hidden unit $h_j$.
There are bias weights (offsets) $a_i$ for $v_i$ and $b_j$ for $h_j$.
Given the weights and biases, the energy of a configuration (pair of boolean vectors) $(v, h)$
is defined as
$$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i w_{i,j} h_j$$
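The energy function can be evaluated directly from its three sums. The following is a minimal sketch in plain Python; the function name `energy` and the toy parameter values are assumptions made for illustration, not values from the source.

```python
def energy(v, h, W, a, b):
    """Energy E(v, h) of a binary RBM configuration.

    v: m visible states (0 or 1), h: n hidden states (0 or 1)
    W: m x n weights as a list of rows, W[i][j] = w_{i,j}
    a: m visible biases, b: n hidden biases
    """
    visible_term = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden_term = sum(b_j * h_j for b_j, h_j in zip(b, h))
    interaction = sum(
        v[i] * W[i][j] * h[j]
        for i in range(len(v))
        for j in range(len(h))
    )
    return -visible_term - hidden_term - interaction


# Hypothetical toy parameters: m = 2 visible units, n = 2 hidden units.
W = [[1.0, -1.0],
     [0.5, 2.0]]
a = [0.5, -0.5]
b = [1.0, 0.0]

e = energy([1, 0], [1, 1], W, a, b)  # -0.5 - 1.0 - (1.0 - 1.0) = -1.5
```

Only pairs with both $v_i = 1$ and $h_j = 1$ contribute to the interaction term, so the all-zero configuration always has energy 0.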
The joint probability distribution for the visible and hidden vectors is defined in terms of
the energy function as follows:
$$P(v, h) = \frac{1}{Z} e^{-E(v, h)}$$
where $Z$ is the partition function, defined as the sum of $e^{-E(v, h)}$ over all possible
configurations, which can be interpreted as a normalizing constant to ensure that the
probabilities sum to 1.
The marginal probability of a visible vector is the sum of $P(v, h)$ over all possible hidden
layer configurations,
$$P(v) = \frac{1}{Z} \sum_{h} e^{-E(v, h)},$$
and vice versa.
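For a very small RBM, the partition function, the joint distribution, and the marginal over visible vectors can all be computed by brute-force enumeration. The sketch below assumes hypothetical toy parameters and helper names; enumeration costs $2^{m+n}$ energy evaluations, so it is only feasible for a handful of units.

```python
import itertools
import math


def energy(v, h, W, a, b):
    """E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_i sum_j v_i w_ij h_j."""
    return -(
        sum(a[i] * v[i] for i in range(len(v)))
        + sum(b[j] * h[j] for j in range(len(h)))
        + sum(v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h)))
    )


def all_binary_vectors(k):
    """All 2**k binary vectors of length k."""
    return list(itertools.product([0, 1], repeat=k))


def partition_function(W, a, b):
    """Z: sum of exp(-E(v, h)) over every (v, h) configuration."""
    return sum(
        math.exp(-energy(v, h, W, a, b))
        for v in all_binary_vectors(len(a))
        for h in all_binary_vectors(len(b))
    )


def joint(v, h, W, a, b, Z):
    """P(v, h) = exp(-E(v, h)) / Z."""
    return math.exp(-energy(v, h, W, a, b)) / Z


def marginal_visible(v, W, a, b, Z):
    """P(v): sum of P(v, h) over all hidden configurations h."""
    return sum(joint(v, h, W, a, b, Z) for h in all_binary_vectors(len(b)))


# Hypothetical toy parameters: m = 2 visible units, n = 2 hidden units.
W = [[1.0, -1.0],
     [0.5, 2.0]]
a = [0.5, -0.5]
b = [1.0, 0.0]

Z = partition_function(W, a, b)
total = sum(marginal_visible(v, W, a, b, Z) for v in all_binary_vectors(len(a)))
```

Here `total` should equal 1 up to floating-point error, which is exactly the normalizing role of $Z$.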
Because the graph is bipartite, with no visible-visible or hidden-hidden connections, the
hidden unit activations are mutually independent given the visible unit activations and
vice versa.
For $m$ visible units and $n$ hidden units, the conditional probability of a configuration of the
visible units $v$, given a configuration of the hidden units $h$, is
$$P(v \mid h) = \prod_{i=1}^{m} P(v_i \mid h).$$
Conversely, the conditional probability of $h$ given $v$ is
$$P(h \mid v) = \prod_{j=1}^{n} P(h_j \mid v).$$
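The factorization can be checked numerically on a small example. The sketch below compares $P(h \mid v)$ computed by brute force from the energy function (the partition function cancels in the ratio) against the product of per-unit conditionals. It assumes the standard closed form $P(h_j = 1 \mid v) = \sigma(b_j + \sum_i v_i w_{i,j})$ with $\sigma$ the logistic sigmoid, which follows from the energy function but is not stated in the text above. Function names and parameter values are hypothetical.

```python
import itertools
import math


def energy(v, h, W, a, b):
    """E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_i sum_j v_i w_ij h_j."""
    return -(
        sum(a[i] * v[i] for i in range(len(v)))
        + sum(b[j] * h[j] for j in range(len(h)))
        + sum(v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h)))
    )


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def p_h_given_v_bruteforce(h, v, W, a, b):
    """P(h | v) = exp(-E(v, h)) / sum over h' of exp(-E(v, h')); Z cancels."""
    numerator = math.exp(-energy(v, h, W, a, b))
    denominator = sum(
        math.exp(-energy(v, h_alt, W, a, b))
        for h_alt in itertools.product([0, 1], repeat=len(b))
    )
    return numerator / denominator


def p_h_given_v_factorized(h, v, W, b):
    """Product over j of P(h_j | v), using the assumed sigmoid closed form."""
    prob = 1.0
    for j, h_j in enumerate(h):
        p_on = sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        prob *= p_on if h_j == 1 else 1.0 - p_on
    return prob


# Hypothetical toy parameters: m = 2 visible units, n = 2 hidden units.
W = [[1.0, -1.0],
     [0.5, 2.0]]
a = [0.5, -0.5]
b = [1.0, 0.0]

# Largest disagreement between the two computations over every (v, h) pair.
max_gap = max(
    abs(p_h_given_v_bruteforce(h, v, W, a, b) - p_h_given_v_factorized(h, v, W, b))
    for v in itertools.product([0, 1], repeat=2)
    for h in itertools.product([0, 1], repeat=2)
)
```

The two computations should agree up to floating-point error. The same check works for $P(v \mid h)$ by swapping the roles of the visible and hidden units and using the biases $a_i$.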