The Free Energy Approximation Equation in Restricted Boltzmann Machines

Restricted Boltzmann Machines and Free Energy

Restricted Boltzmann Machines (RBMs) are a type of generative stochastic neural network. They are used for learning probability distributions over a set of variables, and have found applications in areas like collaborative filtering, image recognition, and natural language processing.

A key concept in RBMs is the **free energy**. It is a function of the visible units alone, obtained by summing the hidden units out of the RBM’s energy function, and it plays a crucial role in understanding the model’s behavior.

Free Energy Approximation Equation

Equation

The free energy of a configuration of visible units v, denoted F(v), is given by:

F(v) = -log ∑h exp(-E(v,h))

where:

  • v: Vector representing the visible units.
  • h: Vector representing the hidden units.
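  • E(v,h): Energy function of the RBM. For a binary RBM with weight matrix W, visible biases b, and hidden biases c, E(v,h) = -bᵀv - cᵀh - hᵀWv.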
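For a small model, the definition can be evaluated literally by enumerating every hidden configuration. The sketch below assumes a binary RBM with energy E(v,h) = -b·v - c·h - h·(Wv) and a weight matrix W of shape (n_hidden, n_visible). Those shapes and the helper names (`energy`, `free_energy_bruteforce`, `free_energy_closed_form`) are illustrative assumptions, not taken from any library. It also compares the result with the closed form that arises when the sum factorizes over binary hidden units.

```python
import itertools
import numpy as np

def energy(v, h, W, b, c):
    """Energy of a binary RBM; W is assumed to have shape (n_hidden, n_visible)."""
    return -np.dot(b, v) - np.dot(c, h) - np.dot(h, W @ v)

def free_energy_bruteforce(v, W, b, c):
    """F(v) = -log sum_h exp(-E(v, h)), summing over all 2**n_hidden states."""
    n_hidden = W.shape[0]
    neg_energies = np.array([
        -energy(v, np.array(h), W, b, c)
        for h in itertools.product([0, 1], repeat=n_hidden)
    ])
    # logaddexp.reduce is a numerically stable log-sum-exp.
    return -np.logaddexp.reduce(neg_energies)

def free_energy_closed_form(v, W, b, c):
    """Same quantity via the factorized sum: -b.v - sum_j log(1 + exp(c_j + W_j.v))."""
    return -np.dot(b, v) - np.sum(np.logaddexp(0.0, c + W @ v))

# Toy parameters, chosen at random purely for illustration.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))
b = rng.normal(size=6)
c = rng.normal(size=4)
v = rng.integers(0, 2, size=6)

print(free_energy_bruteforce(v, W, b, c))
print(free_energy_closed_form(v, W, b, c))
```

The two printed values should agree up to floating-point error. The brute-force version is only feasible for a handful of hidden units, since the number of terms doubles with each unit added.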

Interpretation

The free energy is the negative logarithm of the unnormalized probability of v. Summing exp(-E(v,h)) over all hidden configurations marginalizes out h, so p(v) = exp(-F(v)) / Z, where the partition function Z sums exp(-E(v,h)) over all visible and hidden configurations. Lower free energy therefore means higher probability.

Equivalently, F(v) is the expected energy under the conditional distribution p(h | v) minus the entropy of that distribution, which is where the name “free energy” comes from.
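Two relations make this interpretation concrete: p(v) = exp(-F(v)) / Z, and F(v) equals the expected energy under p(h | v) minus the entropy of p(h | v). The following sketch checks both numerically on a tiny binary RBM by brute-force enumeration. The model sizes, the random parameters, and the helper names are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def energy(v, h, W, b, c):
    """E(v, h) = -b.v - c.h - h.(W v), with W of shape (n_hidden, n_visible)."""
    return -np.dot(b, v) - np.dot(c, h) - np.dot(h, W @ v)

def all_binary(n):
    """All 2**n binary vectors of length n."""
    return [np.array(s) for s in itertools.product([0, 1], repeat=n)]

rng = np.random.default_rng(1)
n_visible, n_hidden = 3, 2
W = rng.normal(size=(n_hidden, n_visible))
b = rng.normal(size=n_visible)
c = rng.normal(size=n_hidden)

hiddens = all_binary(n_hidden)
visibles = all_binary(n_visible)

def free_energy(v):
    """F(v) = -log sum_h exp(-E(v, h)) by direct enumeration."""
    return -np.logaddexp.reduce(np.array([-energy(v, h, W, b, c) for h in hiddens]))

# Partition function: log of the sum over every joint configuration (v, h).
log_Z = np.logaddexp.reduce(
    np.array([-energy(v, h, W, b, c) for v in visibles for h in hiddens])
)

# p(v) = exp(-F(v)) / Z should sum to one over all visible configurations.
p_v = np.array([np.exp(-free_energy(v) - log_Z) for v in visibles])
print(p_v.sum())

# F(v) = E_{p(h|v)}[E(v, h)] - H(p(h|v)) for one visible vector.
v = visibles[5]
energies = np.array([energy(v, h, W, b, c) for h in hiddens])
p_h_given_v = np.exp(-energies + free_energy(v))  # exp(-E) / exp(-F)
expected_energy = np.sum(p_h_given_v * energies)
entropy = -np.sum(p_h_given_v * np.log(p_h_given_v))
print(free_energy(v), expected_energy - entropy)
```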

Approximation

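Evaluating F(v) directly from its definition means summing over all 2^n configurations of n hidden units, a cost that grows exponentially with n. (For a standard binary RBM the sum factorizes over hidden units, giving the closed form F(v) = -bᵀv - ∑j log(1 + exp(cj + Wj·v)), so an approximation is mainly of interest as an illustration or for energy functions that do not factorize.) One simple **approximation** keeps only the largest terms of the sum: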

F(v) ≈ -log( exp(-E(v,h*)) + ∑j exp(-E(v,h*j)) )

where:

  • h*: The most likely configuration of hidden units given the visible units, obtained by applying the sigmoid function to each hidden unit’s activation and setting the unit to 1 where the resulting probability exceeds 0.5.
  • h*j: The configuration obtained from h* by flipping only hidden unit j; the sum runs over all hidden units j.

Code Example (Python)

import numpy as np

def free_energy(v, W, b, c):
  """
  Calculates the approximate free energy.

  Args:
    v: Vector of visible units, shape (n_visible,).
    W: Weight matrix, shape (n_hidden, n_visible).
    b: Visible bias vector, shape (n_visible,).
    c: Hidden bias vector, shape (n_hidden,).

  Returns:
    Approximate free energy.
  """

  act = np.dot(W, v) + c  # Hidden activations
  h_probs = sigmoid(act)  # Probability of each hidden unit being on
  h_star = (h_probs > 0.5).astype(int)  # Most likely hidden state

  # Energy of the most likely hidden state:
  # E(v, h) = -b.v - c.h - h.(W v) = -b.v - h.act
  energy_star = -np.dot(v, b) - np.dot(h_star, act)

  # Energies of the n_hidden states that differ from h_star in one unit.
  # Flipping unit j from 1 to 0 adds act[j]; from 0 to 1 subtracts act[j].
  energy_flipped = energy_star + (2 * h_star - 1) * act

  # Approximate free energy: stable log-sum-exp over the retained states
  neg_energies = -np.concatenate(([energy_star], energy_flipped))
  approx_free_energy = -np.logaddexp.reduce(neg_energies)
  return approx_free_energy

def sigmoid(x):
  return 1 / (1 + np.exp(-x))
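
To get a feel for how tight the approximation is, it can be compared against the exact value. The sketch below is self-contained and states its own assumptions: W has shape (n_hidden, n_visible), the energy is E(v,h) = -b·v - c·h - h·(Wv), and the exact value comes from the closed form that holds for binary hidden units. The function names and toy parameters are illustrative.

```python
import numpy as np

def exact_free_energy(v, W, b, c):
    """Closed form for binary hidden units: -b.v - sum_j log(1 + exp(a_j))."""
    a = c + W @ v
    return -np.dot(b, v) - np.sum(np.logaddexp(0.0, a))

def approx_free_energy(v, W, b, c):
    """Keeps only h* and its single-unit flips in the sum over hidden states."""
    a = c + W @ v                     # hidden activations
    h_star = (a > 0).astype(float)    # sigmoid(a) > 0.5 exactly when a > 0
    energy_star = -np.dot(b, v) - np.dot(h_star, a)
    # Flipping unit j away from its most likely state raises the energy by |a_j|.
    energy_flipped = energy_star + np.abs(a)
    neg_energies = -np.concatenate(([energy_star], energy_flipped))
    return -np.logaddexp.reduce(neg_energies)

# Toy parameters, chosen at random purely for illustration.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 8))
b = rng.normal(size=8)
c = rng.normal(size=5)
v = rng.integers(0, 2, size=8).astype(float)

exact = exact_free_energy(v, W, b, c)
approx = approx_free_energy(v, W, b, c)
print(f"exact  F(v) = {exact:.4f}")
print(f"approx F(v) = {approx:.4f}")
print(f"gap         = {approx - exact:.4f}")
```

Because the approximation drops non-negative terms from the sum inside the logarithm, it can only overestimate F(v). The gap shrinks as the hidden activations grow in magnitude and the conditional distribution concentrates on h*.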

Uses of Free Energy in RBMs

The free energy is essential for various aspects of RBM training and inference:

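  • **Training:** The contrastive divergence (CD) algorithm, a popular method for training RBMs, can be written in terms of the free energy. The gradient of the negative log-likelihood is the gradient of the free energy at the data minus its expected gradient under the model; CD approximates the model term with samples obtained from a few steps of Gibbs sampling started at the data.
  • **Inference:** The probability of a visible configuration is p(v) = exp(-F(v)) / Z, so the free energy gives this probability up to the normalizing constant Z. Computing Z requires a sum over all visible configurations and is generally intractable, but differences in free energy are enough to compare how likely two configurations are.
  • **Generative Modeling:** RBMs can generate new data by sampling from the model’s distribution p(v) ∝ exp(-F(v)), typically with Gibbs sampling that alternates between the hidden and visible units.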
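
The training bullet can be made concrete with a single CD-1 update. This is a minimal sketch, assuming binary units, W of shape (n_hidden, n_visible), and the closed-form free energy F(v) = -b·v - ∑j log(1 + exp(cj + Wj·v)). The function names, learning rate, and toy data vector are illustrative assumptions. The update is the difference between the free-energy gradient at the data vector and at a one-step Gibbs reconstruction.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def free_energy(v, W, b, c):
    """Closed-form free energy of a binary RBM."""
    return -np.dot(b, v) - np.sum(np.logaddexp(0.0, c + W @ v))

def cd1_step(v0, W, b, c, rng, lr=0.1):
    """One CD-1 parameter update from a single data vector v0."""
    # Positive phase: hidden probabilities and a sample given the data.
    ph0 = sigmoid(c + W @ v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to the visible layer.
    pv1 = sigmoid(b + W.T @ h0)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(c + W @ v1)
    # dF/dW = -outer(sigmoid(c + W v), v), dF/db = -v, dF/dc = -sigmoid(c + W v).
    # Descend F at the data, ascend F at the reconstruction.
    W_new = W + lr * (np.outer(ph0, v0) - np.outer(ph1, v1))
    b_new = b + lr * (v0 - v1)
    c_new = c + lr * (ph0 - ph1)
    return W_new, b_new, c_new, v1

# Toy setup: repeatedly present one hypothetical data vector.
rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = 0.01 * rng.normal(size=(n_hidden, n_visible))
b = np.zeros(n_visible)
c = np.zeros(n_hidden)
v_data = np.array([1, 1, 0, 0, 1, 0], dtype=float)

f_before = free_energy(v_data, W, b, c)
for _ in range(200):
    W, b, c, v_model = cd1_step(v_data, W, b, c, rng)
f_after = free_energy(v_data, W, b, c)
print("F(data) before:", f_before, "after:", f_after)
```

Comparing the free energy of training data with that of held-out data or reconstructions over the course of training is one common way to monitor an RBM, since the partition function cancels in such differences.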

Conclusion

The free energy plays a fundamental role in restricted Boltzmann machines. While evaluating its defining sum term by term is expensive, closed-form expressions for binary units and approximations that keep only the dominant terms make it practical to train and use RBMs for various applications.

