Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles

Overview

This repository contains the code for the experiments in the above study that use natural gradient descent to train restricted Boltzmann machines (RBMs).

This code is available under the BSD License.

Setup instructions

The code depends on the clap package. The clap package should be
built and ready to use before building this code.

The main igo-code/ directory should be placed alongside the clap/
package directory or the Makefile CLAP_DIRECTORY variable should
be set to the location of the clap package directory.
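
As a minimal sketch of the second option, the snippet below prints the make
invocation for a given clap location instead of running it. It relies on
standard make behaviour, where a variable assigned on the command line
normally overrides the Makefile's own assignment. The helper name
clap_make_command and the path $HOME/src/clap are hypothetical; ../clap
simply mirrors the side-by-side layout described above.

```shell
#!/bin/sh
# Print the make command for a given clap location without running it.
# The default "../clap" matches the side-by-side layout; any other path
# passed as an argument is a hypothetical example.
clap_make_command() {
    clap_dir="${1:-../clap}"
    printf 'make CLAP_DIRECTORY="%s"\n' "$clap_dir"
}

clap_make_command                    # clap/ next to igo-code/
clap_make_command "$HOME/src/clap"   # clap checked out elsewhere (assumed path)
```

Run the printed command from the igo-code/ directory once the path looks right.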

Building

The main executable “optim-rbms” can then be built using the following
command: $ make

Example runs can be found in the “example-runs.sh” script and can be
run using: $ ./example-runs.sh

The results of running experiments are placed in the data/ directory
and can be plotted using the scripts in the plotresults directory.
Check the plotresults/README file for more information on plotting
results.

Running experiments

Experiments can be run using the “optim-rbms” executable. It accepts many optional parameters, each of which overrides the corresponding default value. For example:

./optim-rbms --learningRate=0.1 --numGradientSamples=10000 --nbSteps=1 --natural=1 --numSelection=2000 --nbRuns=5

The possible parameters and their default values are given below:

int nbRuns = 10; // total number of runs on which to gather statistics.
int nbSteps = 100; // maximum number of steps for one run.
int vDim = 40; // number of RBM visible units, i.e. the input dimension.
int hDim = 1; // number of RBM hidden units.
int markovSamples = 50; // number of Gibbs iterations to sample in the RBM distribution.
int numFisherSamples = 10000; // number of samples to use for Fisher matrix estimation.
int numGradientSamples = 10000; // number of gradient samples to use.
double learningRate = 0.25; // learning rate a.k.a. step size deltat.
// double momentum = 0; // deprecated.
// double lambda = 0; // deprecated.
bool natural = true; // set to true to use the natural gradient, to false to use the vanilla gradient.
bool keepSingleRuns = true; // if set to false, only statistics are recorded.
int numSelection = 2000; // number of gradient samples.
double initialNoiseDev = .01 / vDim; // initial standard deviation for a small perturbation of the RBM initialization.
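
To compare the two gradient types under otherwise identical settings,
something like the following dry-run sketch may help. It only prints the
commands so they can be inspected first; pipe the output to sh to actually
run them. The helper name igo_command and the chosen values are illustrative.
The sketch assumes that options are passed as --name=value and that
--natural=0 selects the vanilla gradient, which the parameter list suggests
but does not show explicitly.

```shell
#!/bin/sh
# Dry-run sketch: print one optim-rbms command per gradient type.
# Assumption: --natural=0 selects the vanilla gradient; only the
# natural=1 form appears in the example run above.
igo_command() {
    natural="$1"
    printf './optim-rbms --learningRate=0.25 --nbSteps=100 --nbRuns=5 --natural=%s\n' "$natural"
}

# Natural gradient first, then vanilla gradient, with identical settings.
for natural in 1 0; do
    igo_command "$natural"
done
```

Keeping every other parameter fixed makes the gradient type the only difference between the two sets of results written to the data/ directory.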

Files