Recently I try to implement RBM based autoencoder in tensorflow similar to RBMs described in Semantic Hashing paper by Ruslan Salakhutdinov and Geoffrey Hinton. It seems that with weights that were pre-trained with RBM autoencoders should converge faster. So I've decided to check this.
This post will describe some roadblocks for RBMs/autoencoders implementation in tensorflow and compare results of different approaches. I assume reader's previous knowledge of tensorflow and machine learning field. All code can be found in this repo
RBMs different from usual neural networks in some ways:
Neural networks usually perform weight update by Gradient Descent, but RMBs use Contrastive Divergence (which is basically a funky term for "approximate gradient descent" link to read). At a glance, contrastive divergence computes a difference between positive phase (energy of first encoding) and negative phase (energy of the last encoding).