On the effective initialisation for restricted Boltzmann machines via duality with Hopfield model

Linda Albanese;Adriano Barra
2021-01-01

Abstract

Restricted Boltzmann machines (RBMs) with a binary visible layer of size N and a Gaussian hidden layer of size P have been proved to be equivalent to a Hopfield neural network (HNN) made of N binary neurons and storing P patterns ξ, as long as the weights w in the former are identified with the patterns. Here we leverage this equivalence to find effective initialisations for the weights of the RBM when only a set of noisy examples of each pattern is available, with the aim of transferring the statistical mechanics background available for the HNN to the study of the RBM's learning and retrieval abilities. In particular, given a set of definite, structureless patterns, we build a sample of blurred examples and prove that the initialisation where w equals the empirical average of ξ over the sample is a fixed point under stochastic gradient descent. Further, as a toy application of the duality between HNN and RBM, we consider the simplest random auto-encoder (a three-layer network made of two RBMs coupled by their hidden layer) and show that, as long as the parameter setting corresponds to the retrieval region of the dual HNN, reconstruction and denoising can be accomplished trivially, whereas inference algorithms are necessary when the system is in the spin-glass phase. This motivates the search for larger retrieval regions, which we obtain by applying a Gram-Schmidt orthogonalisation to the patterns: this procedure yields a set of patterns devoid of correlations, for which the largest retrieval region can be achieved. Finally, we consider an application of the duality in a structured case: we test this approach on the MNIST dataset and find that the network already achieves about 67% successful classifications, suggesting that it can be exploited as a computationally cheap pre-training.
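
The empirical-average initialisation and the orthogonalisation step mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the noise model (each entry of an example keeps the sign of the pattern with probability (1 + r)/2), the parameter names N, P, M and r, the helper names, and the use of a QR factorisation as a numerically stable stand-in for Gram-Schmidt.

```python
import numpy as np


def make_noisy_examples(patterns, n_examples, r, rng):
    """Blur each pattern (assumed noise model, not the paper's definition).

    Each entry of an example keeps the sign of the underlying pattern
    with probability (1 + r) / 2 and is flipped otherwise.
    Returns an array of shape (P, n_examples, N).
    """
    n_patterns, n_neurons = patterns.shape
    keep = rng.random((n_patterns, n_examples, n_neurons)) < (1 + r) / 2
    chi = np.where(keep, 1, -1)
    return patterns[:, None, :] * chi


def empirical_average_init(examples):
    """Candidate weights: the average of the examples of each pattern."""
    return examples.mean(axis=1)


rng = np.random.default_rng(0)
N, P, M, r = 200, 5, 50, 0.6  # hypothetical sizes and example quality

# Structureless (random binary) patterns, one per row.
patterns = rng.choice([-1, 1], size=(P, N))
examples = make_noisy_examples(patterns, M, r, rng)

# Weight initialisation from the sample, shape (P, N).
w = empirical_average_init(examples)

# Overlap between the sign of each weight vector and its pattern.
overlaps = (np.sign(w) * patterns).mean(axis=1)

# Orthogonalise the patterns. QR on the transposed matrix gives the same
# orthonormal vectors as Gram-Schmidt, up to a sign for each vector.
q, _ = np.linalg.qr(patterns.T.astype(float))
ortho = q.T  # rows are orthonormal, shape (P, N)
```

Under this noise model the expected value of each weight is r times the corresponding pattern entry, so the overlaps approach 1 as M grows. The rows of `ortho` are mutually orthogonal, which is the sense in which the orthogonalised patterns are devoid of correlations; they are real-valued rather than binary.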

Use this identifier to cite or link to this document: https://hdl.handle.net/11587/488904