This time-extension of the previously obtained static receptive fields increases the input selectivity of each hidden unit. Consequently, each hidden unit is activated in a highly sparse manner, by only specific spatio-temporal input scenarios. We have introduced a new training method for TRBMs called Temporal Autoencoding and validated it by showing a significant performance increase in modelling and generating sequences from a human motion capture dataset (Fig. 7). The gain in performance from the standard TRBM to the pre-trained aTRBM, which are structurally identical, suggests that our approach of
autoencoding the temporal dependencies gives the model a more meaningful temporal representation than is achievable through contrastive divergence training alone. We believe the inclusion of autoencoder training in temporal learning tasks will be beneficial in a number of problems, as it enforces the causal structure of the data on the learned model. We have shown that the aTRBM is able to learn high-level structure from natural
movies and account for the transformation of these features over time. The statistics of the static filters resemble those learned by other algorithms, namely Gabor-like patches showing preferential orientation of the filters along cardinal directions (Fig. 2). The distribution of preferred position, orientation and frequency (Fig. 3) is in accordance with results previously found by other methods (e.g. Cadieu and Olshausen, 2008 and Bell and Sejnowski, 1997), and the simple-cell-like receptive fields and cardinal selectivity are supported by neurophysiological findings in primary visual cortex (Wang et al., 2003 and Coppola et al., 1998). Importantly, the temporal connectivity expressed in the weights W_M learned by the model is also qualitatively
similar to the pattern of lateral connections in this brain area. Preferential connections between orientation-selective cells in V1 with similar orientation preferences have been reported in higher mammals (Bosking et al., 1997, Field and Hayes, 2004 and Van Hooser, 2007). These lateral connections are usually thought to underlie contour integration in the visual system. Here they arise directly from training the aTRBM model to reproduce the natural dynamics of smoothly changing image sequences. One could say that, in an unsupervised fashion, the model learns to integrate contours directly from the dataset. The aTRBM presented here can easily be embedded into a deep architecture, using the same training procedure in a greedy layer-wise fashion. This might allow us to study the dynamics of higher-order features (i.e. higher-order receptive fields) in the same fashion as was done here for simple visual features. In this way one could envisage applications of our approach to pattern recognition and temporal tasks, such as object tracking or image stabilization.
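The core idea of the temporal autoencoding step can be sketched as follows: the temporal weights connecting the M preceding hidden states to the current one are pre-trained to minimise the squared error of predicting the current hidden activation from the past, before contrastive divergence training. The snippet below is a minimal, hypothetical illustration only; the dimensions, learning rate, and the synthetic first-order hidden sequence standing in for RBM-inferred activations are all assumptions, not the actual training setup used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: n_h hidden units, M temporal delays, T time steps.
n_h, M, T = 20, 3, 400

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Synthetic stand-in for hidden activations inferred by a trained static RBM;
# a first-order dynamic gives the sequence learnable temporal structure.
A_true = rng.normal(size=(n_h, n_h)) / np.sqrt(n_h)
H = np.zeros((T, n_h))
H[0] = rng.random(n_h)
for t in range(1, T):
    H[t] = sigmoid(A_true @ H[t - 1] + 0.1 * rng.normal(size=n_h))

# Temporal weights W[m]: hidden layer at delay m+1 -> current hidden layer.
W = [np.zeros((n_h, n_h)) for _ in range(M)]

def mse(W):
    """Mean squared autoencoding error over the whole sequence."""
    errs = [np.mean((H[t] - sigmoid(sum(W[m] @ H[t - m - 1]
                                        for m in range(M)))) ** 2)
            for t in range(M, T)]
    return float(np.mean(errs))

mse_before = mse(W)
lr = 0.1
for epoch in range(20):
    for t in range(M, T):
        past = [H[t - m - 1] for m in range(M)]
        pred = sigmoid(sum(W[m] @ past[m] for m in range(M)))
        grad = (H[t] - pred) * pred * (1.0 - pred)   # backprop through sigmoid
        for m in range(M):
            W[m] += lr * np.outer(grad, past[m])     # SGD on squared error
mse_after = mse(W)
```

After pre-training, the learned temporal weights would initialise the TRBM's hidden-to-hidden connections, with contrastive divergence refining the full model.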
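The greedy layer-wise embedding mentioned above can be wired schematically as follows. This is a sketch only: `train_layer` is a hypothetical placeholder for the full aTRBM training of a single layer (static pre-training, temporal autoencoding, then contrastive divergence), and the data shapes and layer sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_layer(data, n_hidden):
    # Hypothetical placeholder: in the real model this would run the full
    # aTRBM procedure on `data`; here it just returns small random weights.
    return rng.normal(scale=0.01, size=(data.shape[1], n_hidden))

def up_pass(data, W):
    # Mean hidden activations of one layer become the next layer's "data".
    return sigmoid(data @ W)

X = rng.random((100, 64))        # e.g. 100 flattened 8x8 image patches
layer_sizes = [32, 16]           # assumed sizes of two stacked layers

reps, weights = X, []
for n_h in layer_sizes:
    W = train_layer(reps, n_h)   # greedy: train this layer in isolation
    weights.append(W)
    reps = up_pass(reps, W)      # freeze it and propagate representations up
```

Each layer is trained on the representations produced by the layer below and then frozen, so higher layers would capture the dynamics of progressively more abstract features.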