Machine Learning Trick of the Day (1): Replica Trick

'Tricks' of all sorts are used throughout machine learning, in both research and production settings. These tricks allow us to address many different types of data analysis problems, and are roughly of an analytical, statistical, algorithmic, or numerical flavour. Today's trick is in the analytical class and comes to us from statistical physics: the popular Replica trick. The replica ...
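Since the excerpt only names the trick, it may help to state the identity it rests on: the hard quenched average of log Z over disorder is recovered from the moments of Z, which are often tractable, by continuing an integer power n down to zero. A minimal statement, in my own notation rather than the post's:

```latex
% Replica identity: recover the average of log Z from moments of Z.
% Z^n = e^{n \log Z} \approx 1 + n \log Z as n \to 0, which gives:
\left\langle \log Z \right\rangle
  = \lim_{n \to 0} \frac{\left\langle Z^{n} \right\rangle - 1}{n}
  = \lim_{n \to 0} \frac{\partial}{\partial n} \log \left\langle Z^{n} \right\rangle
```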

A Statistical View of Deep Learning: Retrospective

Over the past 6 months, I've taken to writing a series of posts (one each month) on a statistical view of deep learning, with two principal motivations in mind. The first was as a personal exercise: to make concrete, and to test the limits of, the way that I think about and use deep learning in my every ...

A Statistical View of Deep Learning (VI): What is Deep?

Throughout this series, we have discussed deep networks by examining prototypical instances of these models, e.g., deep feed-forward networks, deep auto-encoders, and deep generative models, but we have not yet interrogated the key word we have been using. We have not posed the question: what does 'deep' mean, and what makes a model deep? There is little in the way of ...
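One reading consistent with the recursive-GLM framing used earlier in the series: depth counts the repeated composition of simple nonlinear maps. A sketch of that reading, with notation of my own choosing:

```latex
% A depth-L feed-forward network as an L-fold composition of layer maps,
% each an affine transform W_l(\cdot) + b_l followed by a nonlinearity g_l.
f(x) = g_L\big(W_L\, g_{L-1}\big(\cdots g_1(W_1 x + b_1)\cdots\big) + b_L\big)
```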

Chinese Edition: A Statistical View of Deep Learning (III)/ 从统计学角度来看深度学习

Colleagues from the Capital of Statistics, an online statistics community in China, have been kind enough to translate my third post in this series, A Statistical View of Deep Learning (III): Memory and Kernels, in the hope that it might be of interest to machine learning and statistics researchers in China (and to Chinese readers). Find it here: ...

Chinese Edition: A Statistical View of Deep Learning (II)/ 从统计学角度来看深度学习

Colleagues from the Capital of Statistics, an online statistics community in China, have been kind enough to translate my second post in this series, A Statistical View of Deep Learning (II): Auto-encoders and Free Energy, in the hope that it might be of interest to machine learning and statistics researchers in China (and to Chinese readers). Find ...

Chinese Edition: A Statistical View of Deep Learning (I)/ 从统计学角度来看深度学习

Colleagues from the Capital of Statistics, an online statistics community in China, have been kind enough to translate my first post in this series, A Statistical View of Deep Learning (I): Recursive GLMs, in the hope that it might be of interest to machine learning and statistics researchers in China (and to Chinese readers). Find it here: ...

A Statistical View of Deep Learning (V): Generalisation and Regularisation

We now routinely build complex, highly parameterised models in an effort to address the complexities of modern data sets. We design our models so that they have enough 'capacity', and this is now second nature to us, thanks to the layer-wise design principles of deep learning. But some problems continue to affect us: those that we encountered even in the low-data ...
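One concrete instance of the statistical view the post takes: a penalised loss can be read as maximum a posteriori estimation, with the regulariser acting as a negative log-prior. A minimal example, assuming an L2 (weight-decay) penalty; the notation is mine:

```latex
% Penalised maximum likelihood = MAP estimation:
% an L2 penalty on \theta corresponds to a Gaussian prior \mathcal{N}(0, \lambda^{-1} I).
% (The Gaussian normalising constant does not affect the argmax.)
\theta^{\star}
  = \arg\max_{\theta}\; \log p(\mathcal{D} \mid \theta) - \frac{\lambda}{2} \lVert \theta \rVert_2^2
  = \arg\max_{\theta}\; \log p(\mathcal{D} \mid \theta) + \log \mathcal{N}\!\big(\theta \mid 0, \lambda^{-1} I\big)
```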

A Statistical View of Deep Learning (IV): Recurrent Nets and Dynamical Systems

Recurrent neural networks (RNNs) are now established as one of the key tools in the machine learning toolbox for handling large-scale sequence data. The ability to specify highly powerful models, advances in stochastic gradient descent, the availability of large volumes of data, and large-scale computing infrastructure now allow us to apply RNNs in the most creative ...
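To make the dynamical-systems reading concrete, here is a minimal sketch of the recursion at the heart of a vanilla RNN: a hidden state evolving as a discrete-time nonlinear state-space model driven by inputs. The names and dimensions are illustrative, not taken from the post:

```python
import numpy as np

# A vanilla RNN as a discrete-time nonlinear dynamical system:
#   h_t = tanh(W h_{t-1} + U x_t + b),   y_t = V h_t
rng = np.random.default_rng(0)
x_dim, h_dim, y_dim = 3, 5, 2                    # illustrative sizes

W = rng.normal(scale=0.1, size=(h_dim, h_dim))   # state-transition weights
U = rng.normal(scale=0.1, size=(h_dim, x_dim))   # input weights
V = rng.normal(scale=0.1, size=(y_dim, h_dim))   # readout weights
b = np.zeros(h_dim)

def rnn_unroll(xs):
    """Run the state-space recursion over a sequence of input vectors."""
    h = np.zeros(h_dim)                  # initial state
    ys = []
    for x in xs:
        h = np.tanh(W @ h + U @ x + b)   # nonlinear state transition
        ys.append(V @ h)                 # linear observation / readout
    return np.stack(ys)

xs = rng.normal(size=(10, x_dim))        # a length-10 input sequence
print(rnn_unroll(xs).shape)              # -> (10, 2)
```

Training would fit W, U, and V by backpropagation through time; the sketch shows only the forward dynamics.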

A Statistical View of Deep Learning (III): Memory and Kernels

Memory, the ways in which we remember and recall past experiences and data to reason about future events, is a term used frequently in the current literature. Every model in machine learning contains a memory that is central to its use. There are two principal types of memory mechanism, most often addressed under the types of models ...
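The contrast the post draws can be put in one line: a parametric model compresses the training data into fixed weights, while a non-parametric kernel machine stores the data and recalls it at prediction time. Schematically, in notation of my own:

```latex
% Parametric memory: data compressed into weights \theta learned once.
f_{\text{param}}(x^{*}) = \phi(x^{*})^{\top} \theta
% Non-parametric (kernel) memory: prediction recalls the stored points x_n.
f_{\text{kernel}}(x^{*}) = \sum_{n=1}^{N} \alpha_{n}\, k(x^{*}, x_{n})
```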

A Statistical View of Deep Learning (II): Auto-encoders and Free Energy

With the success of discriminative modelling using deep feed-forward neural networks (or, through an alternative statistical lens, recursive generalised linear models) in numerous industrial applications, there is an increased drive to produce similar outcomes with unsupervised learning. In this post, I'd like to explore the connections between denoising auto-encoders, a leading approach for unsupervised learning in deep learning, and density estimation ...
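The bridge the post builds runs through the variational free energy: an encoder-decoder pair can be read as optimising a bound on the data log-likelihood, with a reconstruction term played by the decoder and a KL regulariser tying the encoder to the prior. A minimal statement, assuming a latent-variable model p(x|z)p(z) with approximate posterior q(z|x):

```latex
% Variational free energy / evidence lower bound:
% reconstruction (auto-encoding) term plus a KL regulariser on the encoder.
\log p(x) \;\geq\;
  \underbrace{\mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big]}_{\text{reconstruction}}
  \;-\;
  \underbrace{\mathrm{KL}\big[\, q(z \mid x) \,\big\|\, p(z) \,\big]}_{\text{regularisation}}
```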