Technology Fridays: Gluon Wants to be The Ruby on Rails of the Deep Learning World
Welcome to Technology Fridays! If you are a developer embarking on your first deep learning project, the number of technologies available to you can be overwhelming. A few years ago, only one or two machine learning frameworks were ready for prime time. Today, not a month goes by without news of another shiny deep learning library that promises to streamline the implementation of deep learning models: TensorFlow, Theano, Microsoft Cognitive Toolkit (CNTK), MXNet, PaddlePaddle, Keras, Bonsai, Caffe2, Torch, and the list doesn’t seem to end. Even more challenging is the fact that most of those frameworks are relatively low level and require a solid understanding of the computation graphs behind deep learning models. For deep learning to become more mainstream, we are going to need higher-level frameworks that are more appealing to the average developer. Last year, Microsoft and Amazon surprisingly partnered in an effort to create a higher-level programming model that works across different deep learning frameworks. They named the project Gluon.
The goal of project Gluon is to provide a set of high-level APIs that abstract the implementation of deep learning programs across different runtimes such as Apache MXNet (Amazon’s favorite), PyTorch, or CNTK (Microsoft’s deep learning framework). The Gluon interface allows developers to quickly prototype and train deep learning models without having to understand all the mechanics of the underlying computation graph. Even more impressive is the fact that Gluon accomplishes this without sacrificing the performance of the models: models developed using Gluon can achieve performance comparable to the same models implemented directly in MXNet or CNTK.
From an architecture standpoint, Gluon was designed following four key principles:
- Simple Programming Model: Gluon offers a full set of plug-and-play neural network building blocks, including predefined layers, optimizers, and initializers.
- Flexible, Imperative Structure: Gluon does not require the neural network model to be rigidly defined, but rather brings the training algorithm and model closer together to provide flexibility in the development process.
- Dynamic Graphs: Gluon enables developers to define neural network models that are dynamic, meaning they can be built on the fly, with any structure, and using any of Python’s native control flow.
- High Performance: Gluon provides all of the above benefits without impacting the training speed that the underlying engine provides.
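To make the imperative and dynamic-graph principles more concrete, here is a toy, pure-Python sketch of define-by-run differentiation. It is not Gluon’s implementation or API: the `Value` class and the `repeated_square` function are hypothetical names invented for this illustration. The point is only that the graph is recorded while ordinary Python code runs, so a `while` loop whose trip count depends on the data simply produces a deeper graph.

```python
class Value:
    """A scalar that remembers how it was computed (toy example, not Gluon)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent), one per parent

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def backward(self):
        """Propagate gradients from this node back to every ancestor."""
        # Topologically order the graph recorded during the forward pass.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


def repeated_square(x, limit):
    """Square x until it exceeds limit; the loop count depends on the data."""
    steps = 0
    while x.data <= limit:
        x = x * x
        steps += 1
    return x, steps


x = Value(2.0)
y, steps = repeated_square(x, limit=100.0)  # 2 -> 4 -> 16 -> 256, so y = x**8
y.backward()
print(steps, y.data, x.grad)  # prints: 3 256.0 1024.0
```

Starting from a different input changes the number of loop iterations, and therefore the shape of the recorded graph, without any change to the model code.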
We already talked about the simplicity of the programming model and the native performance, so let’s discuss the other two principles. One of the major contributions of Gluon is the idea of building dynamic neural networks on the fly that change their size and shape based on the conditions of the experiment. Additionally, because the Gluon interface brings together the training algorithm and the neural network model, developers can perform model training one step at a time, which results in much easier debugging and optimization.
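The debugging benefit of step-by-step training is easier to see with a small example. The sketch below is plain Python rather than Gluon code, and `train_step` is a hypothetical helper written for this post: it fits the one-parameter model y = w * x with hand-derived gradients. Because every step is an ordinary function call, the loss and the weight can be inspected, printed, or stopped on in a debugger after each update.

```python
def train_step(w, batch, learning_rate):
    """Run one gradient-descent step for y = w * x; return (new_w, loss)."""
    n = len(batch)
    # Mean squared error and its derivative with respect to w.
    loss = sum((w * x - y) ** 2 for x, y in batch) / n
    grad = sum(2 * (w * x - y) * x for x, y in batch) / n
    return w - learning_rate * grad, loss


data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # points generated by y = 3x
w = 0.0
history = []
for step in range(20):
    w, loss = train_step(w, data, learning_rate=0.05)
    history.append(loss)
    # Each step is plain Python, so a breakpoint or print can go right here.
    if step % 5 == 0:
        print(f"step {step:2d}  loss {loss:.6f}  w {w:.4f}")
```

In a declare-then-run framework the same loop would typically live inside the engine, which makes this kind of per-step inspection harder.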
From the programming standpoint, Gluon includes several relevant building blocks. The Gluon API offers a flexible interface that simplifies the process of prototyping, building, and training deep learning models without sacrificing training speed. The Neural Network Layers API provides a series of prebuilt neural network layers, such as convolutional, dropout, or pooling layers, which can be combined to rapidly architect deep learning models. The Recurrent Neural Network API is Gluon’s interface for building recurrent neural networks such as long short-term memory (LSTM) models. The Gluon Data API is responsible for abstracting the loading and pre-processing of datasets, while the Autograd API handles automatic differentiation, computing the gradients that optimization algorithms rely on.
The combination of these building blocks makes the implementation of deep learning models using Gluon substantially simpler compared to lower-level deep learning frameworks. For instance, let’s take the following example of a neural network implemented using Apache MXNet:
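The idea of combining prebuilt layers can be illustrated without any framework at all. The following is a minimal pure-Python sketch, assuming nothing about how Gluon is actually implemented: the `Dense`, `ReLU`, and `Sequential` classes here are hypothetical stand-ins written for this post, and they only show why stacking plug-and-play blocks takes so little code.

```python
import random


class Dense:
    """Fully connected layer: out[j] = sum_i(x[i] * w[i][j]) + b[j]."""

    def __init__(self, in_units, units, rng):
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(units)]
                        for _ in range(in_units)]
        self.bias = [0.0] * units

    def __call__(self, x):
        return [sum(xi * row[j] for xi, row in zip(x, self.weights)) + b
                for j, b in enumerate(self.bias)]


class ReLU:
    """Element-wise max(0, x)."""

    def __call__(self, x):
        return [max(0.0, v) for v in x]


class Sequential:
    """Container that feeds the output of each block into the next."""

    def __init__(self):
        self.blocks = []

    def add(self, *blocks):
        self.blocks.extend(blocks)

    def __call__(self, x):
        for block in self.blocks:
            x = block(x)
        return x


rng = random.Random(0)  # fixed seed so the run is repeatable
net = Sequential()
net.add(Dense(4, 8, rng), ReLU(), Dense(8, 3, rng))
output = net([1.0, 2.0, 3.0, 4.0])
print(len(output))  # prints: 3
```

Any object that can be called on a list of numbers fits into this container, which is the essence of the building-block design.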
import mxnet as mx
from mxnet import sym, mod

# Symbolic API: declare the whole graph first, then wrap it in a module.
data = sym.Variable('data')
fc1 = sym.FullyConnected(data, name='fc1', num_hidden=128)
relu1 = sym.Activation(fc1, name='relu1', act_type="relu")
fc2 = sym.FullyConnected(relu1, name='fc2', num_hidden=64)
relu2 = sym.Activation(fc2, name='relu2', act_type="relu")
out = sym.FullyConnected(relu2, name='out', num_hidden=10)
model = mod.Module(out)
The same model implemented using Gluon looks like the following:
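import mxnet as mx
from mxnet.gluon import nn
net = nn.Sequential()
net.add(nn.Dense(128, activation='relu'), nn.Dense(64, activation='relu'), nn.Dense(10))  # same layers as the MXNet version above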
Seems way simpler. Well, that’s the promise of Gluon, and the most impressive part is that the same model can be applied across different deep learning runtimes.
Gluon is not the only effort to enable higher-level programming constructs for the implementation of deep neural networks. The DeepMind team released the Sonnet library last year, which accomplishes something similar for TensorFlow graphs. Similarly, Bonsai’s ideas look promising across different deep learning frameworks.