Comparison of Symbolic Deep Learning Frameworks

Symbolic Frameworks

Frameworks for symbolic computation (MXNet, TensorFlow, Theano) feature symbolic graphs of vector operations, such as matrix addition/multiplication or convolution. A layer is just a set of those operations. Thanks to the granularity of the operations, users can build new, complex types of layers without resorting to low-level languages.
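For instance, here is a minimal Theano sketch of a fully connected layer assembled from granular graph operations (the shapes and the tanh nonlinearity are arbitrary choices for illustration):

    import numpy as np
    import theano
    import theano.tensor as T

    # A fully connected layer built from granular operations:
    # matrix multiplication, broadcasted addition, a nonlinearity.
    x = T.matrix('x')  # symbolic input batch
    W = theano.shared(np.random.randn(784, 256).astype(theano.config.floatX), name='W')
    b = theano.shared(np.zeros(256, dtype=theano.config.floatX), name='b')
    hidden = T.tanh(T.dot(x, W) + b)  # the "layer" is just this expression

    layer = theano.function([x], hidden)  # compile the graph
    out = layer(np.random.randn(2, 784).astype(theano.config.floatX))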

I have used various symbolic computation frameworks. As it turns out, there is no perfect symbolic framework that fits every requirement; each has its advantages and disadvantages. At the moment, I'm using Theano.

Let’s compare the three major symbolic frameworks in the table below.

[Table: Comparison of Symbolic Deep Learning Frameworks]


Symbolic vs. Non-symbolic Frameworks

Non-symbolic frameworks

Pros:

  • Non-symbolic (imperative) neural network frameworks, such as Torch and Caffe, have a very similar structure in terms of computation.
  • In terms of expressiveness, imperative frameworks are designed quite well; they can also offer a graph-like interface.

Cons:

  • Manual optimization is a major drawback of imperative frameworks; for example, in-place operations have to be implemented by hand.
  • Most imperative frameworks are inferior to symbolic ones in expressiveness.

Symbolic frameworks

Pros:

  • Symbolic frameworks support automatic optimization based on the dependency graph (illustrated in the sketch below).
  • Symbolic frameworks offer wider possibilities for memory reuse (MXNet does this particularly well).
  • Symbolic frameworks can automatically derive an efficient schedule for computation and data transfers from the graph.

Cons:

  • The existing open-source symbolic frameworks still show lower performance than their imperative counterparts.
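To make the contrast concrete, here is a short sketch of both styles, imperative NumPy next to symbolic Theano (the computation itself is a toy example):

    import numpy as np
    import theano
    import theano.tensor as T

    # Imperative style: every line runs immediately, and optimizations
    # such as in-place updates are the user's responsibility.
    a = np.ones(10)
    a += 1  # manual in-place operation

    # Symbolic style: declare the graph first, then compile it.
    # The framework sees the whole dependency graph and can rewrite,
    # fuse, or reuse memory before any computation runs.
    x = T.vector('x')
    y = ((x + 1) * 2).sum()
    f = theano.function([x], y)  # graph optimization happens here
    print(f(np.ones(10, dtype=theano.config.floatX)))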

Adding new operations
In all the examined frameworks, adding new operations with reasonable performance is pretty complicated.
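As an illustration of what is involved, below is a minimal sketch of a custom Theano Op (a hypothetical scaled tanh). Even this slow, pure-Python version needs boilerplate; a fast one would also require a grad() method and a C or GPU implementation:

    import numpy as np
    import theano
    import theano.tensor as T

    class ScaledTanh(theano.Op):
        """Hypothetical new operation: tanh(alpha * x)."""
        __props__ = ('alpha',)

        def __init__(self, alpha):
            self.alpha = alpha

        def make_node(self, x):
            x = T.as_tensor_variable(x)
            return theano.Apply(self, [x], [x.type()])

        def perform(self, node, inputs, output_storage):
            # Pure-Python fallback; a performant Op would add c_code().
            (x,) = inputs
            output_storage[0][0] = np.tanh(self.alpha * x).astype(x.dtype)

    x = T.vector('x')
    f = theano.function([x], ScaledTanh(0.5)(x))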

[Table: framework comparison]


Code re-usability

Since training deep networks is extremely time-consuming, Caffe introduced a collection of pre-trained models (the so-called "Model Zoo") to be used as initial weights for transfer learning or for fine-tuning deep networks.
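As an illustration, here is a hypothetical transfer-learning sketch using the VGG16 weights bundled with Keras (the 10-class head and the frozen-layer policy are arbitrary choices; signatures vary slightly across Keras versions):

    from keras.applications.vgg16 import VGG16
    from keras.layers import Dense, Flatten
    from keras.models import Model

    # Start from ImageNet weights instead of training from scratch.
    # Assumes channels-last image format.
    base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    for layer in base.layers:
        layer.trainable = False  # freeze the pre-trained filters

    x = Flatten()(base.output)
    x = Dense(256, activation='relu')(x)
    preds = Dense(10, activation='softmax')(x)  # hypothetical 10-class task

    model = Model(base.input, preds)
    model.compile(optimizer='adam', loss='categorical_crossentropy')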

[Table: framework comparison]

Low-level tensor operators
All the frameworks provide fairly efficient implementations of low-level tensor operators, which can serve as building blocks for new models and eliminate the need to write new operations.
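For example, a convolutional layer can be assembled in Theano straight from the stock operators (shapes left symbolic for brevity):

    import theano
    import theano.tensor as T
    from theano.tensor.nnet import conv2d

    # A convolution layer from stock low-level operators; no new
    # operation has to be written.
    x = T.tensor4('x')  # (batch, channels, rows, cols)
    w = T.tensor4('w')  # (filters, channels, rows, cols)
    y = T.nnet.relu(conv2d(x, w, border_mode='valid'))
    conv_layer = theano.function([x, w], y)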

[Table: framework comparison]


Control flow operators
Control flow operators enhance the expressiveness and versatility of a symbolic system.
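Theano's scan is the canonical example; here is a minimal sketch computing a running sum over a vector:

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.vector('x')
    # scan is a symbolic loop: 'acc' carries the previous output.
    results, updates = theano.scan(
        fn=lambda elem, acc: acc + elem,
        sequences=x,
        outputs_info=T.zeros_like(x[0]))

    cumsum = theano.function([x], results)
    print(cumsum(np.arange(1, 5, dtype=theano.config.floatX)))  # [1. 3. 6. 10.]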

[Table: framework comparison]

High-level support

[Table: framework comparison]


Performance 

Single-GPU

I measured the performance of the LeNet model on the MNIST dataset using a single GPU (NVIDIA Quadro K1200).
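The benchmark script itself is not reproduced here; for reference, a LeNet-style model in Keras, assuming the common 20/50-filter variant (the exact benchmarked configuration may differ):

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    # LeNet-style network for 28x28 grayscale MNIST digits.
    model = Sequential([
        Conv2D(20, (5, 5), activation='relu', input_shape=(28, 28, 1)),
        MaxPooling2D(pool_size=(2, 2)),
        Conv2D(50, (5, 5), activation='relu'),
        MaxPooling2D(pool_size=(2, 2)),
        Flatten(),
        Dense(500, activation='relu'),
        Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='sgd', loss='categorical_crossentropy',
                  metrics=['accuracy'])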

[Table: framework comparison]

Memory
GPU memory is limited, so memory consumption can become a major problem for large models.

[Table: framework comparison]

Single-GPU speed

Theano takes a long time to compile graphs, especially for complex models, and TensorFlow is even slower.

[Table: framework comparison]

Parallel and distributed support

[Table: framework comparison]

Final considerations

Theano (with the higher-level Lasagne and Keras libraries on top) is a great choice for deep learning models. Lasagne and Keras make it amazingly easy to build new networks and modify existing ones; since I prefer Python, I chose them for their well-developed Python interface. On the other hand, they don't support R. Thanks to their transfer learning and fine-tuning capabilities, Lasagne and Keras make it easy to modify existing networks and adapt them to domain-specific data.
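As a taste of how compact these definitions are, a minimal Lasagne sketch of a two-layer perceptron (layer sizes are arbitrary):

    import lasagne
    from lasagne.layers import InputLayer, DenseLayer
    from lasagne.nonlinearities import rectify, softmax

    # Stacking layers takes a few calls; modifying an existing network
    # is a matter of swapping one of them out.
    net = InputLayer(shape=(None, 784))
    net = DenseLayer(net, num_units=256, nonlinearity=rectify)
    net = DenseLayer(net, num_units=10, nonlinearity=softmax)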

Based on this comparison, the overall winner is MXNet: it delivers higher performance, uses memory more efficiently, and has excellent R support. In fact, MXNet is the only framework of the three whose full functionality is exposed in R. While MXNet does offer transfer learning and fine-tuning, these features are harder to use than in Lasagne/Keras, so modifying existing trained networks and adapting them to domain-specific data in MXNet is quite a challenge.
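For comparison, the same kind of multilayer perceptron declared through MXNet's symbolic API (the R package mirrors this interface almost one-to-one; layer sizes are again arbitrary):

    import mxnet as mx

    data = mx.sym.Variable('data')
    fc1 = mx.sym.FullyConnected(data=data, num_hidden=256, name='fc1')
    act1 = mx.sym.Activation(data=fc1, act_type='relu', name='relu1')
    fc2 = mx.sym.FullyConnected(data=act1, num_hidden=10, name='fc2')
    net = mx.sym.SoftmaxOutput(data=fc2, name='softmax')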
