Platforms Like Spark and Flink are Key to the Future of Deep Learning in the Enterprise
The recent Spark Summit brought up a lot of interesting announcements about the future of Spark. Among those, the release of Deep Learning Pipelines was particularly well received by the developer community. Created by Databricks, Deep Learning Pipelines enables the execution of deep learning models in Spark clusters.
Deep Learning Pipelines addresses one of the most important challenges in modern deep learning solutions: the absence of a sophisticated runtime. The recent generation of deep learning frameworks such as TensorFlow, Torch, Theano, Caffe2, MXNet and others has certainly simplified the implementation of deep learning applications, but these frameworks remain rather limited in terms of the runtime infrastructure required to execute deep learning models at scale. Tasks such as scheduling and coordinating the execution of deep learning programs across a large GPU farm are far from trivial. Those efforts require concurrency and coordination models that have little to do with deep learning algorithms. Arguably, the number one friction point for the mainstream adoption of deep learning technologies in the enterprise is the lack of robust runtimes.
Deep Learning Pipelines uses a clever technique to bring deep learning models to the Spark platform. The framework provides a library that converts deep learning models into SQL functions, which can then be integrated with Spark MLlib Pipelines in order to run on a Spark cluster. The process might not be applicable to every deep learning framework on the market, but it is certainly an interesting start.
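To make the "model as a SQL function" idea concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Spark SQL. It is not the Deep Learning Pipelines API: the `toy_model_score` function, the `predict` SQL name and the `samples` table are all hypothetical, and a real deployment would register a trained model rather than a fixed logistic function.

```python
import math
import sqlite3


def toy_model_score(feature: float) -> float:
    """Hypothetical stand-in for a trained model's prediction call."""
    # A fixed logistic function; a real model would run a forward pass here.
    return 1.0 / (1.0 + math.exp(-feature))


conn = sqlite3.connect(":memory:")

# Expose the Python callable to SQL under the (assumed) name "predict".
conn.create_function("predict", 1, toy_model_score)

conn.execute("CREATE TABLE samples (id INTEGER, feature REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, -2.0), (2, 0.0), (3, 2.0)],
)

# Once registered, the model is scored from plain SQL like any other function.
rows = conn.execute(
    "SELECT id, predict(feature) FROM samples ORDER BY id"
).fetchall()

for row_id, score in rows:
    print(row_id, round(score, 3))
```

The appeal of this pattern is that anyone who can write a query can score data with the model, and the engine, not the model author, decides how the work is distributed.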
Massively parallel processing (MPP) platforms such as Apache Spark or Flink could be the missing runtime for deep learning programs. Regardless of whether you like the approach taken by the Deep Learning Pipelines framework (I have my reservations), there is unquestionable value in creating versions of deep learning frameworks optimized for MPP runtimes such as Spark or Flink.
Some Benefits of Flink-Spark Deep Learning Runtimes
As mentioned earlier, the absence of sophisticated runtimes is the main challenge for organizations embracing deep learning frameworks. This problem is even more relevant in enterprise scenarios that rely on on-premises infrastructure to execute deep learning programs. Platforms such as Spark or Flink bring distinct benefits as deep learning runtimes:
1 — Parallel Runtime: Deep learning programs are natively parallel, but parallelization itself is really hard to achieve at the runtime level. Platforms such as Flink or Spark natively support scalable, parallel and concurrent execution of programs.
2 — Tooling: By leveraging platforms such as Spark or Flink, deep learning programs gain access to a sophisticated suite of management tools and automation frameworks.
3 — Integration with Other Technologies: Running on stacks such as Spark or Flink will allow deep learning programs to interoperate with other technologies that run natively on those platforms, such as Spark MLlib, Flink Streaming or SparkR.
4 — Hybrid Infrastructures: Platforms such as Spark or Flink are widely supported across cloud platforms and can also run on-premises, which gives deep learning frameworks a hybrid runtime that can be very viable in enterprise settings.
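The parallel runtime point can be illustrated with a rough sketch of data-parallel scoring: split the records into partitions, score each partition independently, and merge the results. This uses Python's standard thread pool as a stand-in for cluster executors and is not Spark or Flink code; `score_partition`, `split` and the toy linear model are hypothetical names introduced only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def score_partition(partition):
    """Hypothetical per-partition inference: apply a toy model to each record."""
    return [2.0 * x + 1.0 for x in partition]


def split(records, num_partitions):
    """Split records into roughly equal contiguous partitions."""
    # Ceiling division, with a floor of 1 so the range step is never zero.
    size = max(1, -(-len(records) // num_partitions))
    return [records[i:i + size] for i in range(0, len(records), size)]


records = [float(i) for i in range(10)]
partitions = split(records, 4)

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves partition order, so results line up with the inputs.
    scored = list(pool.map(score_partition, partitions))

# Flatten the per-partition results back into a single list of predictions.
predictions = [y for part in scored for y in part]
print(predictions)
```

On a real MPP platform the partitioning, scheduling, retries and result collection are handled by the runtime, which is exactly the plumbing that deep learning frameworks would otherwise have to build themselves.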