
The Sequence Scope: The Standard for Scalable Deep Learning Models

Weekly newsletter with over 100,000 subscribers that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations.

3 min read · Nov 28, 2021

📝 Editorial: The Standard for Scalable Deep Learning Models

Large deep learning models seem to be the norm these days. While deep neural networks with trillions of parameters are very attractive, they are nothing short of a nightmare to train. In most training techniques, the computational cost scales linearly with the number of parameters, resulting in impractical costs for most scenarios. In recent years, mixture of experts (MoE) has emerged as a powerful alternative. Conceptually, MoE operates by partitioning a task into subtasks, routing each input to a small subset of specialized expert networks, and aggregating their outputs. Because only a few experts are active for any given input, compute grows far more slowly than the total parameter count. When applied to deep learning models, MoE has proven to scale sublinearly with respect to the number of parameters, making it the only viable option for scaling deep learning models to…
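To make the idea concrete, here is a minimal sketch of a top-k routed mixture-of-experts layer in PyTorch: a small gating network picks which experts process each token, so per-token compute depends on the number of active experts rather than on the total parameter count. The class name, dimensions, and routing details below are illustrative assumptions, not taken from any specific MoE paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal mixture-of-experts layer: a gating network routes each token
    to its top-k experts, so only k of the n_experts MLPs run per token."""

    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)  # router over experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)          # routing probabilities
        weights, idx = scores.topk(self.k, dim=-1)        # top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                  # tokens routed to expert e
                if mask.any():
                    w = weights[:, slot][mask].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])      # weighted expert output
        return out

# Toy usage: 16 tokens with 64-dim embeddings, 8 experts, 2 active per token
layer = MoELayer(d_model=64, d_hidden=256, n_experts=8, k=2)
print(layer(torch.randn(16, 64)).shape)  # torch.Size([16, 64])
```

In this sketch, adding more experts grows the parameter count but leaves per-token compute fixed at k expert evaluations, which is the sublinear scaling property the editorial refers to.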


Written by Jesus Rodriguez

Co-Founder and CTO of Sentora (fka IntoTheBlock), President of LayerLens, Faktory and NeuralFabric. Founder of The Sequence, Lecturer at Columbia, Wharton
