
The Sequence Scope: The Standard for Scalable Deep Learning Models

Weekly newsletter with over 100,000 subscribers that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations.

Jesus Rodriguez
3 min readNov 28, 2021

📝 Editorial: The Standard for Scalable Deep Learning Models

Large deep learning models seem to be the norm these days. While deep neural networks with trillions of parameters are very attractive, they are nothing short of a nightmare to train. In most training techniques, the computational cost scales linearly with the number of parameters, resulting in impractical costs for most scenarios. In recent years, mixture of experts (MoE) has emerged as a powerful alternative. Conceptually, MoE operates by partitioning a task into subtasks, routing each input to specialized expert subnetworks, and aggregating their outputs. When applied to deep learning models, MoE has proven to scale sublinearly with respect to the number of parameters, making it the only viable option for scaling deep learning models to…
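To make the routing idea concrete, here is a minimal, illustrative sketch of a gated MoE layer in PyTorch. It is not any particular production system; the class name `MoELayer` and the hyperparameters are assumptions for the example. The key point is that each input activates only its top-k experts, so the compute per input stays roughly constant even as the total number of experts (and therefore parameters) grows.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy mixture-of-experts layer (illustrative sketch, not a production design).

    A gating network scores the experts for each input, only the top-k experts
    run, and their outputs are combined using the gate weights. Because most
    experts stay idle per input, compute grows sublinearly with parameter count.
    """

    def __init__(self, d_model: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(d_model, num_experts)  # router / gating network
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_model)
        gate_logits = self.gate(x)                              # (batch, num_experts)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                    # renormalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                    # inputs routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: route a small batch of token vectors through the layer.
layer = MoELayer(d_model=16)
tokens = torch.randn(8, 16)
print(layer(tokens).shape)  # torch.Size([8, 16])
```

Adding more experts increases the parameter count, but each token still touches only `top_k` of them, which is the sublinear-scaling property the editorial refers to.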

Written by Jesus Rodriguez

CEO of IntoTheBlock, President of Faktory, President of NeuralFabric, founder of The Sequence, lecturer at Columbia University and Wharton, angel investor...
