The Sequence Scope: The Race for Big Language Models Continues
Weekly newsletter with over 100,000 subscribers that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations.
📝 Editorial: The Race for Big Language Models Continues
Massively large pretrained models have become the norm in natural language processing (NLP). It seems that every other month we reach a new milestone in language model size, and yet we can’t stop writing about it because it’s so fascinating. When GPT-3 reached 175 billion parameters, it seemed we were close to the ceiling for language model size. Since then, models such as Switch Transformer and the recently announced Wu Dao 2.0 have comfortably surpassed 1 trillion parameters. Just this week, Microsoft Research and NVIDIA announced a new generative language model with a remarkable 530 billion parameters.