The Sequence Scope: Size Matters
Weekly newsletter that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations.
The Sequence Scope is a summary of the most important published research papers, released technology and startup news in the AI ecosystem in the last week. This compendium is part of TheSequence newsletter. Data scientists, scholars, and developers from Microsoft Research, Intel Corporation, Linux Foundation AI, Google, Lockheed Martin, Cardiff University, Mellon College of Science, Warsaw University of Technology, Universitat Politècnica de València and other companies and universities are already subscribed to TheSequence.
(Core ML concepts + groundbreaking research papers and frameworks + AI news and trends) x 5 minutes, 3 times a week =…
📝 Editorial: Size Matters
The recent emergence of pre-trained language models and transformer architectures has pushed the creation of larger and larger machine learning models. Google’s BERT presented attention mechanisms and the transformer architecture as the “next big thing” in ML, and the numbers since then seem surreal. OpenAI’s GPT-2 set a record with 1.5 billion parameters, followed by Microsoft’s Turing-NLG with 17 billion parameters, only to be eclipsed by the new GPT-3 with an astonishing 175 billion parameters. And lest anyone get complacent, just this week Microsoft announced a new release of its DeepSpeed framework (which powers Turing-NLG) that can train models with up to a trillion parameters. That sounds insane, but it really isn’t.
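To put those parameter counts in perspective, here is a back-of-the-envelope sketch (ours, not from any of the papers mentioned) of how much memory is needed just to store a model's weights, assuming half-precision (fp16) storage at 2 bytes per parameter:

```python
def model_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory (GB) to hold the weights alone, assuming fp16 storage."""
    return num_params * bytes_per_param / 1e9

# Rough sizes for the models discussed in the editorial
for name, params in [("GPT-2", 1.5e9), ("Turing-NLG", 17e9),
                     ("GPT-3", 175e9), ("1T-param model", 1e12)]:
    print(f"{name}: ~{model_memory_gb(params):,.0f} GB")
```

Even this optimistic estimate (it ignores optimizer state and activations, which multiply the footprint several times during training) shows why a single machine cannot hold such models, and why parallelized training frameworks like DeepSpeed are essential at this scale.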
What we are seeing is the consequence of several factors. First, computation power and parallelization techniques have evolved to the point where it is relatively easy to train machine learning models across large clusters of machines. Second, and most importantly, in the current state of machine learning, larger models have regularly outperformed smaller, more specialized ones. Knowledge-reusability methods like transfer learning are still in very nascent stages, so it remains really hard to build small models that can operate in uncertain environments. Furthermore, as models like GPT-3 and Turing-NLG have shown, there is some unexplainable magic that happens once models pass a certain size.
Many of the immediate machine learning problems might be solved by scaling the current generation of neural network architectures. Plain and simple, when it comes to machine learning, size matters.
We would love to hear your opinions on the debate between larger, broader models and smaller, more specialized ones.
Leave a comment
🔺🔻 TheSequence Scope — our Sunday edition with the industry’s development overview — is free. To receive high-quality educational content every Tuesday and Thursday, please subscribe to TheSequence Edge. 🔺🔻
🗓 Next week in TheSequence Edge:
Edge#21: the concept of Machine Text Generation; the research behind Microsoft’s Turing-NLG, one of the largest language pre-trained models in history; AllenNLP, an open-source framework for advanced natural language research.
Edge#22: the concept of Question-Answering Models; the paper in which the Google Research team presents a new dataset for training and evaluating question-answering systems; DeepPavlov open-source framework for advanced NLU methods including question-answering.
Now, to the most important developments in the AI industry this week
🔎 ML Research
GPT-3 Falls Short in Machine Comprehension
Proposed by researchers from several major American universities, a 57-task benchmark measuring models’ ability to reason poses challenges even for sophisticated models like GPT-3 -> read more in the original paper
Better Text Summarization
OpenAI published a paper showing a reinforcement learning with human feedback technique that can surpass supervised models -> read more on OpenAI blog
Reinforcement Learning with Offline Datasets
Researchers from the Berkeley AI Research (BAIR) Lab published a paper unveiling a method that uses offline datasets to improve reinforcement learning models -> read more on BAIR blog
🤖 Cool AI Tech Releases
New Version of DeepSpeed
Microsoft released a new version of DeepSpeed, its open-source library for parallelized training that can scale up to models with 1 trillion parameters -> read more on Microsoft Research blog
💬 Useful Tweet
Humans can’t beat chess computers. Ex world champ Vladimir Kramnik joined one — DeepMind’s AlphaZero — to test tweaks to the rules of the game that might jolt humans into fresher, more beautiful play.
“AI Ruined Chess. Now, It’s Making the Game Beautiful Again”: a former world champion teams up with the makers of AlphaZero to test variants on the age-old game that can jolt players into creative patterns. (wired.com)
September 10th 2020
💸 Money in AI
- AI-powered customer experience management platform Sprinklr has raised $200 million (kudos to our subscribers from Sprinklr 👏). Sprinklr’s “AI listening processing” solution allows companies to get structured and meaningful sentiments and insights from unstructured customer data that comes from public conversations on different websites and social platforms.
- Xometry, an on-demand industrial parts marketplace, raised $75 million in Series E funding. The company provides a digital way of matching buyers with the right manufacturers.
- Another example of AI applied to matching two sides of a deal: real estate tech company Orchard raised $69 million in its recent funding round. Orchard aims to digitize the whole real estate market by developing a solution that combines machine learning and rapid human assistance to smooth the search, match the right deal, and simplify buying and selling relationships.
- Cybersecurity startup Pcysys raised $25 million in its funding round. Pcysys’ platform, which doesn’t require installation or network reconfiguration, uses algorithms to scan and “ethically” attack enterprise networks.
- Robotics farming company Iron Ox raised $20 million in a funding round. Its system of farming robots is still semi-autonomous; the company’s goal is to become fully autonomous.
- Insurtech company Descartes Underwriting raised $18.5 million. The company applies AI and machine learning technologies to climate risk predicting and insurance underwriting.
- Legaltech startup ThoughtRiver raised $10 million in its Series A round. Its AI solution applied to contract pre-screening aims to boost operational efficiency.
- Medtech startup Skin Analytics raised $5.1 million in Series A funding. Skin Analytics has developed a clinically validated AI system that can identify not only the important skin cancers but also precancerous lesions that can be treated, as well as a range of lesions that are benign.
- Amazon, along with several government organizations and three other industry partners, helped fund a high-priority AI research initiative led by the National Science Foundation. The amount of funding was not disclosed.