The Sequence Scope: The Challenge of Data-Efficient Machine Learning
This is a summary of the most important research papers, technology releases, and startup news in the AI ecosystem from the last week. This compendium is part of TheSequence newsletter. Give it a try by subscribing below:
(Core ML concepts + groundbreaking research papers and frameworks + AI news and trends) x 5 minutes, 3 times a week =…
Supervised learning is the dominant school in machine learning solutions. The idea of training a model on a labeled dataset in order to master a task seems intuitive. However, in practice, many supervised learning techniques run into the challenge of requiring large labeled datasets in order to generalize even very simple knowledge.
This challenge is overwhelming for both startups and big companies, and it is one of the main roadblocks for the mainstream adoption of ML. We all love to hear about breakthroughs like AlphaGo or GPT-3, until we realize the enormous size of the training datasets used to create those models.
The idea of building machine learning methods that can operate with smaller labeled datasets is an active area of research, and there is no shortage of ideas. Techniques like semi-supervised learning attempt to use unlabeled data in the training process. Generative models look to create new labeled datasets from existing ones. Self-supervised learning aims to build models that learn from the structure of the data itself, without external labels. Transfer learning tries to reuse knowledge across tasks, while meta-learning has the ambitious goal of building models that learn to learn. Just this week, DeepMind proposed a new meta-learning technique to build more efficient reinforcement learning models. Some of the best minds in machine learning are paving the way to build more data-efficient models.
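To make one of these ideas concrete, here is a minimal sketch of semi-supervised self-training: a weak model is fit on a tiny labeled set, then high-confidence predictions on unlabeled points are adopted as pseudo-labels and the model is refit. The nearest-centroid classifier, the 1-D data, and the margin threshold are all illustrative assumptions, not part of any paper mentioned above.

```python
# Self-training (semi-supervised) sketch using a nearest-centroid
# classifier on 1-D points. All data and thresholds are illustrative.

def centroids(points, labels):
    """Mean of the points belonging to each label."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(cents, x):
    """Label of the closest centroid, plus the distance margin as confidence."""
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best = ranked[0]
    margin = abs(x - cents[ranked[1]]) - abs(x - cents[best])
    return best, margin

def self_train(labeled, unlabeled, margin_threshold=1.0, rounds=3):
    points = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(rounds):
        cents = centroids(points, labels)
        confident, rest = [], []
        for x in pool:
            y, margin = predict(cents, x)
            (confident if margin >= margin_threshold else rest).append((x, y))
        if not confident:          # nothing left we are sure about
            break
        # Adopt confident pseudo-labels as if they were ground truth.
        points += [x for x, _ in confident]
        labels += [y for _, y in confident]
        pool = [x for x, _ in rest]
    return centroids(points, labels)

labeled = [(0.0, "a"), (10.0, "b")]    # tiny labeled set
unlabeled = [1.0, 2.0, 8.5, 9.0, 5.0]  # cheap unlabeled data
cents = self_train(labeled, unlabeled)
```

The point of the sketch is the loop, not the classifier: the labeled set grows from the unlabeled pool, which is exactly the trade that data-efficient methods try to make.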
Next week in TheSequence Edge
July 28, Edge#7: the concept of generative models; Optimus, one of the most innovative research efforts in generative models, recently published by Microsoft Research; ART, an open-source framework that uses generative models for protecting neural networks.
July 30, Edge#8: the concept of generative adversarial networks; the original GAN paper by Ian Goodfellow; deep dive into TF-GANs.
To stay up-to-date and receive TheSequence Edge every Tuesday and Thursday, please consider joining our community. Until August 15, you can subscribe with a permanent 20% discount. The Sunday edition of TheSequence Scope is always free.
Now, let’s review the most important developments in AI research and technology this week.
Using Meta-Learning to Generate Reinforcement Learning Models
DeepMind researchers published a paper proposing a meta-learning method that can automatically generate reinforcement learning models ->read the original paper
Self-Supervised Learning for Image Classification
Facebook AI Research (FAIR) is at the forefront of self-supervised learning. Recently, they published a paper proposing a self-supervised learning method for training image classification models ->read more in Facebook AI blog
Data-Efficient Reinforcement Learning
The prestigious Berkeley AI Lab published a couple of papers about techniques for making reinforcement learning operate with smaller training datasets ->read more in Berkeley AI Lab blog
Cool AI Tech Releases
LinkedIn’s LIquid Graph Database
LinkedIn’s engineering team detailed their work on LIquid, a new type of graph database ->read more in LinkedIn’s engineering blog
TensorFlow Lite XNNPACK Integration
The TensorFlow team unveiled support for the XNNPACK hardware acceleration library. The integration will allow 2–3x faster inference routines in TensorFlow Lite models ->read more in TensorFlow blog
GPT-3 (generative pre-trained transformer) is the newest in the family of NLU (natural language understanding) models, although it has been around for a few months. It uses the transformer architecture that we covered in Edge#3. The hype around GPT-3 is huge, but language generators are still at a very nascent stage. There are great opportunities to join OpenAI and work on the development of this fascinating technology.
Money in AI
- AI-based crowdsourcing startup StuffThatWorks raised a $9 million seed round. Its idea is to let people find the most effective treatments by collaborating with each other. The human input is enhanced by ML algorithms that are programmed to look for valuable insights.
- Autonomous technology company and intelligence systems provider Sea Machines Robotics raised $15 million to accelerate deployment of its AI-powered situational awareness mechanism in the unmanned naval boat and ship market. They are hiring.
- Educational platform Riiid has just raised a sizable $41.8 million for its AI-powered test-prep solutions. After successful concept validation in Korea, Japan, and Vietnam, Riiid is going to expand across the U.S., South America, and the Middle East.
- Big data analytics startup Quantexa, which built a machine learning platform called “Contextual Decision Intelligence” (CDI), has raised $64.7 million. The platform gathers scattered data points and analyzes them to uncover risky activities, enhance customer intelligence, and address credit-risk and fraud challenges.