The Sequence Scope: Go Big First, Then Compress
Weekly newsletter with over 80,000 subscribers that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations.
The Sequence Scope is a summary of the most important research papers, technology releases, and startup news in the AI ecosystem from the last week. This compendium is part of TheSequence newsletter. Data scientists, scholars, and developers from Microsoft Research, Intel Corporation, Linux Foundation AI, Google, Lockheed Martin, Cardiff University, Mellon College of Science, Warsaw University of Technology, Universitat Politècnica de València, and other companies and universities are already subscribed to TheSequence.
📝 Editorial: Go Big First, Then Compress
Conventional wisdom in machine learning (ML) tells us that bigger models are better. In the current state of the ML ecosystem, dominated by supervised learning, the mantra is to go big: bigger deep learning models tend to outperform smaller versions in most scenarios. However, bigger models are also slow, expensive to run, and difficult to operate. Model compression is one of the techniques that helps address those limitations. As its name indicates, model compression tries to reduce the size of a given model without drastically sacrificing its performance.
Despite its importance, model compression is one of those aspects of ML solutions that is often overlooked until it becomes a problem. Most ML production infrastructures don’t even include model compression components. However, research in this area of ML is advancing really fast. In 2019, the ML world was shocked by the publication of The Lottery Ticket Hypothesis, a famous paper arguing that, for every large neural network, we can find a subnetwork that, when trained in isolation, matches the performance of its parent. Crazy, huh? The lottery ticket hypothesis basically tells us that parts of any big neural network are redundant. Its publication sparked a lot of interest in model compression and pruning. Just this week, Microsoft Research published a paper proposing a compression method that provides a very interesting alternative to the lottery ticket hypothesis. While this is happening, mainstream deep learning frameworks are rapidly incorporating model compression techniques. So when it comes to large-scale ML problems, definitely go big first, but then compress 😉
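To make the pruning idea concrete, here is a minimal sketch using PyTorch’s built-in pruning utilities. The layer sizes and the 90% sparsity level are arbitrary illustrative choices, not the setup from either paper:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy model; the sizes are arbitrary illustrative choices.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Zero out the 90% of weights with the smallest magnitude in each linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)
        prune.remove(module, "weight")  # bake the mask into the weight tensor

# A lottery-ticket experiment would now rewind the surviving weights to their
# original initialization and retrain; here we only report the sparsity.
total = sum(p.numel() for p in model.parameters())
zeros = sum((p == 0).sum().item() for p in model.parameters())
print(f"sparsity: {zeros / total:.1%}")
```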
🗓 Next week in TheSequence Edge:
Edge#75: the concept of N-Shot Learning; how OpenAI uses N-Shot Learning to teach AI agents to play; the Learn2learn open-source meta-learning framework.
Edge#76: Google’s Model Search, a new open-source framework for finding optimal machine learning models.
🔎 ML Research
More Efficient Transformers
Google Research published a paper proposing a transformer method that scales linearly with the size of the input while maintaining manageable memory and computation costs.
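For context on why linear scaling matters: standard self-attention materializes an n × n attention matrix, so cost grows quadratically with sequence length. One common family of fixes replaces the softmax with a kernel feature map φ so the matrix products can be reassociated. The sketch below illustrates that generic idea; it is not a claim about the specific mechanism in Google’s paper:

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # phi is a positive feature map; ReLU (used here) or random features
    # are common choices in the linear-attention literature.
    Qp, Kp = phi(Q), phi(K)                    # (n, d)
    KV = Kp.T @ V                              # (d, d_v): O(n * d * d_v)
    Z = Qp @ Kp.sum(axis=0, keepdims=True).T   # (n, 1) normalizer
    return (Qp @ KV) / Z                       # (n, d_v), linear in n

n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (1024, 64), without ever forming the 1024 x 1024 matrix
```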
Transformers as Universal Computation Engines
The Berkeley AI Research (BAIR) lab published a paper outlining a method to transfer computation modules in transformer models across different domains.
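A hedged sketch of the high-level recipe: freeze the pretrained attention and feed-forward blocks and train only small input/output layers (and the layer norms, a detail the paper reports) on the new domain. The module names follow Hugging Face’s GPT-2 implementation, and the adapter dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn
from transformers import GPT2Model

backbone = GPT2Model.from_pretrained("gpt2")
for name, param in backbone.named_parameters():
    # Freeze everything except the layer norms ("ln_1", "ln_2", "ln_f").
    param.requires_grad = "ln" in name

# Small trainable layers map the new modality into and out of the backbone.
embed_in = nn.Linear(16, backbone.config.n_embd)   # 16-dim inputs: illustrative
head_out = nn.Linear(backbone.config.n_embd, 2)    # e.g., binary classification

x = torch.randn(1, 32, 16)                         # (batch, sequence, features)
hidden = backbone(inputs_embeds=embed_in(x)).last_hidden_state
logits = head_out(hidden[:, -1])                   # predict from the last token
```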
Compressing Neural Networks Using Factorized Layers
Microsoft Research published a phenomenal paper proposing a new neural network compression technique that expands on the famous Lottery Ticket Hypothesis.
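As a rough illustration of what “factorized layers” means in general (a textbook low-rank sketch, not Microsoft’s specific training procedure): a dense m × n weight matrix can be replaced by the product of two thin rank-r factors, cutting parameters from m·n to (m + n)·r:

```python
import numpy as np

def factorize_layer(W, rank):
    """Replace an (m, n) weight matrix with two factors totaling (m + n) * rank."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # (m, rank), singular values folded in
    B = Vt[:rank, :]             # (rank, n)
    return A, B

m, n, rank = 512, 512, 32
rng = np.random.default_rng(0)
W = rng.standard_normal((m, n))
A, B = factorize_layer(W, rank)

err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"params: {W.size} -> {A.size + B.size}, relative error: {err:.2f}")
```

On a random matrix the reconstruction error is large; the technique pays off in practice because trained weight matrices tend to have rapidly decaying spectra.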
🤖 Cool AI Tech Releases
GPT-3 Applications
OpenAI published a blog post highlighting some of the most impressive applications powered by its GPT-3 API.
Uber’s Data Architecture
The Uber engineering team published an insightful blog post detailing some of the best practices in their enterprise data architecture.
💸 Money in AI
- Construction monitoring startup Avvir raised $10 million in a funding round. The company uses laser scans and AI to catch construction mistakes, automatically update clients’ building information models, and monitor construction progress.
- Supply chain visibility platform FourKites raised $100 million in Series D financing. FourKites uses data science to improve supply chain performance by predicting estimated times of arrival.
- Risk management platform Feedzai raised $200 million in a Series D round. Their goal is to fight financial crime by leveraging big data and advanced ML.
- Translation service Language I/O raised $5 million in Series A funding. Their AI technology generates accurate, company-specific translations of all user-generated content (UGC), including jargon, slang, abbreviations, and misspellings, into over 100 languages across chat, email, article, and social support channels.
- Dataminr raised $475 million in new funding. Dataminr’s real-time AI platform detects the earliest signals of high-impact events and emerging risks from within a mix of 100,000 public data sources.
- Edge AI startup LGN raised $2 million in funding. Their solutions allow edge AI systems to operate resiliently, in the real world, at scale. Delivering edge AI at scale is the first step in LGN’s mission to create networked AI: AI-to-AI communication with no human in between, to speed up decision-making and action processes.
- AI-powered marketing optimization platform Sellforte raised $4.78 million in funding. Their data science models calculate comparable ROI for every marketing investment across different media channels and campaigns, generating continuous recommendations for growth.
- Identity verification startup Jumio raised $150 million in a funding round. Leveraging AI, biometrics, machine learning, liveness detection, and automation, Jumio helps organizations fight fraud and financial crime, onboard good customers faster, and meet regulatory compliance requirements, including KYC, AML, and GDPR.
- Workflow and decision automation startup Camunda raised $100 million in a Series B round. The company offers process automation with a developer-friendly approach that is standards-based, highly scalable and collaborative for business and IT.