The Sequence Scope: Building Machine Learning with Machine Learning: Myth or Reality?
Weekly newsletter that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations.
The Sequence Scope is a summary of the most important published research papers, released technology and startup news in the AI ecosystem in the last week. This compendium is part of TheSequence newsletter. Data scientists, scholars, and developers from Microsoft Research, Intel Corporation, Linux Foundation AI, Google, Lockheed Martin, Cardiff University, Mellon College of Science, Warsaw University of Technology, Universitat Politècnica de València and other companies and universities are already subscribed to TheSequence.
(Core ML concepts + groundbreaking research papers and frameworks + AI news and trends) x 5 minutes, 3 times a week =…
📝 Editorial: Building Machine Learning with Machine Learning: Myth or Reality?
Neural architecture search (NAS) and automated machine learning (AutoML) are some of the hottest areas of research in the artificial intelligence (AI) space. NAS and AutoML center on the promise of using machine learning methods to automate the creation of machine learning models for a given problem. Platforms such as Google Cloud and Microsoft Azure have enabled NAS and AutoML capabilities as part of their machine learning services. The idea seems too good to be true. Can we have machine learning do the work for us? The answer is YES, but we also need a reality check.
Disciplines such as NAS and AutoML are definitely going to play an important role in the next decade of machine learning solutions, but today there is still a lot of work to be done. While you can find NAS and AutoML capabilities in mainstream machine learning frameworks and platforms, their usage remains limited to basic use cases. Most NAS or AutoML methods struggle when applied to more complex data science scenarios that require sophisticated neural network architectures. Even though research shows a lot of promise, you need to test one of those techniques in real-world scenarios to understand its limitations. However, the machine learning community continues to make steady progress. Just this week, Microsoft Research open-sourced a project that unifies some of the top NAS techniques under a common programming model. In short, NAS and AutoML methods should definitely be something to consider in your machine learning projects but, for now, you shouldn’t expect too much from them.
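To make the idea concrete, here is a minimal sketch of the simplest AutoML baseline: random search over a small search space, scored on held-out data. Everything in it is an assumption for illustration (the toy dataset, the k-nearest-neighbors model, the `search_space` values and the function names); it is not how Archai, Google Cloud or Azure implement NAS or AutoML. Real systems search over far larger spaces, including the network architecture itself.

```python
import random


def make_data(n, rng):
    """Toy 1-D binary classification: label is 1 when x > 0.5, with 10% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:  # flip the label occasionally
            y = 1 - y
        data.append((x, y))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in neighbors)
    return int(votes * 2 > k)


def accuracy(train, valid, k):
    """Fraction of validation points that a k-NN model classifies correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in valid)
    return correct / len(valid)


def random_search(train, valid, search_space, trials, rng):
    """Sample configurations at random and keep the best validation score."""
    best_config, best_score = None, -1.0
    for _ in range(trials):
        config = {name: rng.choice(values) for name, values in search_space.items()}
        score = accuracy(train, valid, config["k"])
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


rng = random.Random(0)  # fixed seed so runs are repeatable
train = make_data(200, rng)
valid = make_data(100, rng)

# Hypothetical search space: only one knob (k, odd values so votes never tie).
search_space = {"k": [1, 3, 5, 9, 15, 25]}

best_config, best_score = random_search(train, valid, search_space, trials=10, rng=rng)
print("best config:", best_config, "validation accuracy:", best_score)
```

Even this toy loop shows where the cost comes from: every candidate has to be trained and evaluated, which is cheap for one knob on 300 points and expensive for sophisticated neural network architectures.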
What do you think? How much of the creation of machine learning models can machine learning itself automate?
🔺🔻TheSequence Scope — our Sunday edition with the industry’s development overview — is free. To receive high-quality educational content every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻
🗓 Next week in TheSequence Edge:
Edge#27: the concept of contrastive learning; Google’s research on view selection for contrastive learning; a review of Uber’s impressive open-source machine learning contributions.
Edge#28: debugging machine learning models; Uber’s research in the Manifold architecture for machine learning debugging; Microsoft’s TensorWatch framework.
Now, let’s review the most important developments in the AI industry this week.
🔎 ML Research
Detecting Signs in Video Calls
Google Research published a paper proposing a real-time sign language detection model that can be applied to video conference calls ->read more on Google AI blog
Racism and Sexism in Language Pre-trained Models
The Allen Institute for AI (AI2) published a paper and open-sourced a new dataset that shows how pre-trained language models, such as GPT-3 or BERT, can be prompted to generate racist, sexist and other toxic text ->read more on AI2 blog
Adversarial Robustness Toolbox
IBM published an insightful blog post about lessons learned last year from the Adversarial Robustness Toolbox stack ->read more on IBM Research blog
🤖 Cool AI Tech Releases
Microsoft Research open-sourced Archai, a new framework that combines state-of-the-art neural architecture search algorithms in a common code base ->read more on Microsoft Research blog
Facebook AI Research (FAIR) and AI startup Hugging Face open-sourced Retrieval Augmented Generation (RAG), a natural language processing technique that retrieves contextual information relevant to a target task and uses it when generating text ->read more on FAIR blog
LinkedIn open-sourced Generalized Deep Mixed Model (GDMix), a framework for training large ranking models used in personalized recommendation systems ->read this blog from the LinkedIn Engineering team
Solving the ‘space junk’ problem
IBM open-sourced two new projects, Space Situational Awareness and Kubesat. Enthusiasts can learn more about space tech and help IBM improve communication between satellites, as well as predict the path of space junk ->read more on IBM blog
💸 Money in AI
- Baidu’s division focused on voice assistants and smart devices is valued at $2.9 billion after recently raising a round of undisclosed size. The company’s conversational artificial intelligence system DuerOS is built into smart devices from Xiaodu (also a Baidu company). This voice assistant enables users to communicate with hardware by speaking to it.
- AI-enhanced drug discovery biotech firm XtalPi has raised a $319 million Series C round. Its platform, built on quantum physics, artificial intelligence, and high-performance cloud computing algorithms, improves the industry’s research efficiency and drug development by making accurate predictions of the physicochemical and pharmaceutical properties of small-molecule candidates for drug design.
- Marketing automation startup SendinBlue raised $160 million in its funding round. Competing with Mailchimp, SendinBlue focuses on automation. Leveraging AI and machine learning, its email bot extracts relevant content from emails, prequalifies them, and handles specific actions to optimize response time.
- Sales enablement software Seismic has raised $92 million in a Series F funding round. Its platform leverages AI to automate parts of the sales and marketing cycle and personalize documents for sales reps. It also makes recommendations, such as the most popular materials to use in a sales funnel.
- Warehouse automation and industrial robots are usually an attractive deal for investors. Robotics startup Exotec has raised $90 million in its funding round. The company claims that what makes its system unique and agile is its ability to scale to meet storage and flow requirements independently, as well as to keep pace with business growth.
- Smart video intercom system ButterflyMX raised $35 million in a growth equity round. Basically, it enables people to open and manage doors from their smartphones. Though very useful for tenants, such systems have raised discussions about data collection, data use, and other privacy matters when face recognition is involved.
- Software analytics company Coralogix raised $25 million in a new funding round. The team built a machine learning engine that observes software logs in real time, automatically detects production problems, improves the stability of the system, and makes the maintenance process easier and less costly.
- Visual task automation startup Cogniac has raised $10 million. Feeds from machine vision cameras, security cameras, drones, smartphones, and other sources go to Cogniac’s AI platform. Using deep convolutional neural networks, it identifies objects and conditions of interest for tasks such as surface damage detection, real-time physical threat detection, and accident prediction, and can deliver alerts and notifications to customers.
- Autonomous truck startup Einride raised $10 million. Its Autonomous Electric Transport system allows the management of both individual self-driving pods and their fleets. The AI-enhanced freight mobility platform enables shippers and drivers to improve routing efficiency while lowering energy use.
- Digital writing and editing tool Writer has raised $5 million. Competing with Grammarly, it leverages natural language processing to help clients not only fix grammar but also differentiate between styles and build a brand’s own content guidelines and style guides. Currently, it is available only in English, as an NLP model that is built in depth for one language cannot be generalized to another language.