The Sequence Scope: Multi-Modal Learning is Becoming Real
Weekly newsletter with over 100,000 subscribers that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations.
📝 Editorial: Multi-Modal Learning is Becoming Real
One of the cognitive marvels of the human brain is its ability to simultaneously process information from different sensory inputs such as speech, touch, and vision in order to accomplish a specific task. From infancy, we learn to build representations of the world from many different modalities: objects, sounds, verbal descriptions, and more. Recreating this ability to learn from several modalities at once has long been a goal of ML, but most of those efforts remained confined to research exercises. For decades, most supervised ML models have been highly optimized for a single representation of the input data. That’s rapidly changing now. Multimodal ML is becoming a reality.