The Sequence Scope: Meta AI’s Make-A-Video
Weekly newsletter with over 120,000 subscribers that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations.
📝 Editorial: Meta AI’s Make-A-Video
Generative models based on textual inputs are experiencing tremendous momentum. Models such as DALL-E, Midjourney, and Stable Diffusion have captured the imagination of not only the AI community but also artists, designers, gamers, and creative minds across many different domains. When thinking about the next milestone for text-to-image synthesis models, video creation is often cited at the top of the list. Obviously, video generation presents significant challenges compared to static images. For starters, video requires significantly more training resources, and there are very few high-quality datasets available that work with supervised methods. Also, the feature representation space of videos is considerably more complex than that of images. Just like text-to-image…