Inside Meta AI’s Make-A-Video: The New Super Model that can Generate Videos from Textual Inputs
The new model builds on the principles of text-to-image methods to produce visually astonishing videos.
I recently started an AI-focused educational newsletter that already has over 125,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) ML-oriented newsletter that takes five minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:
Text-to-video (T2V) is considered the next frontier for generative artificial intelligence (AI) models. While the text-to-image (T2I) space is experiencing a revolution with models such as DALL-E, Stable Diffusion, and Midjourney, T2V remains a monumental challenge. Recently, researchers from Meta AI unveiled Make-A-Video, a T2V model able to create realistic short video clips from textual inputs.