
Inside Meta AI’s Make-A-Video: The New Super Model That Can Generate Videos from Textual Inputs

The new model builds on the principles of text-to-image methods to produce visually astonishing videos.

Jesus Rodriguez
3 min read · Oct 3, 2022
Image Credit: Meta AI

I recently started an AI-focused educational newsletter that already has over 125,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) ML-oriented newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:

Text-to-video (T2V) is considered the next frontier for generative artificial intelligence (AI) models. While the text-to-image (T2I) space is experiencing a revolution with models like DALL-E, Stable Diffusion, or Midjourney, T2V remains a monumental challenge. Recently, researchers from Meta AI unveiled Make-A-Video, a T2V model able to create realistic short video clips from textual inputs.
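To give a flavor of what “building on text-to-image principles” can mean in practice, the Make-A-Video paper describes factorizing spatiotemporal layers into a pretrained spatial component plus a new temporal one. The sketch below is a minimal, hypothetical PyTorch illustration of that general “pseudo-3D” convolution idea, not Meta’s actual code: the class and parameter names are my own, and the temporal convolution is initialized as the identity so the network initially behaves like the image model it starts from.

```python
# A minimal sketch of the "pseudo-3D" factorization idea: a pretrained
# spatial (2D) convolution applied per frame, followed by a newly added
# temporal (1D) convolution that mixes information across frames.
# Illustrative only; names and structure are hypothetical, not Meta's code.
import torch
import torch.nn as nn


class Pseudo3DConv(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # Spatial conv: the part whose weights could come from a T2I model.
        self.spatial = nn.Conv2d(
            channels, channels, kernel_size, padding=kernel_size // 2
        )
        # Temporal conv: new, operates along the frame axis.
        self.temporal = nn.Conv1d(
            channels, channels, kernel_size, padding=kernel_size // 2
        )
        # Identity initialization: at the start of training, the block
        # reduces to the pretrained per-frame image model.
        nn.init.dirac_(self.temporal.weight)
        nn.init.zeros_(self.temporal.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, height, width)
        b, c, f, h, w = x.shape
        # Apply the 2D conv to every frame independently.
        x = x.permute(0, 2, 1, 3, 4).reshape(b * f, c, h, w)
        x = self.spatial(x)
        # Apply the 1D conv along the frame axis at each spatial location.
        x = x.reshape(b, f, c, h, w).permute(0, 3, 4, 2, 1).reshape(b * h * w, c, f)
        x = self.temporal(x)
        return x.reshape(b, h, w, c, f).permute(0, 3, 4, 1, 2)


# Quick shape check on a tiny random "video": 2 clips, 8 frames of 16x16.
video = torch.randn(2, 64, 8, 16, 16)
print(Pseudo3DConv(64)(video).shape)  # torch.Size([2, 64, 8, 16, 16])
```

The appeal of this kind of factorization is that it lets a model reuse everything a text-to-image network has already learned about appearance, and only learn motion from video data.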
