
Meta's Movie Gen AI video generator can create realistic videos, music included

Meta's AI journey was always going to lead to the emerging field of AI video. Now the Mark Zuckerberg-led company has Movie Gen, another video generator that can create realistic videos from a short text prompt. Meta claims it's just as useful for Hollywood as it is for the average Instagrammer, even though it's not available to anyone outside of Meta. Movie Gen can also generate audio, making it the most powerful deepfake generator we've seen to date.

In a blog post, Meta showed some example videos, including a happy baby hippo swimming underwater, somehow floating just below the surface and apparently having no trouble holding its breath. Other videos show penguins in “Victorian” outfits with sleeves and skirts too short to be true to the period. In another, a woman DJs next to a cheetah that seems too absorbed in the beat to pose any danger.

Meta says it used the prompt: “A fluffy koala bear is surfing. It has gray and white fur and a round nose. The surfboard is yellow. The koala bear holds on to the surfboard with his paws. The koala bear's facial expression is focused. The sun is shining.” Gif: Meta

Everyone is getting into the AI-generated video space. Already this year, Microsoft's VASA-1 and OpenAI's Sora promised “realistic” videos generated from simple text prompts. Although Sora was teased back in February, it hasn't seen the light of day yet. Meta's Movie Gen offers a few more features than the competition, including editing existing videos with a text prompt, creating videos based on an image, and adding AI-generated sound to the finished clip.

The video editing suite seems particularly novel. It works on both generated videos and real-world recordings. Meta claims that its model “retains the original content” while adding elements to the footage, be it backgrounds or outfits for the scene's main characters. Meta showed how you can also take pictures of people and insert them into generated films.

Meta already has music and sound generation models, but the social media giant showed off some examples from the 13B-parameter audio generator, which adds sound effects and soundtracks to videos. A text input can be as simple as “rustling leaves and snapping twigs” to score a generated video of a snake slithering across the forest floor. The audio generator is currently limited to 45 seconds of output, so it can't provide sound for an entire film. At least, not yet.

And no, sorry, you can't use it yet. Chris Cox, Meta's chief product officer, wrote in Threads: “We're not ready to release this as a product any time soon – it's still expensive and the generation time is too long.”


In its white paper on Movie Gen, Meta said the software suite consists of several foundation models. The company's largest video model is a 30B-parameter transformer with a maximum context length of 73,000 video tokens. The audio generator is a 13B-parameter foundation model that handles both video-to-audio and text-to-audio generation.

It's hard to compare this to video generators from the biggest AI companies, especially since OpenAI claims Sora uses “data called patches, each of which resembles a token in GPT.” Meta is one of the few major companies still publishing research alongside its new AI tools, a practice that has fallen by the wayside amid the commercialization of AI. Still, Meta's white paper doesn't offer much insight into where the company got its training data for Movie Gen. In all likelihood, part of the dataset comes from Facebook users' videos. Meta also uses the photos you take with Meta Ray-Ban smart glasses to train its AI models.

You can't use Movie Gen yet. In the meantime, other AI video generators, such as RunwayML's Gen-3, offer a limited number of tokens for creating short clips before you have to start paying. A report from 404 Media earlier this year suggested that Runway trained its AI on thousands of YouTube videos and, like most AI startups, never asked permission before scraping that content.

Meta said it has worked closely with filmmakers and video producers to develop this model and will continue to do so as work on Movie Gen progresses. Reports from earlier this year suggest studios are already cozying up to AI companies. Indie darling A24 has recently been working with AI-focused VC firms, some of which have ties to OpenAI. Meanwhile, Meta is reportedly in talks with Hollywood stars like Judi Dench and Awkwafina about using their voices for future AI projects.