
Meta's text-to-video GenAI tool is official

A few months ago, OpenAI surprised the world with an incredible text-to-video AI tool called Sora. The concept is simple: tell the AI what video to create, and Sora will generate it for you. OpenAI has not yet released it to the general public, as it is still under development. I expected Sora to be released in early October, but that didn't happen. OpenAI's DevDay came and went, and Sora is still not available to users.

Meanwhile, Meta announced its own Sora alternative. It's called Movie Gen, and it's the third iteration of the company's generative AI products for image and video editing. Movie Gen, just like Sora, can generate video from a single text prompt, and it can also produce matching audio. You can also edit existing videos with text prompts.

Movie Gen is also not yet available to users, as Meta continues to develop it. But one day it could be added to the Meta AI capabilities available in the company's social apps, changing the way we use Instagram, WhatsApp, and Facebook forever.

Meta explained in a blog post that it has optimized its Movie Gen model for both text-to-image and text-to-video modes.

It is a 30 billion parameter transformer model that can create 16-second videos at 16 frames per second. “These models can reason about object motion, subject-object interactions, and camera motion, and learn plausible motion for a variety of concepts – making them state-of-the-art models in their category,” Meta said.

A Movie Gen prompt to create an AI video. Image source: Meta

Meta also demonstrated Movie Gen's ability to create personalized videos. You can simply upload a person's image and combine it with a text prompt to create an AI video “that includes the reference person and rich visual details informed by the text prompt.” Meta said its “model achieves state-of-the-art results when it comes to creating personalized videos that preserve human identity and movement.”

This is where I tell you that such features could be misused to create fake videos that go viral and spread misinformation. That's probably one of the reasons Meta isn't releasing Movie Gen at this time, even though the company doesn't mention abuse in the blog post.

Movie Gen can create videos with a real person's photo and a text prompt. Image source: Meta

In addition, Movie Gen can also be used to edit real videos. You would submit your own clip and then direct the AI to make edits. Movie Gen offers “advanced image editing, performing local edits such as adding, removing or replacing elements, as well as global changes such as background or style changes.” The model preserves the original content and only targets the pixels that need to be changed.

This also opens the door for abuse in the real world. Some people will use it to create funny clips for entertainment, while others may want to distort the truth through AI edits on real videos.

Movie Gen uses a real video and a text prompt to edit the clip. Image source: Meta

The last Movie Gen feature that Meta detailed is audio generation. Meta trained a 13 billion parameter audio generation model that can take a video of up to 45 seconds along with a text prompt and generate ambient noise, sound effects, and instrumental background music, all synced to the video.

Meta also published a paper detailing the Movie Gen technology. In human evaluations, Meta said its models outperformed competitors, including OpenAI's Sora.

As for when Movie Gen will be available, Meta says it will work with filmmakers and creatives to incorporate their feedback. Eventually, Movie Gen will make its way to Meta's social apps. The screenshots above are from Meta's AI-generated and AI-edited clips. You can check out all of the Movie Gen examples on Meta's blog.