
TikTok Maker ByteDance Unveils Powerful New AI Video Generators

ByteDance, the tech titan behind TikTok, just fired a massive salvo in the AI video generation arms race as the company's cloud division unveiled two video generators: PixelDance and Seaweed.

The generators, unveiled at an event in Shenzhen last week, are still in private beta and are only available to a limited number of users. However, depending on the outcome of the US general election, the models could be publicly available next month, claimed YouTuber Tim Simmons, who focuses on AI tools for content creators.

“I spoke to [an anonymous source] about that and the best thing I can say is don't hold your breath until after November because…politics,” he said in a video review of the models.

The demo videos were first shown on a Chinese website, WeiXin.

PixelDance focuses on AI-driven character animation, creating 10-second videos with stunningly lifelike human movements. The model delivers fluid, natural performances – characters walk, turn, pick up objects and interact with their surroundings in ways previously unimaginable for AI.

But the real magic of PixelDance lies in its multi-shot capabilities. The model maintains remarkable consistency in appearance, proportions and scene details across different camera angles. This feature addresses a long-standing weakness in AI video generation: keeping visuals coherent between shots. Because of that difficulty, most current video generators focus on producing smooth motion within a single continuous shot.

PixelDance's camera controls are also comparable to other major models such as Pika, Runway's Gen 3 or Kling, making it a great addition for AI cinematography without compromise. With a single, simple text prompt, users can control complex camera movements such as 360-degree panning, zooming, tracking shots, etc.

For example, the prompt for the following video roughly translates to: In black and white, the camera rotates around the woman in sunglasses, moves from the side to the front, and finally focuses on a close-up of the woman's face.

On other models, the camera is controlled through the user interface, with buttons and sliders.

Seaweed, PixelDance's sibling, takes environmental generation and consistency to new heights. The model extends video production to a full 30 seconds – and can potentially be extended to almost 2 minutes of consistent footage.

ByteDance's timing couldn't be more strategic. There has been excitement in the AI video generation landscape since OpenAI's Sora was announced in February. Sora's alleged ability to generate up to 60 seconds of high-quality video from text prompts sent shockwaves across the tech world. However, Sora still hasn't been released to the public and other companies are scrambling to fill that space.

Kuaishou, another Chinese tech giant, made a splash in June with the launch of Kling AI, a model that many reviewers put at the top of their lists for AI video quality. Kling AI is integrated into Kuaishou's video editing app and can also create two-minute videos, surpassing even Sora's capabilities. The tool quickly reached over 2.6 million users, who generated a total of 27 million videos. However, it generates only single shots, making it comparable in quality to ByteDance's offering, but slightly less versatile in terms of features.

On Tuesday, Pika Labs – another OG in the generative video scene – released its new Pika 1.5 model, expanding the capabilities of its already good and widely used video generator. “With more realistic movements, large screen captures, and stunning Pika effects that break the laws of physics, there’s more to love about Pika than ever before,” Pika Labs said in an official tweet.

Pika 1.5 is available for testing on Pika's official website, and social media is already abuzz with videos showing how Pika can wildly alter scenes by smashing and exploding people and objects – or slicing them open to reveal the virtual cake inside.

ByteDance built its latest video generators on the Doubao family of foundation models, using a proprietary Diffusion Transformer (DiT) architecture. The architecture is believed to share similarities with the technology that powers Sora. The company claims to have optimized DiT for business applications, potentially lowering the cost barrier to creating AI videos.

The explosive growth of the Doubao AI family since its launch in May highlights the models’ potential. Daily token processing has skyrocketed from 120 billion to 1.3 trillion, a more than tenfold increase in usage. Doubao now processes over 50 million images and 850,000 hours of speech daily, as reported by KrASIA.

ByteDance's aggressive pricing strategy has driven this growth. Since May, the company has reduced its cost per 1,000 tokens to fractions of a cent, sparking a fierce price war between major players like Alibaba and Tencent.

ByteDance's strategy of leaning heavily on AI to power TikTok's algorithms is clearly paying off. TikTok and Douyin, its Chinese version, have been the fastest-growing social media platforms in recent years, but their ownership by a Chinese tech company worries Western countries.

It's unclear whether ByteDance will integrate its generative AI models into its apps – similar to how Meta integrates its Llama-based LLMs and generators into Instagram and WhatsApp – and even more uncertain whether US citizens will have access to them once they are released publicly.

Edited by Andrew Hayward
