xAI Launches Imagine v0.9 Video Generation Model: Completes Creation in 15 Seconds, Free for All Users
Abstract
xAI officially released its video generation AI model, Imagine v0.9, on October 7, 2025 (ET), making it freely available to all users. This marks the first major upgrade since the launch of Imagine v0.1 in July 2025. The new version brings significant improvements in visual quality, dynamic motion, and audio generation, and can produce audio effects synchronized with the action in the video.
Technological Breakthroughs and Core Features
Imagine v0.9 has undergone extensive upgrades in visual quality, motion effects, and audio generation. One of its most striking features is native, integrated audio-video generation: the model directly creates cinematic videos with synchronized sound effects, without the need for post-editing.
In official demo videos, generated dragons emit realistic roars as they open and close their mouths, robots speak with human-like lip-sync, and the model can even generate expressive singing.
Motion Control and Visual Effects
Imagine v0.9 has made significant strides in motion control: it can smoothly reproduce complex dynamic actions such as ski jumps, with no distortion from takeoff to landing. The model also supports dynamic camera effects such as intelligent focus shifting, which blurs a street scene as the camera position changes in order to highlight the main subject.
Generation Speed Advantage
Elon Musk stated on the social platform X that Imagine v0.9 can complete video generation within 15 seconds. Competitor OpenAI's Sora 2 reportedly may take one to two minutes to generate a single video, so if both figures hold, Imagine v0.9 would be roughly four to eight times faster.
Accessibility and Product Integration
Imagine v0.9 has been integrated into Grok's video generation feature and is available at no cost to all users, including those without a paid subscription. Users can access the feature at grok.com/imagine.
Musk also encouraged users to try Grok's voice-first interface; by enabling the "Voice Mode Open App" feature in settings, users can directly create videos, images, and text via voice, without needing to type.
Controversial Features Retained
Notably, Grok's video generation feature includes a "Spicy" mode, which allows the generation of content that might be blocked by Google's or OpenAI's video generation AI. This mode has been retained in v0.9, sparking discussion about deepfake risks and content moderation.
A significant upgrade in v0.9 is the ability for users to add custom voices to videos. Once this technology matures, users could upload photos of public figures and generate realistic videos of them saying specific things, which poses a deepfake risk.
Market Competition Landscape
The release of Imagine v0.9 comes amid intensifying competition in the AI video generation sector. OpenAI released its flagship video and audio generation model, Sora 2, on September 30, 2025, and xAI's update is seen as a direct response.
Unlike Sora 2, which uses an invitation-based system, Imagine v0.9 is open to all users at no cost, an approach that is attracting significant traffic.
Current Limitations
Testing has revealed several issues with Imagine v0.9 in practical use, including misinterpreted prompts, inconsistencies between video and audio, a lack of warnings about deepfake risks, and an inability to process Chinese-language input. Some users have also reported that the web version is temporarily not functioning correctly.
Despite these limitations, Imagine v0.9 still represents a significant advancement in AI video generation technology, offering content creators a fast and free video production tool. With continued iteration, the model's capabilities and quality are expected to improve further in the coming months.