OpenAI Launches Sora 2 Model and Social App of the Same Name: AI Video Generation Ushers in the Era of Audio-Visual Synchronization
Abstract
OpenAI officially released its next-generation AI video generation model, Sora 2, on September 30, 2025, along with an accompanying iOS social app, Sora. Sora 2 is the first Sora release to generate audio synchronized with the video, including dialogue and sound effects. The app uses a TikTok-like short-video feed, is currently available only in the United States and Canada, and is invitation-only.
Technical Breakthrough: Significant Improvement in Physical Accuracy
Sora 2 has achieved significant improvements in physical simulation, realism, and controllability. Unlike earlier video models, which often suffered from object deformation and violations of physical laws, Sora 2 can handle complex action scenes such as gymnastics routines, skateboarding tricks, and diving, adhering to real-world physical rules.
OpenAI research team members Bill Peebles, Rohan Sahai, and Thomas Dimson demonstrated the model's capabilities in a YouTube livestream. The demo videos included scenes like beach volleyball matches, skateboarding trick performances, and gymnastics routines, showcasing unprecedented fluidity and realism.
Audio-Video Synchronization: Addressing a Key Shortcoming
The most striking update is that Sora 2 supports, for the first time, AI-generated audio that matches the video footage, including synchronized dialogue and sound effects. This addresses a significant shortcoming of the original Sora model. When OpenAI first unveiled Sora in early 2024, it caused a sensation in the industry, but the model was not made available to the public until December 2024. During this period, competitors such as Runway, Luma, and Kling successively launched video models with audio generation capabilities.
Social App: Challenging Short Video Platforms
Alongside Sora 2, OpenAI released an iOS app, also named Sora, built around an algorithmically recommended short-video feed. The app's most distinctive feature is "Cameo," which lets users insert their own likeness, and the likenesses of friends who have given permission, into AI-generated videos. OpenAI stated that it has established strict identity protection measures to prevent the unauthorized use of others' likenesses.
The app currently operates on a free model, which OpenAI says is to allow users to freely explore its features. The only planned charge is for additional video generation during peak hours.
Copyright Disputes Emerge
On the first day of the Sora app's release, users generated videos featuring copyrighted game characters like Mario and Pikachu, raising concerns among copyright experts. Mark McKenna, a law professor at UCLA, pointed out that if OpenAI allows the output of copyrighted content without an opt-out mechanism for users, this practice might not comply with copyright law.
Furthermore, some users generated videos depicting OpenAI CEO Sam Altman stealing computer parts from a store, highlighting the risk that the technology can be used to create false content. To address these issues, OpenAI stated that all videos generated through the Sora app or website will carry a moving watermark and be marked in their metadata as AI-generated.
User Creation Boom and Parody Phenomenon
Following the release of Sora 2, a wave of AI video creation swept across Chinese social media. Users generated videos on various themes, including historical dramas, modern urban dramas, and sports events. Some netizens even created fictional scenes of the Chinese men's national football team winning the World Cup, as well as various parody videos targeting Sam Altman.
Market Positioning: Model + Product Combination Strategy
Analysts point out that OpenAI's strategy has shifted from pure model competition to a "model + product" combination. Once a technology crosses the usability threshold, OpenAI quickly launches accompanying applications to lock in users with product barriers. This strategy has previously been validated with ChatGPT and the code generation tool Codex.
The Sora app has become the most downloaded application in the Photo & Video category of the iOS App Store. OpenAI stated that it hopes to expand the service to more countries and regions as soon as possible.
Readers using this technology should be mindful of complying with relevant laws and regulations, and respecting others' privacy and intellectual property rights.