DeepMind Launches SIMA 2: Breakthrough in Reasoning and Autonomous Learning by a Gemini-Powered AI Agent in Virtual Worlds

November 15, 2025
Google DeepMind
5 min

Abstract

On November 13, 2025 (Eastern Time), Google DeepMind launched SIMA 2 (Scalable Instructable Multiworld Agent), a next-generation AI agent powered by the Gemini model. This system not only executes instructions within 3D virtual worlds but also possesses reasoning, conversational, and self-learning capabilities, marking a significant advancement in Artificial General Intelligence (AGI) research. SIMA 2 demonstrates substantially improved task-completion rates compared to its predecessor and can operate effectively in gaming environments it has never encountered during training, laying the groundwork for future robotics technologies.


Technical Breakthrough: From Instruction Following to Reasoning and Decision-Making

The first version of SIMA was introduced in March 2024, capable of performing over 600 basic skills—such as "turn left," "climb a ladder," and "open the map"—across multiple commercial video games. The system operates by "watching" the screen and using virtual keyboard and mouse inputs, mimicking how human players interact with games.
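The observe-then-act interface described above can be sketched as a minimal loop: the agent receives only a screenshot plus a language instruction and emits human-like keyboard and mouse events, with no privileged access to the game's internals. The class names and keyword rules below are illustrative assumptions; the real agent maps pixels to actions with a learned model, not hand-written rules.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Observation:
    """A single video frame plus the current language instruction."""
    frame: bytes          # raw screenshot of the game window
    instruction: str      # e.g. "climb the ladder"

@dataclass
class Action:
    """A human-like input event: no game API, only keys and mouse."""
    kind: str             # "key" or "mouse"
    payload: str          # e.g. "press:W" or "move:-200,0"

def act(obs: Observation) -> List[Action]:
    """Toy policy mapping a few instruction keywords to input events."""
    text = obs.instruction.lower()
    if "turn left" in text:
        return [Action("mouse", "move:-200,0")]   # pan the camera left
    if "climb" in text:
        return [Action("key", "press:W")]          # move forward/up
    if "open the map" in text:
        return [Action("key", "press:M")]
    return []                                      # unrecognized instruction
```

The point of the sketch is the interface, not the policy: because input and output are the same as a human player's, the same agent can be dropped into any game with a screen and a keyboard.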

SIMA 2 achieves a qualitative leap by integrating the Gemini 2.5 Flash-Lite model. Joe Marino, a Senior Research Scientist at DeepMind, described it during a media briefing as a "step-change improvement" over its predecessor. Rather than merely responding to commands, the system now understands high-level objectives, performs complex reasoning, and explains its intended actions and execution steps to users.

In testing, SIMA 2 significantly outperformed its predecessor. On complex tasks, SIMA 1 achieved a success rate of only 31%, compared to 71% for human players. SIMA 2 dramatically narrowed this gap, approaching human-level performance across multiple evaluation tasks.

Cross-Environment Generalization

One of SIMA 2’s most remarkable features is its exceptional generalization capability. The system functions not only in the eight commercial games it was trained on—including No Man’s Sky, Valheim, and Goat Simulator 3—but also successfully completes tasks in entirely unseen game environments.

In tests involving the Viking survival game ASKA and MineDojo (a research-oriented implementation of Minecraft), SIMA 2 demonstrated powerful transfer learning abilities. It could apply the concept of "mining" learned in one game to a "harvesting" scenario in another—a form of conceptual transfer that serves as a foundational element for human-like cognition.
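One way to picture this kind of conceptual transfer is a shared abstract skill sitting behind different game-specific verbs. The table and function below are a deliberately simplified stand-in for illustration, not DeepMind's actual representation:

```python
# Toy concept-transfer table: concrete verbs from different games map
# to one shared abstract skill (purely illustrative).
ABSTRACT_SKILL = {
    "mine": "extract resource",      # e.g. mining in MineDojo
    "harvest": "extract resource",   # e.g. harvesting in ASKA
    "chop": "extract resource",
}

def can_transfer(known_verb: str, new_verb: str) -> bool:
    """A learned skill transfers when both verbs share an abstraction."""
    known = ABSTRACT_SKILL.get(known_verb)
    return known is not None and known == ABSTRACT_SKILL.get(new_verb)
```

Under this toy model, an agent that has learned to "mine" can recognize "harvest" as the same underlying skill in a new environment, while an unrelated verb yields no transfer.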

Even more impressively, when combined with Genie 3—another DeepMind research achievement capable of generating real-time 3D simulated worlds from a single image or text prompt—SIMA 2 can rapidly orient itself, interpret instructions, and execute meaningful actions within entirely new virtual environments.

Self-Improvement Mechanism

SIMA 2's most consequential innovation is its capacity for self-directed learning. Unlike SIMA 1, which relied entirely on human gameplay data for training, SIMA 2 uses human demonstrations only as a starting point before transitioning into an autonomous learning mode.

The system employs another Gemini model to generate novel tasks, while an independent reward model evaluates the agent’s performance. Using this self-generated experience data, SIMA 2 learns from its own mistakes and continuously improves through trial and error—essentially teaching itself new behaviors guided by AI-generated feedback rather than human supervision.
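The loop described above (a task proposer, an independent reward model, and an agent updating from its own rollouts) can be sketched in a few lines. Every name and number here, including the scalar `skill` variable and the toy reward rule, is an illustrative assumption, not DeepMind's training procedure:

```python
import random

random.seed(0)  # deterministic toy run

def generate_task() -> str:
    """Stand-in for the Gemini model that proposes novel tasks."""
    return random.choice(["chop wood", "mine ore", "build a shelter"])

def attempt(task: str, skill: float) -> list:
    """Toy rollout: higher skill produces longer successful trajectories."""
    steps = int(10 * skill) + random.randint(0, 2)
    return ["step"] * steps

def reward_model(task: str, trajectory: list) -> float:
    """Stand-in for the independent reward model: scores an attempt 0..1."""
    return min(1.0, len(trajectory) / 10)

def self_improvement_loop(iterations: int = 50) -> float:
    """Generate tasks, score attempts, and update from AI-generated
    feedback rather than human supervision."""
    skill = 0.1
    for _ in range(iterations):
        task = generate_task()
        trajectory = attempt(task, skill)
        reward = reward_model(task, trajectory)
        skill += 0.05 * (reward - skill)   # drift toward rewarded behavior
    return skill
```

The key design property the sketch preserves is that no human appears inside the loop: task generation, evaluation, and improvement are all driven by models.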

This self-improvement loop paves the way for future AI development, enabling agents to learn and evolve with minimal human intervention, positioning them as open-ended learners in the field of embodied AI.

Multimodal Interaction Experience

SIMA 2 supports multiple interaction modalities: users can control the agent via text chat, voice conversation, or by drawing directly on the game screen. The system understands instructions in various languages and can even correctly interpret emoji to carry out tasks.
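A minimal sketch of this multimodal input handling is a dispatcher that collapses every modality into a single textual goal before it reaches the agent. The modality names, the emoji table, and the helper function are all assumptions made for illustration:

```python
# Assumed emoji-to-goal table, purely illustrative.
EMOJI_GOALS = {"🏠": "build a house", "🪓": "chop down a tree"}

def transcribe(audio) -> str:
    """Toy stand-in for a speech-to-text step."""
    return str(audio)

def normalize_instruction(payload, modality: str) -> str:
    """Collapse text / voice / on-screen sketch / emoji into one goal string."""
    if modality == "text":
        return payload.strip()
    if modality == "voice":
        return transcribe(payload)
    if modality == "sketch":
        # payload: coordinates the user drew on the game screen
        return f"go to the marked location {payload}"
    if modality == "emoji":
        return EMOJI_GOALS.get(payload, "unknown goal")
    raise ValueError(f"unsupported modality: {modality}")
```

Normalizing everything to language early keeps the downstream agent simple: it only ever has to reason over one instruction format.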

Jane Wang, a Senior Research Scientist at DeepMind, told TechCrunch that SIMA 2's applications extend far beyond gaming. The research team views this work as a critical step toward developing more general-purpose agents and advancing real-world robotics.

A Bridge to Robotics

DeepMind considers SIMA 2 a key enabler for next-generation agents capable of performing open-ended tasks in environments far more complex than web browsers. In the long term, this technology aims to power real-world robotic systems.

Frederic Besse, Senior Research Engineer, explained during the media briefing that SIMA 2 should be viewed as a high-level decision-maker rather than a low-level motion controller. “From a robotics perspective, it addresses ‘what to do and why,’ not ‘how to control joint torques.’” This layered architecture mirrors how many labs currently build systems: a planning layer on top, with perception and control layers underneath.
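The layered architecture Besse describes can be sketched as a planner that decides what to do and why, stacked on a separate controller that decides how. Both functions below are toy stand-ins, not actual DeepMind components:

```python
from typing import List

def plan(goal: str) -> List[str]:
    """High-level decision layer (the role assigned to SIMA 2):
    break a goal into subgoals. Toy rules for illustration."""
    if goal == "fetch water":
        return ["find the river", "fill the bucket", "return to camp"]
    return [goal]  # trivial goals pass through unchanged

def control(subgoal: str) -> List[str]:
    """Low-level layer ('how to control joint torques' in the quote):
    a separate controller turns each subgoal into motor commands."""
    return [f"motor_cmd<{subgoal}:{i}>" for i in range(2)]

def execute(goal: str) -> List[str]:
    """Planner on top, perception/control underneath, as in the stack."""
    commands: List[str] = []
    for subgoal in plan(goal):
        commands.extend(control(subgoal))
    return commands
```

The separation is the point: the planning layer can be swapped onto different robot bodies as long as each body supplies its own `control` implementation.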

The skills learned by SIMA 2—navigation, tool use, and collaborative task execution—are precisely the foundational building blocks required for future real-world robotic companions.

Current Limitations and Future Directions

Despite significant progress, SIMA 2 still faces challenges. The system struggles with long-horizon tasks that require extensive multi-step reasoning and goal verification. Its interaction memory is also relatively short, because the context window is kept small to preserve low-latency responsiveness. Precise low-level operations via virtual keyboard and mouse interfaces, along with robust visual understanding in complex 3D scenes, remain open research challenges for the broader field.

Development Pathway

DeepMind emphasized its commitment to responsible development of SIMA 2. The team worked closely with its Responsible Development and Innovation group to release SIMA 2 as a limited research preview, granting early access only to a select group of researchers and game developers. This approach aims to gather critical feedback and interdisciplinary perspectives while continuing to deepen understanding of potential risks and appropriate mitigation strategies as this new domain is explored.

According to official information, a full technical report will be released shortly. The project received collaborative support from multiple game studios, including Coffee Stain, Hello Games, and Thunderful Games, and was trained and evaluated across several commercial titles such as No Man’s Sky, Valheim, Goat Simulator 3, and Teardown.

The launch of SIMA 2 marks a pivotal shift in AI research, from specialized systems toward general-purpose agents, and lays the groundwork for future digital assistants and physical robots.