Google Officially Unveils Gemini 3 Series: AI Reasoning Leaps Forward, Milestone Integration with Search Engine on Launch Day
Abstract
On November 18, 2025, Eastern Time, Google officially launched the Gemini 3 series of AI models—the company’s most intelligent models to date. Gemini 3 Pro topped the LMArena leaderboard with a score of 1501, outperforming competitors across all 19 benchmark tests. The new model achieves significant breakthroughs in reasoning, multimodal understanding, and code generation, while introducing innovative features such as Generative UI. Simultaneously, Google unveiled the new Gemini Agent functionality and the Antigravity code development platform, further solidifying its position in the AI race.
Core Performance Breakthroughs
Gemini 3 Pro achieved a groundbreaking score of 1501 on the industry-recognized LMArena leaderboard, surpassing its predecessor, Gemini 2.5 Pro, which scored 1451. According to data disclosed by Google, the model secured top scores in 19 out of 20 major benchmarks, demonstrating comprehensive technical superiority.
In academic evaluations, Gemini 3 Pro attained 37.5% accuracy on "Humanity's Last Exam"—a challenging test comprising 2,500 questions spanning over 100 disciplines—surpassing OpenAI’s GPT-5.1 (26.5%) by approximately 11 percentage points. On the PhD-level scientific benchmark GPQA Diamond, Gemini 3 Pro scored 91.9%, exceeding the previous record of 87.6% held by GPT-5.1.
In mathematical reasoning, the model set a new record of 23.4% on MathArena Apex. For multimodal reasoning, Gemini 3 Pro scored 81% on MMMU-Pro and 87.6% on Video-MMMU, establishing clear leadership in visual comprehension.
Exceptional Performance of Deep Think Mode
Google also announced the launch of Gemini 3 Deep Think mode—a version optimized for deep reasoning—which will be rolled out to Google AI Ultra subscribers ($249.99/month) within the coming weeks. Deep Think achieved 41.0% accuracy on "Humanity's Last Exam" and 93.8% on GPQA Diamond.
Most notably, Gemini 3 Deep Think achieved an unprecedented 45.1% on the ARC-AGI benchmark (with code execution enabled), while Gemini 3 Pro reached 31.1%. In contrast, the second-ranked model, GPT-5.1 Thinking (High), scored only 17.6%, highlighting a performance gap of 2–3 times. ARC-AGI is widely regarded as a key indicator of general AI intelligence and the ability to solve novel problems.
Product Integration and Availability
Starting November 18, Gemini 3 Pro has been globally available to all users. Users can access the new model via the Gemini app, AI Mode, and AI Overviews by selecting the "Thinking" mode in the model selector. Subscribers to Google AI Plus, Pro, and Ultra plans enjoy higher usage limits.
Developers can access Gemini 3 Pro through Google AI Studio, the Gemini API, and Vertex AI. API pricing is set at $2 per million input tokens and $12 per million output tokens (for prompts under 200,000 tokens), reflecting a modest increase compared to Gemini 2.5 Pro.
Innovation in Generative UI
Gemini 3 introduces the concept of "Generative UI"—interactive interfaces dynamically generated by the model in real time. The system can automatically design and customize complete user experiences—including websites, games, tools, and applications—based on user prompts.
Two experimental features are already rolling out in the Gemini app: Visual Layout creates immersive magazine-style views with photos and modules, while Dynamic View designs and codes fully customized interactive responses for each prompt. For example, explaining the microbiome to a 5-year-old versus an adult requires different content and functionality, and the system automatically adapts the interface accordingly.
Gemini Agent Empowers Automated Tasks
Gemini Agent is an experimental feature initially available to Google AI Ultra subscribers. It handles multi-step tasks directly within the Gemini app, leveraging Gemini 3’s advanced reasoning capabilities, real-time web browsing, and tool integration—including Canvas, Deep Research, Gmail, and Google Calendar.
Users can ask Gemini Agent to “organize my inbox,” prompting it to group related emails and offer quick options to archive or mark as read. Another example: “Using details from my emails, book a midsize SUV under $80 per day for next week’s trip.” Gemini would locate flight information, research rental options within budget, and prepare the booking—all while seeking user confirmation before executing critical actions like purchases or sending emails.
Antigravity Development Platform
Google simultaneously launched Google Antigravity, a new AI agent development platform enabling developers to code “at a higher, task-oriented level.” This integrated development environment combines a ChatGPT-like prompt window, a command-line interface, and a browser window to showcase the real-world effects of code changes.
Josh Woodward, Vice President of Product at Google, described Gemini 3 as the company’s “best-ever ambient coding model.” Agents can autonomously plan and execute complex end-to-end software tasks across editors, terminals, and browsers while validating their own code.
Market Positioning and Competitive Landscape
Gemini 3’s release comes less than a week after OpenAI launched GPT-5.1 and just two months after Anthropic released Claude Sonnet 4.5, underscoring the intense pace of competition in cutting-edge AI model development.
In a blog post, Google CEO Sundar Pichai wrote: “In just two years, AI has evolved from simply reading text and images to truly ‘reading the room.’” He announced, “Starting today, we’re deploying Gemini at Google scale.”
Data shows that the Gemini app currently has 650 million monthly active users, while AI Overviews reaches 2 billion monthly users. By comparison, OpenAI reported in August that ChatGPT had reached 700 million weekly active users. Over 70% of Google Cloud customers use its AI services, and 13 million developers have already built with its generative models.
Response Quality Optimization
According to Demis Hassabis, CEO of Google DeepMind, AI responses powered by Gemini 3 will “replace clichés and flattery with genuine insight—telling you what you need to hear, not what you want to hear.” This shift addresses widespread criticism of current AI chatbots for being overly deferential.
Google emphasizes that Gemini 3 Pro delivers responses that are “smart, concise, and direct,” with significantly improved contextual understanding and intent recognition, enabling users to “get what they need with fewer prompts.”
Third-Party Integrations
Gemini 3 is already supported by multiple third-party developer tools, including Cursor, GitHub, JetBrains, Manus, and Cline. Nik Pash, Head of AI at Cline, stated: “Cline is using Gemini 3 to enable autonomous code generation within developers’ IDEs. Gemini 3 Pro can handle complex, long-horizon tasks across entire codebases, maintaining context during multi-file refactoring, debugging sessions, and feature implementation. It uses long context more effectively than Gemini 2.5 Pro and solves issues that plague other leading models.”
Future Outlook
Google stated it will soon release additional models in the Gemini 3 series, empowering users to accomplish even more with AI. The company is also extending one year of free Google AI Pro access to U.S. college students, ensuring they can leverage top-tier Google AI services, including Gemini 3.
With the launch of Gemini 3, Google’s full-stack approach—spanning leading infrastructure, world-class research and models, and products reaching billions globally—is accelerating the deployment of advanced capabilities to market. This AI arms race is advancing at an unprecedented pace, and the release of Gemini 3 is undoubtedly one of the most significant AI events of 2025.