Alibaba-NLP/WebAgent

Please refer to the latest official releases for up-to-date information.

An intelligent web agent system developed by Alibaba Tongyi Lab, comprising three components: WebWalker, WebDancer, and WebSailor, focusing on autonomous information search and web navigation tasks.

MIT · Python · 3.1k · Alibaba-NLP/WebAgent · Last Updated: 2025-07-10

WebAgent - Intelligent Web Agent System

Project Overview

WebAgent is an intelligent web agent system developed by Alibaba Tongyi Lab for autonomous information search and web navigation. The project integrates three components (WebWalker, WebDancer, and WebSailor) to build agents that can carry out complex information retrieval and web traversal tasks on their own.

Key Components

1. WebWalker (ACL 2025)

Functionality: A benchmarking tool for Large Language Models (LLMs) in web traversal tasks.
Key Features:
- Provides a standardized web traversal evaluation framework.
- Supports multi-agent collaborative information search.
- Offers quantitative evaluation metrics for LLMs' web navigation capabilities.

2. WebDancer (Preprint 2025)

Functionality: An end-to-end training framework for autonomous information search agents.
Key Features:
- Native intelligent search reasoning model, utilizing the ReAct framework.
- Enables autonomous information search agents and deep research-type models.
- Four-stage training paradigm:
  1. Browsing Data Construction
  2. Trajectory Sampling
  3. Supervised Fine-tuning (for effective cold start)
  4. Reinforcement Learning (to improve generalization capability)
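The hand-off from trajectory sampling (stage 2) to supervised fine-tuning (stage 3) can be sketched as a filtering step. This is a minimal illustration, not the project's implementation: it assumes that sampled trajectories are kept only when their final answer matches a reference answer, and the names `Trajectory`, `sample_and_filter`, and `rollout` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Trajectory:
    """One sampled rollout: the question, the agent's steps, and its final answer."""
    question: str
    steps: List[str] = field(default_factory=list)
    final_answer: str = ""


def sample_and_filter(
    qa_pairs: List[Tuple[str, str]],
    rollout: Callable[[str], Trajectory],
    samples_per_question: int = 4,
) -> List[Trajectory]:
    """Sample several rollouts per question and keep only those whose final
    answer matches the reference (an assumed filtering rule)."""
    kept: List[Trajectory] = []
    for question, reference in qa_pairs:
        for _ in range(samples_per_question):
            traj = rollout(question)
            # Case- and whitespace-insensitive exact match; a real pipeline
            # would likely use a more tolerant answer checker.
            if traj.final_answer.strip().lower() == reference.strip().lower():
                kept.append(traj)
    return kept
```

The trajectories that survive the filter would then serve as SFT data for the cold start, with reinforcement learning applied afterwards.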

3. WebSailor

Functionality: Extends the functional scope of web agents.
Key Features: Provides broader web operation and navigation capabilities.

Technical Features

Data-Centric Approach

Trajectory-Level Supervised Fine-tuning: Trains models using precise trajectory data.
Reinforcement Learning Integration: Employs DAPO (Decoupled Clip and Dynamic sAmpling Policy Optimization), a policy-optimization algorithm for RL training of language models.
Scalable Training Pipeline: Supports both SFT (Supervised Fine-tuning) and RL (Reinforcement Learning) training modes.
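To make the RL step more concrete, the sketch below shows three ingredients of DAPO as published: group-normalized advantages, dynamic sampling (dropping groups whose rollouts all received the same reward), and a clipped surrogate with separate lower and upper bounds. It is a simplified scalar illustration, not the project's training code. The function names are hypothetical, the clip values are illustrative defaults, and the choice of population standard deviation is an assumption.

```python
from statistics import mean, pstdev
from typing import List


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of rollouts for the same question."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def keep_group(rewards: List[float]) -> bool:
    """Dynamic sampling: drop groups where every rollout got the same reward,
    because their normalized advantages are all zero and carry no gradient."""
    return len(set(rewards)) > 1


def clipped_term(ratio: float, advantage: float,
                 eps_low: float = 0.2, eps_high: float = 0.28) -> float:
    """Per-token surrogate with decoupled lower and upper clip bounds."""
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped * advantage)
```

With binary correctness rewards, `keep_group` discards questions that the current policy always solves or never solves, so each training batch contains only groups with a usable learning signal.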

Autonomous Learning Capabilities

Intelligent agents can autonomously acquire search and reasoning skills.
Supports long-span, multi-step complex reasoning tasks.
Achieves end-to-end processing for web traversal, information search, and Q&A.

Performance

According to project documentation, WebDancer reports the following Pass@3 scores on standard benchmarks:

GAIA Benchmark: Achieved a Pass@3 score of 61.1%.
WebWalkerQA Benchmark: Achieved a Pass@3 score of 54.6%.
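For readers unfamiliar with the metric, the sketch below computes Pass@k under one common definition: a question counts as solved if at least one of its first k attempts is correct. The source does not state which Pass@k variant the project uses, so this definition and the function name `pass_at_k` are assumptions.

```python
from typing import Dict, List


def pass_at_k(results: Dict[str, List[bool]], k: int = 3) -> float:
    """Fraction of questions with at least one correct answer among the first k attempts."""
    if not results:
        raise ValueError("results must contain at least one question")
    solved = sum(any(attempts[:k]) for attempts in results.values())
    return solved / len(results)


# Hypothetical per-question outcomes, one boolean per attempt.
example = {
    "q1": [False, True, False],
    "q2": [False, False, False],
    "q3": [True, True, True],
    "q4": [False, False, False, True],  # the fourth attempt is ignored when k=3
}
print(pass_at_k(example, k=3))
```

Under this definition a Pass@3 score is always at least as high as Pass@1 on the same runs, which matters when comparing figures across reports.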

Application Scenarios

Supported Task Types

Web Traversal: Intelligent navigation and page exploration.
Information Search: Autonomous information collection and organization.
Question Answering Systems: Complex Q&A based on web content.
Long-Span Reasoning: Multi-step complex logical reasoning tasks.

Demo Environments

The project provides multiple demo environments:

WebWalkerQA Demo
GAIA Benchmark Demo
Daily Usage Scenario Demo

Technical Architecture

Training Paradigm

1. Browsing Data Construction → 2. Trajectory Sampling → 3. Supervised Fine-tuning → 4. Reinforcement Learning

Core Technology Stack

Base Framework: ReAct (Reasoning and Acting)
Training Methods: SFT + RL (Supervised Fine-tuning + Reinforcement Learning)
RL Algorithm: DAPO (Decoupled Clip and Dynamic sAmpling Policy Optimization)
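The ReAct pattern named above alternates model reasoning with tool calls and feeds each tool result back as an observation. The loop below is a minimal sketch of that pattern, not WebAgent's actual code. The `Action: name[argument]` text format, the `finish` action, and the names `react_loop`, `policy`, and `tools` are all assumptions chosen for illustration.

```python
import re
from typing import Callable, Dict, List, Optional

# Assumed action syntax: "Action: tool_name[argument]"
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]", re.DOTALL)


def react_loop(
    question: str,
    policy: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Optional[str]:
    """Run a Thought/Action/Observation loop until the policy emits finish[...]."""
    transcript: List[str] = [f"Question: {question}"]
    for _ in range(max_steps):
        output = policy("\n".join(transcript))  # model writes a thought and an action
        transcript.append(output)
        match = ACTION_RE.search(output)
        if match is None:
            break  # no parsable action, so stop early
        name, arg = match.group(1), match.group(2)
        if name == "finish":
            return arg  # the argument of finish[...] is the final answer
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool: {name}"
        transcript.append(f"Observation: {observation}")
    return None  # step budget exhausted or no action found
```

In practice `policy` would wrap a language model call and `tools` would map names such as a search or page-visit tool to real functions. Here both are plain callables so the loop can be exercised with scripted stand-ins.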

Conclusion

WebAgent combines a benchmark (WebWalker), an agent training framework (WebDancer), and WebSailor in one project for autonomous information search and web navigation. Its data-centric training approach, supervised fine-tuning followed by reinforcement learning, targets complex, multi-step tasks in web environments. WebWalker was accepted at ACL 2025, and the project aims to provide a technical foundation for practical applications.