Home
Login

II-Agent is an open-source intelligent assistant framework designed to simplify and enhance workflows across multiple domains, capable of independently executing complex tasks.

Apache-2.0Python 2.5kIntelligent-Internetii-agent Last Updated: 2025-06-25

II-Agent Project Detailed Introduction

Project Overview

II-Agent is an open-source intelligent assistant designed to simplify and enhance workflows across multiple domains. It represents a significant advancement in how we interact with technology—moving from passive tools to intelligent systems capable of independently executing complex tasks.

Project Address: https://github.com/Intelligent-Internet/ii-agent

Core Features

II-Agent is built around providing a proxy interface for Anthropic Claude models, offering the following functionalities:

  • CLI Interface: Direct command-line interaction
  • WebSocket Server: Support for modern React frontends
  • Google Cloud Vertex AI Integration: Access to Anthropic models via API

Application Areas and Functions

Domain II-Agent Functionality
Research & Fact-Checking Multi-step web searches, triangulation of information sources, structured note-taking, rapid summarization
Content Generation Blog and article drafts, lesson plans, creative essays, technical manuals, website creation
Data Analysis & Visualization Data cleaning, statistical analysis, trend detection, chart creation, automated report generation
Software Development Code synthesis, refactoring, debugging, test writing, multi-language step-by-step tutorials
Workflow Automation Script generation, browser automation, file management, process optimization
Problem Solving Problem decomposition, alternative path exploration, step-by-step guidance, troubleshooting

System Architecture

The II-Agent system employs a sophisticated approach to construct a versatile AI agent, with core methodologies including:

1. Core Agent Architecture and LLM Interaction

  • Dynamically customized system prompts
  • Comprehensive interaction history management
  • Intelligent context management to handle token limits
  • Systematized LLM calls and function selection
  • Iterative optimization through execution cycles

2. Planning and Reflection

  • Structured reasoning for complex problem-solving
  • Problem decomposition and sequential thinking
  • Transparent decision-making processes
  • Hypothesis formation and testing

3. Execution Capabilities

  • File system operations with intelligent code editing
  • Command-line execution in a secure environment
  • Advanced web interaction and browser automation
  • Task completion and reporting
  • Specialized functions for various modalities (experimental): PDF, audio, image, video, slides
  • Deep research integration

4. Context Management

  • Token usage estimation and optimization
  • Strategic truncation for long interactions
  • File-based archiving for large outputs

5. Real-time Communication

  • Interactive interface based on WebSocket
  • Isolated agent instances per client
  • Streaming operation events for a responsive user experience

Performance Evaluation

II-Agent has been evaluated on the GAIA benchmark, which assesses LLM-based agents operating in real-world scenarios, covering multiple dimensions including multi-modal processing, tool utilization, and web search.

Several issues were identified with the GAIA benchmark during the evaluation process:

  • Annotation Errors: Several incorrect annotations in the dataset
  • Outdated Information: Some questions referenced websites or content that were no longer accessible
  • Linguistic Ambiguity: Unclear wording leading to different interpretations of the questions

Despite these challenges, II-Agent performed well in the benchmark, particularly in areas requiring complex reasoning, tool use, and multi-step planning.

Installation and Configuration

System Requirements

  • Python 3.10+
  • Node.js 18+ (for the frontend)
  • Google Cloud project with Vertex AI API enabled or Anthropic API key

Environment Configuration

Create a .env file in the root directory:

# Image and video generation tools
OPENAI_API_KEY=your_openai_key
OPENAI_AZURE_ENDPOINT=your_azure_endpoint

# Search providers
TAVILY_API_KEY=your_tavily_key
#JINA_API_KEY=your_jina_key
#FIRECRAWL_API_KEY=your_firecrawl_key

# For image search and better search results, use SerpAPI
#SERPAPI_API_KEY=your_serpapi_key

STATIC_FILE_BASE_URL=http://localhost:8000/

# If using Anthropic client
ANTHROPIC_API_KEY=

# If using Google Vertex (recommended, extra throughput if you have permissions)
#GOOGLE_APPLICATION_CREDENTIALS=

Frontend environment configuration, create a .env file in the frontend directory:

NEXT_PUBLIC_API_URL=http://localhost:8000

Installation Steps

  1. Clone the repository

  2. Set up the Python environment:

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -e .
  1. Set up the frontend (optional):
cd frontend
npm install

Usage

CLI Usage

Using the Anthropic client:

python cli.py

Using Vertex:

python cli.py --project-id YOUR_PROJECT_ID --region YOUR_REGION

CLI Options:

  • --project-id: Google Cloud project ID
  • --region: Google Cloud region (e.g., us-east5)
  • --workspace: Workspace directory path (default: ./workspace)
  • --needs-permission: Requires permission before executing commands
  • --minimize-stdout-logs: Reduce the amount of logs printed to stdout

Web Interface Usage

  1. Start the WebSocket server:

Using the Anthropic client:

export STATIC_FILE_BASE_URL=http://localhost:8000
python ws_server.py --port 8000

Using Vertex:

export STATIC_FILE_BASE_URL=http://localhost:8000
python ws_server.py --port 8000 --project-id YOUR_PROJECT_ID --region YOUR_REGION
  1. Start the frontend (in a separate terminal):
cd frontend
npm run dev
  1. Open your browser and visit http://localhost:3000

Project Structure

  • cli.py: Command-line interface
  • ws_server.py: Frontend WebSocket server
  • src/ii_agent/: Core agent implementation
    • agents/: Agent implementations
    • llm/: LLM client interfaces
    • tools/: Tool implementations
    • utils/: Utility functions

Technical Features

The II-Agent framework is architected around the reasoning capabilities of large language models such as Claude 3.7 Sonnet, presenting a comprehensive and robust approach to building versatile AI agents. Through the synergistic combination of a powerful LLM, a rich set of execution capabilities, explicit planning and reflection mechanisms, and intelligent context management strategies, II-Agent is capable of handling a wide range of complex, multi-step tasks.

Summary

II-Agent represents a significant advancement in intelligent agent technology, with its open-source nature and extensible design providing a solid foundation for continued research and development in the rapidly evolving field of agent AI. Through its multi-domain application capabilities and robust technical architecture, II-Agent provides users with a comprehensive and easy-to-use intelligent assistant platform.

Star History Chart