A community-driven deep research framework based on large language models, integrating professional tools such as web search, crawlers, and Python execution.
DeerFlow - Deep Research Automation Framework
Project Overview
DeerFlow (Deep Exploration and Efficient Research Flow) is a community-driven deep research framework open-sourced by ByteDance. Built upon the outstanding work of the open-source community, this project aims to combine large language models with specialized tools, including web search, web crawling, and Python code execution, while giving back to the open-source community.
Project Address: https://github.com/bytedance/deer-flow
Core Features
🤖 LLM Integration
- Supports integration of most models via litellm
- Supports open-source models such as Qwen
- OpenAI-compatible API interface
- Multi-layered LLM system for varying task complexities
🔍 Search & Retrieval
- Web search via Tavily, Brave Search, etc.
- Web crawling using Jina
- Advanced content extraction capabilities
🔗 Seamless MCP Integration
- Extends capabilities such as private domain access, knowledge graphs, and web browsing
- Facilitates the integration of diverse research tools and methodologies
🧠 Human-in-the-Loop
- Supports interactive modification of research plans using natural language
- Supports automatic acceptance of research plans
📝 Report Post-Editing
- Supports Notion-like block editing
- Allows AI refinement, including AI-assisted polishing, sentence shortening, and expansion
- Built on tiptap
🎙️ Podcast & Presentation Generation
- AI-driven podcast script generation and audio synthesis
- Automatically creates simple PowerPoint presentations
- Customizable templates for personalized content
Technical Architecture
DeerFlow implements a modular multi-agent system architecture designed for automated research and code analysis. The system is built on LangGraph, which enables flexible state-based workflows in which components communicate through a well-defined message-passing system.
Workflow Components
The system employs a simplified workflow comprising the following components:
Coordinator
- Entry point for managing the workflow lifecycle
- Initiates the research process based on user input
- Delegates tasks to the Planner when appropriate
- Serves as the primary interface between the user and the system
Planner
- Strategic component for task decomposition and planning
- Analyzes research objectives and creates a structured execution plan
- Determines if sufficient context is available or if more research is needed
- Manages the research flow and decides when to generate the final report
Research Team
A collection of specialized agents executing the plan:
- Researcher: Conducts web searches and gathers information using tools such as web search engines, web crawlers, and MCP services
- Coder: Handles code analysis, execution, and technical tasks using Python REPL tools
Each agent has access to specific tools optimized for its role and operates within the LangGraph framework.
Reporter
- Final-stage processor of research output
- Summarizes findings from the Research Team
- Processes and structures collected information
- Generates a comprehensive research report
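The hand-offs between the four components above can be illustrated with a small state machine. This is a minimal sketch in plain Python, not DeerFlow's LangGraph code: the `ResearchState` fields, the node functions, the hard-coded two-step plan, and the `run` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ResearchState:
    """Shared state passed from node to node (field names are hypothetical)."""
    query: str
    plan: list[str] = field(default_factory=list)
    findings: list[str] = field(default_factory=list)
    report: str = ""
    trace: list[str] = field(default_factory=list)


def coordinator(state: ResearchState) -> Optional[str]:
    # Entry point: accepts the user query and delegates to the planner.
    state.trace.append("coordinator")
    return "planner"


def planner(state: ResearchState) -> Optional[str]:
    # Creates a plan on first visit, then checks whether every step has a finding.
    state.trace.append("planner")
    if not state.plan:
        # Illustrative fixed plan; a real planner would ask an LLM.
        state.plan = [f"research: {state.query}", f"code: analyze data for {state.query}"]
    if len(state.findings) >= len(state.plan):
        return "reporter"
    return "research_team"


def research_team(state: ResearchState) -> Optional[str]:
    # Routes each outstanding step to a stand-in "researcher" or "coder" agent.
    state.trace.append("research_team")
    for step in state.plan[len(state.findings):]:
        agent = "coder" if step.startswith("code:") else "researcher"
        state.findings.append(f"[{agent}] done: {step}")
    return "planner"


def reporter(state: ResearchState) -> Optional[str]:
    # Final stage: turns the findings into a report and ends the run.
    state.trace.append("reporter")
    state.report = "\n".join(state.findings)
    return None


NODES: dict[str, Callable[[ResearchState], Optional[str]]] = {
    "coordinator": coordinator,
    "planner": planner,
    "research_team": research_team,
    "reporter": reporter,
}


def run(query: str, max_steps: int = 20) -> ResearchState:
    """Follow node-to-node transitions until a node returns None."""
    state = ResearchState(query=query)
    node: Optional[str] = "coordinator"
    for _ in range(max_steps):
        if node is None:
            break
        node = NODES[node](state)
    return state
```

Calling `run("quantum computing")` visits coordinator, planner, research_team, planner, and reporter in that order. The second visit to the planner is the step where it decides that enough context has been gathered.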
Installation and Quick Start
System Requirements
- Python 3.12+
- Node.js (for Web UI)
- Recommended tools: uv, nvm, and pnpm
Installation Steps
# Clone the repository
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow
# Install dependencies; uv provisions the Python interpreter, creates the virtual environment, and installs the required packages
uv sync
# Configure environment variables
cp .env.example .env
# Configure your API keys:
# - Tavily: https://app.tavily.com/home
# - Brave Search: https://brave.com/search/api/
# - volcengine TTS: Add your TTS credentials if you have them
# Configure LLM models and API keys
cp conf.yaml.example conf.yaml
# Install marp for PPT generation
# (this command assumes Homebrew is available)
brew install marp-cli
Running the Project
Console UI (Fastest Way)
uv run main.py
Web UI
# First install Web UI dependencies (the path below is relative to the directory that contains the clone)
cd deer-flow/web
pnpm install
# Run backend and frontend servers (development mode)
# macOS/Linux
./bootstrap.sh -d
# Windows
bootstrap.bat -d
Then visit http://localhost:3000 to experience the Web UI.
Supported Search Engines
DeerFlow supports multiple search engines, configurable using the SEARCH_API variable in the .env file:
- Tavily (default): Search API designed for AI applications
- DuckDuckGo: Privacy-focused search engine, no API key required
- Brave Search: Privacy-focused search engine with advanced features
- Arxiv: Dedicated to scientific paper search for academic research
Configuration example:
# Choose one: tavily, duckduckgo, brave_search, arxiv
SEARCH_API=tavily
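The sketch below shows one way to read and validate this setting. It is an illustration only: `resolve_search_api` and the two constants are hypothetical names, and the whitespace trimming and lowercasing are assumptions, not documented DeerFlow behavior.

```python
import os
from typing import Mapping, Optional

# The four values listed above; Tavily is described as the default.
SUPPORTED_SEARCH_APIS = ("tavily", "duckduckgo", "brave_search", "arxiv")
DEFAULT_SEARCH_API = "tavily"


def resolve_search_api(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured search engine name, falling back to the default."""
    env = os.environ if env is None else env
    # Assumption: surrounding whitespace and letter case are ignored.
    value = env.get("SEARCH_API", "").strip().lower() or DEFAULT_SEARCH_API
    if value not in SUPPORTED_SEARCH_APIS:
        raise ValueError(
            f"Unsupported SEARCH_API {value!r}; expected one of {SUPPORTED_SEARCH_APIS}"
        )
    return value
```

Failing fast on an unknown value surfaces a typo in `.env` at startup instead of midway through a research run.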
Text-to-Speech Integration
DeerFlow includes text-to-speech (TTS) functionality, allowing you to convert research reports into speech. This feature uses the volcengine TTS API to generate high-quality audio, supporting customizable speed, volume, and pitch.
API Call Example
curl --location 'http://localhost:8000/api/tts' \
--header 'Content-Type: application/json' \
--data '{
"text": "This is a test of the text-to-speech functionality.",
"speed_ratio": 1.0,
"volume_ratio": 1.0,
"pitch_ratio": 1.0
}' \
--output speech.mp3
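The same request can be issued from Python using only the standard library. This sketch mirrors the curl call above (same URL, same JSON fields). The helper names `build_tts_request` and `synthesize_to_file` are hypothetical, and the sketch assumes the endpoint returns raw audio bytes, as the `--output speech.mp3` flag suggests.

```python
import json
import urllib.request

TTS_URL = "http://localhost:8000/api/tts"  # taken from the curl example


def build_tts_request(
    text: str,
    speed_ratio: float = 1.0,
    volume_ratio: float = 1.0,
    pitch_ratio: float = 1.0,
    url: str = TTS_URL,
) -> urllib.request.Request:
    """Build the POST request without sending it."""
    payload = {
        "text": text,
        "speed_ratio": speed_ratio,
        "volume_ratio": volume_ratio,
        "pitch_ratio": pitch_ratio,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, path: str = "speech.mp3", **ratios: float) -> str:
    """Send the request and write the response body to `path`.

    Assumes the server answers with raw audio bytes.
    """
    request = build_tts_request(text, **ratios)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
    return path
```

With the backend running, `synthesize_to_file("This is a test of the text-to-speech functionality.")` writes `speech.mp3` to the current directory.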
Development and Debugging
Running Tests
# Run all tests
make test
# Run a specific test file
pytest tests/integration/test_workflow.py
# Run coverage tests
make coverage
# Run code checks
make lint
# Format code
make format
LangGraph Studio Debugging
DeerFlow uses LangGraph as its workflow architecture. You can use LangGraph Studio to debug and visualize workflows in real time.
# Install uv package manager (if you don't have it)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies and start the LangGraph server
uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.12 langgraph dev --allow-blocking
After starting the server, you can access:
- API: http://127.0.0.1:2024
- Studio UI: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
- API Documentation: http://127.0.0.1:2024/docs
Usage Examples
Command-Line Arguments
# Run a specific query
uv run main.py "What are the factors influencing the adoption of AI in healthcare?"
# Run with custom planning parameters
uv run main.py --max_plan_iterations 3 "How does quantum computing affect cryptography?"
# Run in interactive mode
uv run main.py --interactive
# View all available options
uv run main.py --help
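For readers who want to see how such a command line could be parsed, here is a minimal `argparse` sketch that accepts the invocations shown above. It is not the project's actual `main.py`: the `build_parser` helper, the help strings, and the `None` default are assumptions. Only the `--max_plan_iterations` and `--interactive` flag names and the positional query come from the examples.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser accepting the flags used in the examples above (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Run a research query (sketch of the CLI surface).",
    )
    # The query is optional so that --interactive can be used on its own.
    parser.add_argument("query", nargs="*", help="Research question to process")
    parser.add_argument(
        "--max_plan_iterations",
        type=int,
        default=None,  # the real default is not stated in this document
        help="Maximum number of planning iterations",
    )
    parser.add_argument(
        "--interactive",
        action="store_true",
        help="Prompt for the query interactively",
    )
    return parser
```

`build_parser().parse_args(["--max_plan_iterations", "3", "How does quantum computing affect cryptography?"])` yields `max_plan_iterations=3` and a one-element `query` list.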
Human-in-the-Loop
DeerFlow includes a human-in-the-loop mechanism, allowing you to review, edit, and approve research plans before execution:
- Plan Review: When human-in-the-loop is enabled, the system will present the generated research plan for your review before execution.
- Provide Feedback: You can:
  - Accept the plan by replying with [ACCEPTED]
  - Edit the plan by providing feedback (e.g., [EDIT PLAN] Add more steps about technical implementation)
- Automatic Acceptance: You can enable automatic acceptance to skip the review process.
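The reply convention above can be handled with a small parser. This is a sketch under stated assumptions: `PlanFeedback` and `parse_plan_feedback` are hypothetical names, the tags are matched case-sensitively at the start of the reply, and any other reply is rejected. DeerFlow's own handling may differ.

```python
from dataclasses import dataclass

ACCEPT_TAG = "[ACCEPTED]"
EDIT_TAG = "[EDIT PLAN]"


@dataclass(frozen=True)
class PlanFeedback:
    """Result of reviewing a plan (hypothetical structure)."""
    accepted: bool
    instructions: str = ""


def parse_plan_feedback(reply: str, auto_accept: bool = False) -> PlanFeedback:
    """Interpret a reviewer's reply to a proposed research plan."""
    if auto_accept:
        # Automatic acceptance skips the review entirely.
        return PlanFeedback(accepted=True)
    text = reply.strip()
    if text.startswith(ACCEPT_TAG):
        return PlanFeedback(accepted=True)
    if text.startswith(EDIT_TAG):
        # Everything after the tag is treated as the requested change.
        return PlanFeedback(accepted=False, instructions=text[len(EDIT_TAG):].strip())
    raise ValueError(f"Reply must start with {ACCEPT_TAG} or {EDIT_TAG}")
```

For example, `parse_plan_feedback("[EDIT PLAN] Add more steps about technical implementation")` returns a `PlanFeedback` with `accepted=False` and the requested change in `instructions`.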
Example Reports
The project includes several example reports showcasing DeerFlow's capabilities:
- OpenAI Sora Analysis Report
- Google's Agent-to-Agent Protocol Report
- Comprehensive Analysis of MCP (Model Context Protocol)
- Bitcoin Price Fluctuation Analysis
- Deep Exploration of Large Language Models
- Best Practices for Deep Research with Claude
- Factors Influencing AI Adoption in Healthcare
- Impact of Quantum Computing on Cryptography
- Cristiano Ronaldo Performance Highlights
Acknowledgements
DeerFlow is built upon the outstanding work of the open-source community, with special thanks to:
- LangChain: For providing an excellent framework for LLM interaction and chaining
- LangGraph: For its innovative approach to multi-agent orchestration
These projects exemplify the transformative power of open-source collaboration, and we are honored to build DeerFlow on their foundation.