
An autonomous research agent powered by large language models, capable of conducting in-depth local and web research on any topic and generating detailed reports with citations.

Apache-2.0 | Python | 22.2k | assafelovic/gpt-researcher | Last Updated: 2025-06-26

GPT Researcher Project Detailed Introduction

Project Overview

GPT Researcher is an open-source deep research agent designed for conducting web and local research on any given task. The project aims to generate detailed, objective, and unbiased research reports with complete citations. It offers a full suite of customization options for creating tailored and domain-specific research agents.

Core Features

Key Functionalities

  • 📝 Generates detailed research reports using web and local documents
  • 🖼️ Intelligent image grabbing and filtering capabilities
  • 📜 Generates detailed reports exceeding 2000 words
  • 🌐 Aggregates over 20 information sources to derive objective conclusions
  • 🖥️ Provides lightweight (HTML/CSS/JS) and production-ready (NextJS + Tailwind) frontend versions
  • 🔍 Supports JavaScript-enabled web scraping
  • 📂 Maintains memory and context throughout the research process
  • 📄 Supports exporting reports in formats like PDF, Word, etc.

Deep Research Feature

GPT Researcher now includes Deep Research, an advanced recursive research workflow that explores topics with agentic depth and breadth. This feature uses a tree-like exploration pattern: it branches into subtopics while keeping a comprehensive view of the overall research subject.

Deep Research Features:

  • 🌳 Configurable depth and breadth of tree-like exploration
  • ⚡️ Concurrent processing for faster results
  • 🤝 Intelligent context management across research branches
  • ⏱️ Approximately 5 minutes per deep research run
  • 💰 Approximately $0.40 per research run (using o3-mini with "high" reasoning effort)
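
The tree-like exploration described above can be sketched as a small recursive function. This is only an illustration under stated assumptions, not GPT Researcher's implementation: `explore_subtopics` and `deep_research` are hypothetical names, and the subtopic generator is a stub standing in for an LLM call.

```python
import asyncio


async def explore_subtopics(topic: str, breadth: int) -> list[str]:
    """Hypothetical stand-in for an LLM call that proposes follow-up subtopics."""
    await asyncio.sleep(0)  # placeholder for network latency
    return [f"{topic} / sub{i + 1}" for i in range(breadth)]


async def deep_research(topic: str, depth: int, breadth: int) -> list[str]:
    """Visit `topic`, then recurse into `breadth` subtopics, `depth` levels down."""
    visited = [topic]
    if depth == 0:
        return visited
    subtopics = await explore_subtopics(topic, breadth)
    # Sibling branches run concurrently; gather returns results in input order.
    branches = await asyncio.gather(
        *(deep_research(sub, depth - 1, breadth) for sub in subtopics)
    )
    for branch in branches:
        visited.extend(branch)
    return visited


topics = asyncio.run(deep_research("solar power", depth=2, breadth=2))
print(len(topics))  # 7 topics: 1 root + 2 subtopics + 4 sub-subtopics
```

The number of visited topics grows geometrically with depth, which is why depth and breadth are worth keeping configurable.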

Technical Architecture

Core Idea

The project is built around "Planner" and "Executor" agents. The Planner generates research questions, and the Executor agents gather relevant information for each question. A Publisher then aggregates all findings into a comprehensive report.

Workflow

  1. Create a task-specific agent based on the research query
  2. Generate a set of questions that can form an objective view of the task
  3. Use crawler agents to collect information for each question
  4. Summarize each resource and track sources
  5. Filter and aggregate summaries into the final research report
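
Steps 2 to 5 of the workflow above can be sketched as a plain pipeline. This is a simplified illustration, not the project's code: every function and the `Summary` class are hypothetical, the planner and executor are stubs that would be LLM and crawler calls in practice, and the filtering part of step 5 is left out.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    question: str
    source: str
    text: str


def plan_questions(query: str) -> list[str]:
    """Planner: hypothetical stand-in for the LLM call that generates questions."""
    return [f"What is {query}?", f"What are the criticisms of {query}?"]


def gather(question: str) -> list[Summary]:
    """Executor: hypothetical stand-in for crawling and summarizing sources."""
    sources = ["https://example.com/a", "https://example.org/b"]
    return [Summary(question, src, f"Summary of {src} for: {question}") for src in sources]


def publish(query: str, summaries: list[Summary]) -> str:
    """Publisher: aggregate summaries into one report and list the sources used."""
    body = "\n".join(f"- {s.text}" for s in summaries)
    cited = sorted({s.source for s in summaries})
    refs = "\n".join(f"[{i}] {src}" for i, src in enumerate(cited, start=1))
    return f"Report: {query}\n{body}\n\nSources:\n{refs}"


def run_pipeline(query: str) -> str:
    # Each summary keeps its source, so the report can cite it later.
    summaries = [s for q in plan_questions(query) for s in gather(q)]
    return publish(query, summaries)


report = run_pipeline("nuclear fusion")
print(report)
```

Carrying the source alongside each summary is what makes step 4 ("track sources") possible without a second lookup at publishing time.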

Problems Solved

GPT Researcher aims to address the following research challenges:

  • Time Cost: Manual research to derive objective conclusions can take weeks, requiring significant resources
  • Information Staleness: LLMs trained on outdated information may hallucinate or produce answers that are irrelevant to current research tasks
  • Token Limits: Current LLM token limits are insufficient for generating long-form research reports
  • Limited Information Sources: Limited web sources in existing services lead to misinformation and shallow results
  • Bias Issues: Selective web sources may introduce bias in research tasks

Installation and Usage

Quick Start

Environment Requirements:

  • Install Python 3.11 or higher

Steps:

  1. Clone the project and navigate to the directory:
git clone https://github.com/assafelovic/gpt-researcher.git
cd gpt-researcher
  2. Set up API keys:
export OPENAI_API_KEY={Your OpenAI API Key here}
export TAVILY_API_KEY={Your Tavily API Key here}
  3. Install dependencies and start the server:
pip install -r requirements.txt
python -m uvicorn main:app --reload
  4. Open http://localhost:8000 in a browser to start using the application

PIP Package Installation

pip install gpt-researcher

Code Example:

import asyncio

from gpt_researcher import GPTResearcher


async def main() -> str:
    query = "why is Nvidia stock going up?"
    researcher = GPTResearcher(query=query, report_type="research_report")

    # Conduct research on the given query
    await researcher.conduct_research()

    # Write the report from the gathered research context
    return await researcher.write_report()


if __name__ == "__main__":
    print(asyncio.run(main()))

Docker Deployment

  1. Install Docker
  2. Copy the '.env.example' file, add your API keys to the copy, and save it as '.env'
  3. Comment out services you don't want to run in the docker-compose file
  4. Run:
docker-compose up --build

By default, two processes will start:

  • Python server running on localhost:8000
  • React application running on localhost:3000

Local Document Research

GPT Researcher supports research tasks based on local documents. Currently supported file formats include: PDF, plain text, CSV, Excel, Markdown, PowerPoint, and Word documents.

Setup Steps:

  1. Add the environment variable DOC_PATH pointing to the folder containing the documents:
export DOC_PATH="./my-docs"
  2. Select "My Documents" from the "Report Source" dropdown in the frontend application, or set the report_source parameter to "local" when using the PIP package.

Multi-Agent System

As AI evolves from prompt engineering and RAG to multi-agent systems, GPT Researcher introduces a new multi-agent assistant built on LangGraph.

By using LangGraph, the research process can be significantly improved in depth and quality by leveraging multiple agents with specialized skills. Inspired by the recent STORM paper, the project demonstrates how a team of AI agents can collaborate on research for a given topic, from planning to publishing.

An average run generates a 5-6 page research report in multiple formats such as PDF, Docx, and Markdown.

Frontend Interface

GPT Researcher now features an enhanced frontend interface to improve user experience and streamline the research process. The frontend provides:

  • An intuitive interface for entering research queries
  • Real-time progress tracking of research tasks
  • Interactive display of research findings
  • Customizable settings for a tailored research experience

Two deployment options are available:

  • A lightweight static frontend served by FastAPI
  • A feature-rich NextJS application with advanced functionalities

Technical Features

Bias Control

  • Reduce erroneous and biased facts by scraping multiple websites
  • Reduce the probability that the final information is wrong by preferring the claims that appear most frequently across sources
  • Aim to minimize bias as much as possible rather than to eliminate it
  • Scrape multiple perspectives and present diverse viewpoints evenly
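
The "most frequent information" idea can be illustrated with a simple majority vote over claims extracted from several sources. This is a deliberately naive sketch, not the project's aggregation logic: `most_frequent_claim` is a hypothetical helper, and matching claims by normalized string equality is an assumption made for brevity.

```python
from collections import Counter


def most_frequent_claim(claims_by_source: dict[str, str]) -> tuple[str, float]:
    """Return the claim reported by the most sources and the share of sources backing it."""
    if not claims_by_source:
        raise ValueError("need at least one source")
    # Normalize case and whitespace so trivially different spellings count together.
    counts = Counter(claim.strip().lower() for claim in claims_by_source.values())
    claim, votes = counts.most_common(1)[0]
    return claim, votes / len(claims_by_source)


claims = {
    "site-a.example": "Founded in 1998",
    "site-b.example": "founded in 1998",
    "site-c.example": "Founded in 2001",
}
claim, share = most_frequent_claim(claims)
print(claim, round(share, 2))  # founded in 1998 0.67
```

Returning the supporting share alongside the claim keeps disagreement visible, which matches the stated goal of minimizing bias rather than hiding minority viewpoints.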

Performance Optimization

  • Provide stable performance and increase speed by parallelizing agent work
  • Asynchronous processing improves efficiency compared to synchronous operations
  • Intelligent context management ensures research coherence
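
The benefit of parallelizing agent work can be seen in a small timing comparison. This is a sketch under stated assumptions: `scrape` is a hypothetical stand-in whose `asyncio.sleep` simulates network I/O, and none of these names belong to GPT Researcher.

```python
import asyncio
import time


async def scrape(url: str, delay: float = 0.05) -> str:
    """Hypothetical scraper; asyncio.sleep stands in for network I/O."""
    await asyncio.sleep(delay)
    return f"content of {url}"


async def run_sequential(urls: list[str]) -> list[str]:
    # Each request waits for the previous one to finish.
    return [await scrape(u) for u in urls]


async def run_parallel(urls: list[str]) -> list[str]:
    # All requests are in flight at once; results keep the input order.
    return await asyncio.gather(*(scrape(u) for u in urls))


def timed(coro) -> tuple[list[str], float]:
    """Run a coroutine to completion and return its result with elapsed seconds."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


urls = [f"https://example.com/page{i}" for i in range(5)]
seq_result, seq_time = timed(run_sequential(urls))
par_result, par_time = timed(run_parallel(urls))
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

With I/O-bound work like scraping, the sequential time grows with the number of URLs, while the parallel time stays close to that of the slowest single request.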

Disclaimer

GPT Researcher is an experimental application provided "as is" without any express or implied warranties. The code is shared under the Apache 2 license for academic purposes. The content here is not academic advice and is not recommended for use in academic or research papers.
