HKUDS/LightRAG

LightRAG is a simple and fast retrieval-augmented generation framework that supports multiple query modes and knowledge graph construction.

MIT · Python · LightRAG · HKUDS · 26.8k · Last Updated: December 28, 2025

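LightRAG - A Simple and Fast Retrieval-Augmented Generation Framework

Project Overview

LightRAG is a "simple and fast retrieval-augmented generation" framework developed by HKUDS, a data science lab at the University of Hong Kong. The project aims to give developers a complete RAG (Retrieval-Augmented Generation) solution covering document indexing, knowledge graph construction, and question answering.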

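Core Features

🔍 Multiple Retrieval Modes

LightRAG supports five retrieval modes to suit different scenarios:

naive mode: a basic search that uses no advanced techniques
local mode: focuses on retrieving context-dependent information
global mode: draws on global knowledge for retrieval
hybrid mode: combines the local and global retrieval methods
mix mode: integrates knowledge graph and vector retrieval to give the most comprehensive answers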
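
To see how the modes differ on your own data, it can help to run one question through each of them. The sketch below is a minimal helper for that. `compare_modes`, `query_fn`, and `fake_query` are names invented here for illustration and are not part of LightRAG; the helper only assumes a callable that takes a question and a mode name.

```python
from typing import Callable, Dict, Iterable

# The five mode names listed above.
MODES = ("naive", "local", "global", "hybrid", "mix")


def compare_modes(
    query_fn: Callable[[str, str], str],
    question: str,
    modes: Iterable[str] = MODES,
) -> Dict[str, str]:
    """Run the same question once per mode and collect the answers by mode name."""
    answers = {}
    for mode in modes:
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        answers[mode] = query_fn(question, mode)
    return answers


# Stand-in query function (hypothetical) so the sketch runs without a model or index.
def fake_query(question: str, mode: str) -> str:
    return f"[{mode}] answer to: {question}"


for mode, answer in compare_modes(fake_query, "Who is Scrooge?").items():
    print(mode, "->", answer)
```

With a real instance, `query_fn` could be something like `lambda q, m: rag.query(q, param=QueryParam(mode=m))`, reusing the `rag` object and `QueryParam` from the basic usage example.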

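🎯 Knowledge Graph Construction

Automatically extracts entities and relations from documents
Supports visualization of the knowledge graph
Provides create, read, update, and delete operations for entities and relations
Supports entity merging and deduplication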
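
What merging and deduplication involve can be sketched on a toy in-memory graph. The code below is a conceptual illustration only, using plain dictionaries; it is not LightRAG's implementation, and the function name, data layout, and sample data are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Entities = Dict[str, dict]
Relations = Dict[Tuple[str, str], dict]


def merge_entities(
    entities: Entities, relations: Relations, sources: List[str], target: str
) -> None:
    """Fold every entity in `sources` into `target`, modifying both dicts in place."""
    merged = entities.setdefault(target, {})
    first = merged.get("description", "")
    descriptions = [first] if first else []
    renamed = set(sources) - {target}

    # Remove the duplicates and keep each distinct description once.
    for name in sources:
        if name == target:
            continue
        old = entities.pop(name, None)
        if old is None:
            continue
        desc = old.get("description", "")
        if desc and desc not in descriptions:
            descriptions.append(desc)
    merged["description"] = " ".join(descriptions)

    # Re-point relations that touched a merged entity at the target.
    for (src, tgt), data in list(relations.items()):
        new_key = (target if src in renamed else src, target if tgt in renamed else tgt)
        if new_key == (src, tgt):
            continue
        del relations[(src, tgt)]
        if new_key[0] == new_key[1]:
            continue  # drop self-loops created by the merge
        relations.setdefault(new_key, data)


entities = {
    "Google": {"description": "A multinational technology company.", "entity_type": "company"},
    "Google Inc.": {"description": "The company behind Gmail.", "entity_type": "company"},
    "Gmail": {"description": "An email service.", "entity_type": "product"},
}
relations = {("Google Inc.", "Gmail"): {"description": "Develops and operates Gmail."}}

merge_entities(entities, relations, sources=["Google Inc."], target="Google")
print(sorted(entities))
print(list(relations))
```

A production system would also have to reconcile entity types, relation weights, and vector embeddings; the sketch leaves those out to keep the core idea visible.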

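🚀 Flexible Model Support

OpenAI models: supports GPT-4 and other OpenAI models
Hugging Face models: supports locally deployed open-source models
Ollama models: supports locally run quantized models
LlamaIndex integration: supports additional model providers through LlamaIndex

📊 Multiple Storage Backends

Vector databases: Faiss, PGVector, and others
Graph databases: Neo4j, PostgreSQL + Apache AGE
Default storage: built-in NetworkX graph storage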

Installation

Install from PyPI

pip install "lightrag-hku[api]"

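Install from source

# Run these from the root of a cloned LightRAG repository.
# Create a Python virtual environment first, if needed.
# Install in editable mode with API support.
pip install -e ".[api]"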
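
After either installation route, a quick check from Python can confirm that the package is visible to the interpreter you intend to use. This is a minimal sketch based on the standard library's `importlib.metadata`; the helper name `installed_version` is invented here, and the distribution name `lightrag-hku` is taken from the PyPI command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("lightrag-hku")
print("lightrag-hku:", version if version else "not installed")
```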

Basic Usage Example

Initialization and Query

import os
import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.utils import setup_logger

setup_logger("lightrag", level="INFO")

async def initialize_rag():
    rag = LightRAG(
        working_dir="your/path",
        embedding_func=openai_embed,
        llm_model_func=gpt_4o_mini_complete
    )
    await rag.initialize_storages()
    await initialize_pipeline_status()
    return rag

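def main():
    # Initialize the RAG instance
    rag = asyncio.run(initialize_rag())

    # Insert text into the index
    rag.insert("Your text")

    # Query in "mix" mode (knowledge graph plus vector retrieval)
    result = rag.query(
        "What are the top themes in this story?",
        param=QueryParam(mode="mix")
    )
    print(result)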

if __name__ == "__main__":
    main()
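
The example above calls the synchronous `insert` and `query` methods. In an application that is already asynchronous, the same flow can stay inside one coroutine. The sketch below assumes the coroutine counterparts `ainsert` and `aquery`, which are not shown in the example above, so check them against your installed version. `index_and_ask` and `FakeRAG` are hypothetical names used only so the sketch runs offline.

```python
import asyncio
from typing import Any, Callable, List


async def index_and_ask(
    rag: Any, text: str, question: str, make_param: Callable[[], Any]
) -> Any:
    """Insert one document, then query it, awaiting both steps."""
    await rag.ainsert(text)
    return await rag.aquery(question, param=make_param())


class FakeRAG:
    """Stand-in with the same two coroutine methods (assumed names)."""

    def __init__(self) -> None:
        self.docs: List[str] = []

    async def ainsert(self, text: str) -> None:
        self.docs.append(text)

    async def aquery(self, question: str, param: Any = None) -> str:
        return f"{len(self.docs)} document(s) indexed; mode={param['mode']}"


answer = asyncio.run(
    index_and_ask(FakeRAG(), "Your text", "What are the top themes?", lambda: {"mode": "mix"})
)
print(answer)
```

With a real instance, `make_param` would presumably be `lambda: QueryParam(mode="mix")` and `rag` the object returned by `initialize_rag()`.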

Advanced Features

Conversation History Support

# Create conversation history
conversation_history = [
    {"role": "user", "content": "What is the main character's attitude towards Christmas?"},
    {"role": "assistant", "content": "At the beginning of the story, Ebenezer Scrooge has a very negative attitude towards Christmas..."},
    {"role": "user", "content": "How does his attitude change?"}
]

# Create query parameters with conversation history
query_param = QueryParam(
    mode="mix",  # or any other mode: "local", "global", "hybrid"
    conversation_history=conversation_history,  # Add the conversation history
    history_turns=3  # Number of recent conversation turns to consider
)

# Make a query that takes into account the conversation history
response = rag.query(
    "What causes this change in his character?",
    param=query_param
)
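
In a chat loop the history has to be maintained between calls. The helpers below sketch one way to do that with plain lists. They assume that a "turn" means one user message followed by one assistant reply, which is an interpretation made for this example rather than a statement about how LightRAG counts `history_turns`. Both function names are invented here.

```python
from typing import Dict, List

Message = Dict[str, str]


def append_exchange(history: List[Message], question: str, answer: str) -> None:
    """Record one completed user/assistant exchange at the end of the history."""
    history.append({"role": "user", "content": question})
    history.append({"role": "assistant", "content": answer})


def last_turns(history: List[Message], turns: int) -> List[Message]:
    """Keep only the most recent `turns` exchanges (two messages per exchange)."""
    if turns <= 0:
        return []
    return history[-2 * turns:]


history: List[Message] = []
append_exchange(history, "Q1", "A1")
append_exchange(history, "Q2", "A2")
append_exchange(history, "Q3", "A3")
print(last_turns(history, 2))
```

After each `rag.query(...)` call, `append_exchange` could store the question and the response so that the next `QueryParam` receives an up-to-date `conversation_history`.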

Knowledge Graph Management

# Create new entity
entity = rag.create_entity("Google", {
    "description": "Google is a multinational technology company specializing in internet-related services and products.",
    "entity_type": "company"
})

# Create another entity
product = rag.create_entity("Gmail", {
    "description": "Gmail is an email service developed by Google.",
    "entity_type": "product"
})

# Create relation between entities
relation = rag.create_relation("Google", "Gmail", {
    "description": "Google develops and operates Gmail.",
    "keywords": "develops operates service",
    "weight": 2.0
})
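
When many entities and relations need to be created, the calls above can be driven from data. The following sketch uses only the `create_entity` and `create_relation` calls shown above. `load_graph` and `RecordingRAG` are hypothetical names, and the stub exists only so the example runs without a LightRAG instance.

```python
from typing import Any, Dict, List, Tuple


def load_graph(
    rag: Any, entities: Dict[str, dict], relations: Dict[Tuple[str, str], dict]
) -> int:
    """Create all entities, then all relations; return how many objects were created."""
    # Check endpoints up front so a bad relation does not leave a half-loaded graph.
    # Simplification: only entities in this batch count as known.
    for source, target in relations:
        if source not in entities or target not in entities:
            raise KeyError(f"relation {source!r} -> {target!r} references an unknown entity")
    for name, data in entities.items():
        rag.create_entity(name, data)
    for (source, target), data in relations.items():
        rag.create_relation(source, target, data)
    return len(entities) + len(relations)


class RecordingRAG:
    """Stub that records calls instead of touching a real graph store."""

    def __init__(self) -> None:
        self.calls: List[tuple] = []

    def create_entity(self, name: str, data: dict) -> None:
        self.calls.append(("entity", name))

    def create_relation(self, source: str, target: str, data: dict) -> None:
        self.calls.append(("relation", source, target))


stub = RecordingRAG()
created = load_graph(
    stub,
    entities={
        "Google": {"description": "A technology company.", "entity_type": "company"},
        "Gmail": {"description": "An email service.", "entity_type": "product"},
    },
    relations={("Google", "Gmail"): {"description": "Google operates Gmail.", "weight": 2.0}},
)
print(created, stub.calls)
```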

LightRAG Server

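Web UI Features

LightRAG Server provides a complete web interface, including:

Document index management
Knowledge graph visualization
A simple RAG query interface
Support for gravity layout, node queries, subgraph filtering, and more

API

Provides a RESTful API
Compatible with the Ollama API format
Supports integration with AI chatbots (such as Open WebUI)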
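
A query against the REST API might be built as follows. This is a sketch under assumptions not stated on this page: that the server listens on `http://localhost:9621`, that queries go to a `POST /query` endpoint, and that the JSON body carries `query` and `mode` fields. Check your server's API documentation before relying on any of them. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_query_request(
    base_url: str, question: str, mode: str = "mix"
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an assumed /query endpoint."""
    body = json.dumps({"query": question, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed host and port; adjust to match your deployment.
request = build_query_request("http://localhost:9621", "What are the top themes in this story?")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```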

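Configuration Parameters

Core Parameters

working_dir: path to the working directory
embedding_func: the embedding function
llm_model_func: the large language model function
vector_storage: vector storage type
graph_storage: graph storage type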
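
The core parameters can be collected in one place before constructing the instance. The sketch below builds a keyword-argument dictionary and leaves the storage backends at their defaults unless they are named. `build_core_kwargs` and the placeholder callables are hypothetical, and the backend names `"FaissVectorDBStorage"` and `"Neo4JStorage"` are assumptions to verify against your installed version.

```python
from typing import Any, Callable, Dict, Optional


def build_core_kwargs(
    working_dir: str,
    embedding_func: Callable[..., Any],
    llm_model_func: Callable[..., Any],
    vector_storage: Optional[str] = None,
    graph_storage: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect the core parameters; storage backends stay at their defaults unless named."""
    kwargs: Dict[str, Any] = {
        "working_dir": working_dir,
        "embedding_func": embedding_func,
        "llm_model_func": llm_model_func,
    }
    if vector_storage is not None:
        kwargs["vector_storage"] = vector_storage
    if graph_storage is not None:
        kwargs["graph_storage"] = graph_storage
    return kwargs


# Placeholder callables so the sketch runs without any model client.
def placeholder_embed(texts):
    raise NotImplementedError


def placeholder_llm(prompt):
    raise NotImplementedError


# The two storage names below are assumed, not confirmed by this page.
kwargs = build_core_kwargs(
    "your/path",
    placeholder_embed,
    placeholder_llm,
    vector_storage="FaissVectorDBStorage",
    graph_storage="Neo4JStorage",
)
print(sorted(kwargs))
```

With real functions in place of the placeholders, the result would be passed on as `LightRAG(**kwargs)`.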

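Performance Tuning Parameters

embedding_batch_size: embedding batch size (default 32)
embedding_func_max_async: maximum number of concurrent embedding calls (default 16)
llm_model_max_async: maximum number of concurrent LLM calls (default 4)
enable_llm_cache: whether to enable the LLM cache (default True)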
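
What a batch size and a concurrency cap control can be illustrated with the standard library alone. The sketch below splits texts into batches and uses an `asyncio.Semaphore` to bound how many batches are in flight at once. It is a generic illustration of the two ideas and not LightRAG's internal code; `make_batches`, `embed_all`, and `fake_embed` are invented names.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence

Vector = List[float]


def make_batches(items: Sequence, batch_size: int) -> List[Sequence]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


async def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], Awaitable[List[Vector]]],
    batch_size: int = 32,
    max_async: int = 16,
) -> List[Vector]:
    """Embed texts batch by batch, with at most max_async batches running at once."""
    semaphore = asyncio.Semaphore(max_async)

    async def run(batch: Sequence[str]) -> List[Vector]:
        async with semaphore:
            return await embed_batch(batch)

    # gather preserves input order, so vectors line up with the original texts.
    results = await asyncio.gather(*(run(b) for b in make_batches(texts, batch_size)))
    return [vector for batch in results for vector in batch]


# Hypothetical embedding function: one number per text, no model involved.
async def fake_embed(batch: Sequence[str]) -> List[Vector]:
    await asyncio.sleep(0)
    return [[float(len(text))] for text in batch]


vectors = asyncio.run(embed_all([f"text {i}" for i in range(70)], fake_embed))
print(len(vectors))
```

Larger batches mean fewer requests, while a higher concurrency cap raises throughput until the provider's rate limits become the bottleneck.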

Data Export and Backup

Data can be exported in several formats:

# Export data in CSV format
rag.export_data("graph_data.csv", file_format="csv")

# Export data in Excel sheet
rag.export_data("graph_data.xlsx", file_format="excel")

# Export data in markdown format
rag.export_data("graph_data.md", file_format="md")

# Export data in Text
rag.export_data("graph_data.txt", file_format="txt")
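
Because each call pairs a file extension with a `file_format` value, the format can be derived from the file name. The helper below is a small sketch of that idea using the four pairings shown above. `export_by_extension` and `RecordingRAG` are hypothetical names, and the stub stands in for a real instance so the example runs offline.

```python
from pathlib import Path
from typing import Any, List, Tuple

# File extension -> file_format value, as used in the calls above.
FORMATS = {".csv": "csv", ".xlsx": "excel", ".md": "md", ".txt": "txt"}


def export_by_extension(rag: Any, path: str) -> str:
    """Pick file_format from the file extension and call rag.export_data."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"unsupported export extension: {suffix!r}")
    file_format = FORMATS[suffix]
    rag.export_data(path, file_format=file_format)
    return file_format


class RecordingRAG:
    """Stub that records export calls instead of writing files."""

    def __init__(self) -> None:
        self.exports: List[Tuple[str, str]] = []

    def export_data(self, path: str, file_format: str) -> None:
        self.exports.append((path, file_format))


stub = RecordingRAG()
for name in ("graph_data.csv", "graph_data.xlsx"):
    print(name, "->", export_by_extension(stub, name))
```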

Token Usage Tracking

A built-in tool monitors token consumption:

from lightrag.utils import TokenTracker

# Create TokenTracker instance
token_tracker = TokenTracker()

# Method 1: Using context manager (Recommended)
# Suitable for scenarios requiring automatic token usage tracking
# Note: `await` is only valid inside an async function (or an async-aware REPL/notebook)
with token_tracker:
    result1 = await llm_model_func("your question 1")
    result2 = await llm_model_func("your question 2")

# Method 2: Manually adding token usage records
# Suitable for scenarios requiring more granular control over token statistics
token_tracker.reset()

rag.insert("Your text")

rag.query("your question 1", param=QueryParam(mode="naive"))
rag.query("your question 2", param=QueryParam(mode="mix"))

# Display total token usage (including insert and query operations)
print("Token usage:", token_tracker.get_usage())
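
The context-manager pattern used above can be illustrated with a toy accumulator. `SimpleTokenTracker` below is a hypothetical stand-in written for this example; it is not LightRAG's `TokenTracker`, and its method signatures and the shape of its usage dictionary are assumptions.

```python
from typing import Dict


class SimpleTokenTracker:
    """Toy accumulator that counts tokens reported by each model call."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.call_count = 0

    def add_usage(self, prompt_tokens: int, completion_tokens: int) -> None:
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens
        self.call_count += 1

    def get_usage(self) -> Dict[str, int]:
        return {
            "prompt_tokens": self.prompt_tokens,
            "completion_tokens": self.completion_tokens,
            "total_tokens": self.prompt_tokens + self.completion_tokens,
            "call_count": self.call_count,
        }

    def __enter__(self) -> "SimpleTokenTracker":
        self.reset()  # start each tracked block from zero
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        print("Token usage:", self.get_usage())
        return False  # never swallow exceptions from the tracked block


tracker = SimpleTokenTracker()
with tracker:
    tracker.add_usage(prompt_tokens=120, completion_tokens=30)
    tracker.add_usage(prompt_tokens=80, completion_tokens=20)
```

Resetting on entry keeps the counts scoped to one block, while calling `reset()` and `get_usage()` by hand gives finer control over what is measured.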

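Use Cases

Enterprise Knowledge Management

Internal document retrieval and question answering
Knowledge base construction and maintenance
Intelligent assistants for technical documentation

Academic Research

Literature retrieval and analysis
Research on knowledge graph construction
Performance evaluation of RAG systems

Content Creation

Writing assistance and source-material retrieval
Multi-document content integration
Intelligent content recommendation

Project Advantages

Easy to integrate: offers a simple Python API and a REST API
Highly customizable: supports multiple models and storage backends
Performance-oriented: supports batching and asynchronous processing
Visualization: built-in knowledge graph visualization
Enterprise-ready: supports enterprise-grade databases such as PostgreSQL

Summary

LightRAG is a full-featured, easy-to-use RAG framework that is particularly well suited to building question-answering systems and knowledge management platforms. Its flexible architecture and broad feature set make it a strong open-source option in the RAG space.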