Python SDK for LlamaCloud services, providing knowledge agents and cloud data management solutions.

MIT License · run-llama/llama_cloud_services · 4.2k stars · Last Updated: October 06, 2025

Detailed Introduction to the LlamaCloud Services Project

Project Overview

LlamaCloud Services is a Python SDK developed by the LlamaIndex team for interacting with LlamaCloud services. It provides a suite of knowledge-agent and data-management tools designed for Large Language Model (LLM) applications, covering core functionality such as intelligent document parsing, structured data extraction, and cloud-based index management.

Core Service Components

🔍 LlamaParse - AI-Native Document Parser

LlamaParse is the world's first GenAI-native document parser, built specifically for LLM use cases, featuring:

Supported Formats:

  • Supports 130+ file formats (PDF, DOCX, PPTX, XLSX, ODT, ODS, HTML, EPUB, images, EML, etc.)
  • Specifically optimized for parsing tables and charts in complex PDF documents
  • Supports multimodal parsing, using LLMs and LVMs (large vision models) to process complex documents

Parsing Modes:

  • Cost Effective: Optimized for speed and cost; suited to text-heavy documents with simple structure
  • Agentic: The default mode; suited to documents containing images and charts
  • Agentic Plus: Highest fidelity; suited to complex layouts, tables, and visual structures
  • Use-case Oriented: Dedicated parsing options for specific document types (invoices, forms, technical resumes, scientific papers)

Technical Features:

  • Markdown output that preserves the document's semantic structure (see the parsing sketch after this list)
  • Advanced table, chart, and layout extraction
  • Visual citations that trace extracted content back to its location in the source document
  • Layout-aware parsing that breaks pages down into visual blocks
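
To make the Markdown-oriented output concrete, here is a minimal sketch based on the patterns in the SDK README; the sample file name is a placeholder, and attribute names such as page.md and get_markdown_documents may vary slightly between SDK versions:

from llama_cloud_services import LlamaParse

parser = LlamaParse(
    api_key="YOUR_API_KEY",
    result_type="markdown",  # request Markdown that keeps headings, tables, and lists
)

# Parse a document and inspect the structured result
result = parser.parse("./report_with_tables.pdf")

# Page-level access: each page exposes Markdown and plain-text views
for page in result.pages:
    print(page.md)

# Or convert the result into LlamaIndex Document objects for downstream use
markdown_documents = result.get_markdown_documents(split_by_page=True)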

📊 LlamaExtract - Intelligent Data Extractor

LlamaExtract is a pre-built intelligent data extractor that converts unstructured documents into a structured JSON representation.

Core Functions:

  • Extracts structured data based on user-defined schemas (see the sketch at the end of this section)
  • Supports agentic data extraction workflows
  • Handles scenarios such as resume screening and form data extraction
  • Automated data validation and cleaning

Use Cases:

  • Resume and job application processing
  • Financial document data extraction
  • Form and survey data structuring
  • Contract and legal document information extraction
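
To illustrate the schema-driven workflow for a resume-screening case, here is a minimal sketch; the Resume fields and the file name are hypothetical, and the create_agent / extract calls follow the patterns shown in the SDK README:

from pydantic import BaseModel
from llama_cloud_services import LlamaExtract

# A small, hypothetical schema describing the fields to pull out of a resume
class Resume(BaseModel):
    name: str
    email: str
    years_of_experience: int
    skills: list[str]

extractor = LlamaExtract(api_key="YOUR_API_KEY")

# Create an extraction agent bound to the schema
agent = extractor.create_agent(name="resume-screening", data_schema=Resume)

# Run extraction against a document and read the structured result
result = agent.extract("./candidate_resume.pdf")
print(result.data)  # dict whose keys match the Resume schema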

🗂️ LlamaCloud Index - Cloud Indexing Service

LlamaCloud Index is a highly customizable, fully automated document ingestion pipeline that also provides retrieval capabilities.

Features:

  • Automated document ingestion and indexing
  • Supports integration with various data sources
  • Provides retrieval API services (see the retrieval sketch after this list)
  • Scalable cloud storage solutions
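
Once a pipeline has ingested documents, retrieval works through the standard LlamaIndex interfaces. A minimal sketch, with placeholder index and project names; the query engine additionally assumes an LLM is configured for answer synthesis:

from llama_cloud_services import LlamaCloudIndex

index = LlamaCloudIndex(
    "my_index",
    project_name="default",
    api_key="YOUR_API_KEY",
)

# Low-level retrieval: fetch the most relevant nodes for a query
retriever = index.as_retriever()
nodes = retriever.retrieve("What does the onboarding policy say about laptops?")

# Higher-level: a query engine that retrieves and then synthesizes an answer
query_engine = index.as_query_engine()
print(query_engine.query("Summarize the onboarding policy."))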

📋 LlamaReport - Intelligent Report Generator

LlamaReport is a pre-built intelligent report builder that can construct reports from multiple data sources (currently in beta/invite-only phase).

Installation and Usage

Basic Installation

pip install llama-cloud-services

Basic Usage

from llama_cloud_services import (
    LlamaParse,
    LlamaExtract,
    LlamaCloudIndex,
    LlamaReport
)

# Document Parsing
parser = LlamaParse(api_key="YOUR_API_KEY")
result = parser.parse("./document.pdf")

# Data Extraction
extract = LlamaExtract(api_key="YOUR_API_KEY")
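# your_schema: a Pydantic model or JSON-schema dict describing the fields to extract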
agent = extract.create_agent(name="data-extraction", data_schema=your_schema)

# Cloud Indexing
index = LlamaCloudIndex(
    "my_index",
    project_name="default",
    api_key="YOUR_API_KEY"
)

# Report Generation
report = LlamaReport(api_key="YOUR_API_KEY")

Command Line Tools

# Set environment variable after obtaining API key
export LLAMA_CLOUD_API_KEY='llx-...'

# Parse document to text
llama-parse my_file.pdf --result-type text --output-file output.txt

# Parse document to Markdown
llama-parse my_file.pdf --result-type markdown --output-file output.md

# Output raw JSON
llama-parse my_file.pdf --output-raw-json --output-file output.json

Integration and Compatibility

LlamaIndex Integration

from llama_cloud_services import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(api_key="YOUR_API_KEY")

# Direct integration into SimpleDirectoryReader
reader = SimpleDirectoryReader(
    input_files=["./document.pdf"],
    file_extractor={".pdf": parser}
)
documents = reader.load_data()
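
From here the parsed documents can feed any LlamaIndex index. A short sketch, assuming a default embedding model and LLM (for example, OpenAI credentials) are already configured:

from llama_index.core import VectorStoreIndex

# Build an in-memory vector index over the parsed documents
index = VectorStoreIndex.from_documents(documents)

# Ask questions that are answered from the retrieved document content
query_engine = index.as_query_engine()
print(query_engine.query("What are the key findings in the document?"))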

Multilingual and Regional Support

# EU region support
from llama_cloud_services import LlamaParse, EU_BASE_URL

parser = LlamaParse(
    api_key="YOUR_API_KEY",
    base_url=EU_BASE_URL,
    language="en"  # Supports multiple languages
)

Technical Features

🚀 Performance Optimization

  • Multi-worker parallel processing
  • Asynchronous parsing support
  • Batch file processing capability (see the async/batch sketch after this list)
  • Intelligent caching mechanism
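
A rough sketch of combining the batch and asynchronous options above; the aparse method and num_workers parameter follow the SDK README, and the file names are placeholders:

import asyncio

from llama_cloud_services import LlamaParse

parser = LlamaParse(
    api_key="YOUR_API_KEY",
    num_workers=4,  # parse several files in parallel
)

async def parse_batch():
    # aparse accepts a list of files and returns one result per input file
    results = await parser.aparse(["./doc1.pdf", "./doc2.pdf", "./doc3.pdf"])
    for result in results:
        print(len(result.pages), "pages parsed")

asyncio.run(parse_batch())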

🔧 High Customizability

  • Flexible parsing parameter configuration
  • Custom data schema definition
  • Multiple output format options
  • Configurable quality levels

🛡️ Enterprise-Grade Features

  • Data privacy protection
  • High-availability cloud services
  • API rate limiting and quota management
  • Detailed usage statistics

Pricing Model

LlamaParse Pricing

  • Free Plan: Up to 1000 pages per day
  • Paid Plan: 7000 free pages per week + additional pages at $0.003/page
  • Enterprise Plan: Supports high volume and on-premise deployment

Usage Limits

  • A single file is limited to roughly 3,000 pages
  • Maximum supported file size varies by format
  • API call frequency limits

Application Scenarios

📚 Intelligent Document Processing

  • Academic paper parsing and knowledge extraction
  • Technical document structuring
  • Legal contract information extraction
  • Financial report data analysis

🏢 Enterprise Data Management

  • Building internal document knowledge bases
  • Customer profile data extraction
  • Business process automation
  • Compliance document processing

🔬 Research and Development

  • Scientific literature data mining
  • Patent document analysis
  • Technical report processing
  • Dataset construction and cleaning

Development and Deployment

Development Environment Setup

  1. Register for a LlamaCloud account: https://cloud.llamaindex.ai/
  2. Obtain an API key
  3. Install the Python SDK
  4. Configure environment variables (see the sketch below)
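
As with the CLI, the SDK clients can pick up the key from the LLAMA_CLOUD_API_KEY environment variable, so production code does not need to hard-code credentials. A minimal sketch, assuming that fallback behavior:

import os

from llama_cloud_services import LlamaParse

# The key is typically injected by the deployment environment,
# e.g. export LLAMA_CLOUD_API_KEY='llx-...'
assert "LLAMA_CLOUD_API_KEY" in os.environ, "set LLAMA_CLOUD_API_KEY first"

# No api_key argument needed: the client falls back to the environment variable
parser = LlamaParse(result_type="markdown")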

Production Environment Deployment

  • Supports cloud API calls
  • Integrates into existing data pipelines
  • Supports batch processing workflows
  • Provides monitoring and logging capabilities

MCP (Model Context Protocol) Support

LlamaCloud Services is complemented by an MCP server, shipped as the separate llamacloud-mcp package, which exposes LlamaCloud indexes and extraction agents to MCP-enabled clients such as Claude Desktop:

# The MCP server is its own package, not part of this SDK
pip install llamacloud-mcp

# The server is then registered in the MCP client's configuration (for example,
# Claude Desktop's config file) and pointed at your LlamaCloud indexes and agents;
# see the llamacloud-mcp README for the exact command-line flags.

Future Development

LlamaCloud Services is continuously improving in the following areas:

  • Support for more file formats
  • Enhanced chart and table parsing capabilities
  • Better multilingual support
  • Advanced AI agent functionalities
  • More enterprise-grade features

This project represents cutting-edge technology in the field of document processing and knowledge management, providing robust data infrastructure support for building high-quality LLM applications.

Star History Chart