chroma-core/chroma-mcp

Model Context Protocol (MCP) server implementation providing database functionality for Chroma, supporting AI data retrieval capabilities such as vector search, full-text search, and metadata filtering.

Apache-2.0 | Python | 188 | chroma-core | Last Updated: 2025-06-11

https://github.com/chroma-core/chroma-mcp

Chroma MCP - Model Context Protocol Server

Project Overview

Chroma MCP is a Model Context Protocol (MCP) server for Chroma, the open-source embedding database. It exposes Chroma's data storage, retrieval, and management operations to LLM applications through a standardized interface.

The Model Context Protocol is an open protocol for integrating LLM applications with external data sources and tools, so that AI models can be given the context they need. Through this protocol, Chroma MCP lets AI models create data collections, store user input and generated data, and retrieve that data with vector search, full-text search, and metadata filtering.

Core Features and Characteristics

🔧 Flexible Client Types

Chroma MCP supports multiple client configurations to meet the needs of different scenarios:

Ephemeral Client: Stores data in memory; suited to testing and development environments.
Persistent Client: Stores data in files, so it persists locally between runs.
HTTP Client: Connects to a self-hosted Chroma instance.
Cloud Client: Connects to the Chroma Cloud service, automatically using api.trychroma.com.

📁 Collection Management Features

Provides complete collection lifecycle management:

Creation and Configuration: Create new collections and configure HNSW parameters for optimized vector search.
Modification and Deletion: Supports modification of collection names and metadata, as well as complete deletion.
Information Query: Retrieve detailed collection information, statistics, and document counts.
Paginated Listing: List collections one page at a time.
Embedding Function Selection: Choose different embedding functions when creating a collection.
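
To make the HNSW configuration step more concrete, here is a minimal sketch of validating tuning values before a collection is created. The `hnsw:*` key names follow the collection-metadata convention Chroma has used for HNSW settings. The helper `build_hnsw_metadata` is hypothetical, and the argument shape that chroma_create_collection accepts may differ between versions, so treat the keys as assumptions to check against the tool's schema.

```python
# Hypothetical helper: validates HNSW tuning values and returns them as
# collection metadata. Key names are assumed from Chroma's "hnsw:*" convention.
VALID_SPACES = {"l2", "cosine", "ip"}  # squared L2, cosine, inner product


def build_hnsw_metadata(space="l2", construction_ef=None, search_ef=None):
    if space not in VALID_SPACES:
        raise ValueError(f"unknown distance space: {space!r}")
    metadata = {"hnsw:space": space}
    for key, value in (("hnsw:construction_ef", construction_ef),
                       ("hnsw:search_ef", search_ef)):
        if value is None:
            continue  # leave the database default in place
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer")
        metadata[key] = value
    return metadata


hnsw_metadata = build_hnsw_metadata("cosine", construction_ef=200, search_ef=50)
print(hnsw_metadata)
```

Larger ef values generally buy better recall at the cost of slower indexing or querying, so they are worth tuning only when the defaults fall short.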

📄 Document Operation Capabilities

Comprehensive document management and operation features:

Document Addition: Supports adding documents with optional metadata and custom IDs.
Semantic Query: Use semantic search to query documents, supporting advanced filtering.
Document Retrieval: Retrieve documents by ID or filter, supporting pagination.
Document Update: Update the content, metadata, or embeddings of existing documents.
Document Deletion: Delete specific documents from a collection.
Full-Text Search: Match documents on their text content.
Advanced Filtering: Supports advanced filtering based on metadata and document content.

🤖 Diverse Embedding Function Support

Chroma MCP supports a variety of embedding functions, providing choices for different application scenarios:

default: Default embedding function
cohere: Cohere embedding service
openai: OpenAI embedding service
jina: Jina AI embedding service
voyageai: Voyage AI embedding service
roboflow: Roboflow embedding service

🔍 Rich API Tools

Provides a complete set of API tools:

chroma_list_collections: Paginated collection listing
chroma_create_collection: Create a new collection with optional HNSW configuration
chroma_peek_collection: View sample documents in a collection
chroma_get_collection_info: Get detailed collection information
chroma_get_collection_count: Get the number of documents in a collection
chroma_modify_collection: Update collection name or metadata
chroma_delete_collection: Delete a collection
chroma_add_documents: Add documents with metadata and custom IDs
chroma_query_documents: Query documents using semantic search and advanced filtering
chroma_get_documents: Retrieve documents by ID or filter
chroma_update_documents: Update document content, metadata, or embeddings
chroma_delete_documents: Delete specific documents
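
An MCP client invokes these tools with a JSON-RPC `tools/call` request. The sketch below builds two such requests in Python. The argument names (`collection_name`, `documents`, `ids`, `metadatas`, `query_texts`, `n_results`) are assumptions modeled on Chroma's Python API, and the collection name and document text are made up for illustration; confirm the real schema with the server's `tools/list` response.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tool_call(name, arguments):
    """Build an MCP `tools/call` request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are assumptions, not confirmed by this document.
add_request = tool_call("chroma_add_documents", {
    "collection_name": "notes",
    "documents": ["Chroma MCP exposes collections over MCP."],
    "ids": ["note-1"],
    "metadatas": [{"topic": "protocol"}],
})
query_request = tool_call("chroma_query_documents", {
    "collection_name": "notes",
    "query_texts": ["How are collections exposed?"],
    "n_results": 3,
})

print(json.dumps(query_request, indent=2))
```

In practice an MCP client library handles the transport and request ids; the point here is only the shape of the message that carries a tool name and its arguments.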

Configuration and Deployment

Claude Desktop Integration Configuration

Ephemeral Client Configuration:

"chroma": {
  "command": "uvx",
  "args": ["chroma-mcp"]
}

Persistent Client Configuration:

"chroma": {
  "command": "uvx",
  "args": [
    "chroma-mcp",
    "--client-type", "persistent",
    "--data-dir", "/full/path/to/your/data/directory"
  ]
}

Cloud Client Configuration:

"chroma": {
  "command": "uvx",
  "args": [
    "chroma-mcp",
    "--client-type", "cloud",
    "--tenant", "your-tenant-id",
    "--database", "your-database-name",
    "--api-key", "your-api-key"
  ]
}

HTTP Client Configuration:

"chroma": {
  "command": "uvx",
  "args": [
    "chroma-mcp",
    "--client-type", "http",
    "--host", "your-host",
    "--port", "your-port",
    "--custom-auth-credentials", "your-custom-auth-credentials",
    "--ssl", "true"
  ]
}
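
The four snippets above differ only in their argument lists, so they can be generated from one function. The following is a minimal sketch, not part of the project: `chroma_server_entry` is a hypothetical helper, and wrapping the entry in a top-level `mcpServers` object reflects how Claude Desktop's configuration file is usually laid out.

```python
import json


def chroma_server_entry(client_type="ephemeral", **options):
    """Return a Claude Desktop server entry that launches chroma-mcp via uvx."""
    args = ["chroma-mcp"]
    if client_type != "ephemeral":  # the ephemeral snippet passes no flags
        args += ["--client-type", client_type]
    for key, value in options.items():
        # data_dir -> --data-dir, api_key -> --api-key, and so on
        args += ["--" + key.replace("_", "-"), str(value)]
    return {"command": "uvx", "args": args}


config = {
    "mcpServers": {
        "chroma": chroma_server_entry(
            "persistent", data_dir="/full/path/to/your/data/directory"
        )
    }
}
print(json.dumps(config, indent=2))
```

Generating the entry keeps flag spelling consistent across environments; the printed JSON can be pasted into the configuration file as is.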

Environment Variable Configuration

The server can also be configured through environment variables, which keeps settings out of the client configuration file:

# General Variables
export CHROMA_CLIENT_TYPE="http"
export CHROMA_DATA_DIR="/full/path/to/your/data/directory"  # used by the persistent client

# Cloud Client Configuration
export CHROMA_TENANT="your-tenant-id"
export CHROMA_DATABASE="your-database-name"
export CHROMA_API_KEY="your-api-key"

# HTTP Client Configuration
export CHROMA_HOST="your-host"
export CHROMA_PORT="your-port"
export CHROMA_SSL="true"

# Embedding Function API Keys
export CHROMA_COHERE_API_KEY="your-cohere-key"
export CHROMA_OPENAI_API_KEY="your-openai-key"
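
As an illustration of how these variables fit together, the sketch below resolves them into a settings dictionary and reports any that are missing for the chosen client type. It is not the server's actual startup code: the `REQUIRED_BY_CLIENT` mapping, the `resolve_settings` helper, and the fallback to the ephemeral client when CHROMA_CLIENT_TYPE is unset are all assumptions made for the example.

```python
import os

# Assumed mapping from client type to the variables it cannot run without.
REQUIRED_BY_CLIENT = {
    "ephemeral": [],
    "persistent": ["CHROMA_DATA_DIR"],
    "http": ["CHROMA_HOST"],
    "cloud": ["CHROMA_TENANT", "CHROMA_DATABASE", "CHROMA_API_KEY"],
}


def resolve_settings(env=None):
    """Validate CHROMA_* variables and return them as a plain dict."""
    env = os.environ if env is None else env
    client_type = env.get("CHROMA_CLIENT_TYPE", "ephemeral")
    if client_type not in REQUIRED_BY_CLIENT:
        raise ValueError(f"unsupported CHROMA_CLIENT_TYPE: {client_type!r}")
    missing = [n for n in REQUIRED_BY_CLIENT[client_type] if not env.get(n)]
    if missing:
        raise ValueError(f"{client_type} client is missing: {', '.join(missing)}")
    settings = {"client_type": client_type}
    for name in REQUIRED_BY_CLIENT[client_type]:
        # CHROMA_DATA_DIR -> data_dir, CHROMA_HOST -> host, ...
        settings[name[len("CHROMA_"):].lower()] = env[name]
    if client_type == "http":
        settings["ssl"] = env.get("CHROMA_SSL", "").lower() == "true"
        if env.get("CHROMA_PORT"):
            settings["port"] = env["CHROMA_PORT"]
    return settings


example = resolve_settings({
    "CHROMA_CLIENT_TYPE": "http",
    "CHROMA_HOST": "localhost",
    "CHROMA_PORT": "8000",
    "CHROMA_SSL": "true",
})
print(example)
```

Failing early with the names of the missing variables is usually easier to debug than a connection error at the first tool call.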

Technical Features

Embedding Function Persistence

Chroma supports embedding function persistence from v1.0.0 onward. When a collection is created with a specific embedding function, that choice is stored with the collection, and later query and insertion operations use the same function automatically, so it does not need to be specified again.
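
The idea can be pictured with a toy registry that records the embedding function name when a collection is created and looks it up on later operations. This is only a conceptual sketch: the JSON file, the `create_collection` and `embedding_function_for` helpers, and the collection name are invented for the example and do not reflect how Chroma actually stores its configuration.

```python
import json
import os
import tempfile


def _load(path):
    """Read the toy registry, or start empty if it does not exist yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def create_collection(path, name, embedding_function="default"):
    """Record the embedding function once, at creation time."""
    registry = _load(path)
    registry[name] = {"embedding_function": embedding_function}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(registry, fh)


def embedding_function_for(path, name):
    """Later operations look the function up instead of asking the caller."""
    return _load(path)[name]["embedding_function"]


registry_path = os.path.join(tempfile.mkdtemp(), "collections.json")
create_collection(registry_path, "notes", embedding_function="openai")
print(embedding_function_for(registry_path, "notes"))
```

Because the choice travels with the collection, queries cannot accidentally embed text with a different model than the one used at insertion time.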

Security Considerations

Use the --dotenv-path parameter to point the server at an environment configuration file rather than passing API keys directly as command-line arguments, where they are more easily exposed.
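
A minimal sketch of this setup follows: keys live in an environment file, and the argument list carries only the file's path. The `parse_dotenv` function is a deliberately simple stand-in written for this example (the server's own loader may behave differently), and the file path is a placeholder.

```python
def parse_dotenv(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments (simplified)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignore
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample_env_file = """
# Embedding Function API Keys
CHROMA_OPENAI_API_KEY="your-openai-key"
export CHROMA_COHERE_API_KEY=your-cohere-key
"""
secrets = parse_dotenv(sample_env_file)

# Only the path appears in the process arguments, never the key itself.
args = ["chroma-mcp", "--dotenv-path", "/full/path/to/your/.env"]
print(sorted(secrets))
```

Keep the environment file out of version control and restrict its permissions, since it now holds the secrets that the command line no longer does.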

Advanced Search Capabilities

Vector Search: Ranks documents by the semantic similarity of their embeddings to the query.
Full-Text Search: Matches documents on their literal text content.
Metadata Filtering: Selects documents whose metadata satisfies exact conditions.
Hybrid Search: Combines the methods above in a single compound query.
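
To show how metadata and document filters combine, the sketch below evaluates Chroma-style filter expressions against a few in-memory records. The operators (`$and`, `$or`, `$eq`, `$gte`, `$contains`, and so on) follow Chroma's filter syntax, but the evaluator itself is a simplified illustration written for this example, not Chroma's implementation, and the sample records are invented. In a real compound query the surviving candidates would then be ranked by vector similarity.

```python
import operator

# Comparison operators from Chroma's `where` filter syntax.
OPS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(metadata, where):
    """Return True if `metadata` satisfies the `where` expression."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, c) for c in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, c) for c in condition):
                return False
        else:
            if key not in metadata:
                return False  # simplification: a missing field never matches
            if not isinstance(condition, dict):
                condition = {"$eq": condition}  # {"topic": "x"} shorthand
            if not all(OPS[op](metadata[key], t) for op, t in condition.items()):
                return False
    return True


def document_matches(text, where_document):
    """Support the `$contains` operator only (case-sensitive substring)."""
    return where_document["$contains"] in text


records = [
    {"id": "a", "text": "Chroma stores embeddings", "meta": {"topic": "db", "year": 2024}},
    {"id": "b", "text": "MCP connects tools to models", "meta": {"topic": "protocol", "year": 2025}},
    {"id": "c", "text": "Chroma MCP exposes collections", "meta": {"topic": "protocol", "year": 2025}},
]
where = {"$and": [{"topic": "protocol"}, {"year": {"$gte": 2025}}]}
where_document = {"$contains": "Chroma"}

hits = [r["id"] for r in records
        if matches(r["meta"], where) and document_matches(r["text"], where_document)]
print(hits)
```

Record "a" fails the metadata filter and record "b" fails the text filter, so only "c" survives both.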

Application Scenarios

Shared Knowledge Base

Build a shared knowledge base for teams or organizations, supporting intelligent retrieval and knowledge discovery.

Context Window Memory

Add long-term memory to LLM applications, so that information no longer has to fit inside the context window.

Document Question Answering System

Build an intelligent question answering system based on a document library, supporting semantic search and precise retrieval.

Personal Knowledge Management

Create a personal knowledge management system, supporting multi-modal data storage and intelligent retrieval.

Project Summary

Chroma MCP combines the Chroma vector database with the standardized interface of the Model Context Protocol. Its multiple client types, document operations, and configuration options give developers a data layer for building AI applications.

The same tool API applies whether you use the ephemeral client for prototyping or the Chroma Cloud integration in production. Support for multiple embedding functions, together with vector search, full-text search, and metadata filtering, covers the retrieval needs of most LLM applications.

The project is open source and maintained by chroma-core. For developers who want to add data retrieval to LLM applications, Chroma MCP is a solution worth considering.