mendableai/firecrawl-mcp-server

Official Firecrawl MCP server - adds powerful web crawling capabilities for Cursor, Claude, and other LLM clients

MIT | JavaScript | 3.4k | mendableai | Last Updated: 2025-06-04
https://github.com/mendableai/firecrawl-mcp-server

Firecrawl MCP Server Detailed Introduction

Project Overview

Firecrawl MCP Server is the official Model Context Protocol (MCP) server implementation from the Mendable AI team. It exposes Firecrawl's web crawling capabilities to Large Language Model (LLM) clients such as Cursor and Claude Desktop, so AI assistants can fetch and analyze web content in real time.

Project Features:

  • 🎯 Official Support: Officially maintained by the Firecrawl team
  • 🔌 Plug-and-Play: Easily integrated into various LLM clients via MCP
  • High Performance: Supports JavaScript rendering and intelligent batch processing
  • 🛡️ Enterprise-Grade: Built-in retry mechanisms, rate limiting, and error handling

Core Features

🕷️ Web Crawling and Scraping

  • Single Page Crawling: Quickly retrieves the complete content of a specified webpage
  • JavaScript Rendering: Handles dynamically loaded modern web applications
  • Batch Crawling: Efficiently processes multiple URLs with built-in parallel processing and rate limiting
  • Deep Crawling: Supports recursive crawling of multi-level website structures
  • Mobile Support: Simulates mobile and desktop device perspectives

🔍 Intelligent Search and Discovery

  • Web Search: Integrates search engine functionality to automatically discover relevant content
  • URL Discovery: Intelligently identifies and extracts links from webpages
  • Content Filtering: Supports tag inclusion/exclusion for precise control over crawled content
  • Deduplication: Automatically identifies and handles similar URLs

🧠 AI-Powered Content Extraction

  • Structured Extraction: Uses LLMs to extract structured data from webpages
  • Custom Prompts: Supports custom extraction rules and data schemas
  • Deep Research: Combines crawling, searching, and AI analysis for comprehensive research capabilities
  • llms.txt Generation: Generates standardized LLM interaction files for websites

🔧 Technical Features

  • Automatic Retries: Exponential backoff algorithm handles failed requests
  • Rate Limiting: Intelligent queue and throttling mechanisms
  • Credit Monitoring: Real-time tracking of API usage and costs
  • Multi-Environment Support: Supports both cloud API and self-hosted instances
  • SSE Support: Server-Sent Events for real-time communication

Supported Client Platforms

Cursor IDE

  • Version Requirement: 0.45.6+
  • Integration Method: Configuration via MCP server
  • Functionality: Composer Agent automatically calls web crawling functions

Claude Desktop

  • Integrated via configuration file
  • Supports environment variable configuration
  • Full feature support
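
As a rough illustration of the configuration-file approach, the Python sketch below prints a JSON entry of the kind MCP clients such as Claude Desktop typically read. The `mcpServers` layout, the server label `firecrawl-mcp`, and the helper name `build_mcp_config` are assumptions made for illustration; check the client's documentation for the exact file location and schema.

```python
import json


def build_mcp_config(api_key: str) -> dict:
    """Build a client config entry that launches the server through npx.

    The "mcpServers" layout is an assumption based on common MCP client
    configuration files; it is not taken from this page.
    """
    return {
        "mcpServers": {
            "firecrawl-mcp": {  # hypothetical label for the server entry
                "command": "npx",
                "args": ["-y", "firecrawl-mcp"],
                "env": {"FIRECRAWL_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Print the entry so it can be pasted into the client's config file.
    print(json.dumps(build_mcp_config("fc-YOUR_API_KEY"), indent=2))
```

The command and arguments match the npx invocation shown under Quick Start; the API key travels through the `env` block rather than the command line.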

VS Code

  • Supported via MCP extension
  • Configurable workspace-level settings
  • Supports team collaboration configuration

Windsurf

  • Native MCP support
  • Simple JSON configuration

Main Tool Functions

1. firecrawl_scrape

Scrapes the content of a single page, with advanced options:

  • Multiple output formats (Markdown, HTML, structured data)
  • Main content extraction only
  • Custom wait times and timeout settings
  • Tag filtering and mobile device simulation
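
To make these options concrete, here is a sketch of the JSON-RPC `tools/call` request an MCP client might send for `firecrawl_scrape`. The argument names (`formats`, `onlyMainContent`, `waitFor`, `timeout`, `mobile`, `includeTags`, `excludeTags`) are assumptions that mirror the options listed above; confirm them against the tool's published input schema before relying on them.

```python
import json


def scrape_request(url: str, request_id: int = 1) -> dict:
    """Assemble an MCP tools/call request for the firecrawl_scrape tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "firecrawl_scrape",
            "arguments": {  # argument names are assumed, not confirmed here
                "url": url,
                "formats": ["markdown"],   # requested output format
                "onlyMainContent": True,   # main content extraction only
                "waitFor": 1000,           # ms to wait for dynamic content
                "timeout": 30000,          # ms before the request gives up
                "mobile": False,           # desktop rather than mobile view
                "includeTags": ["article", "main"],
                "excludeTags": ["nav", "footer"],
            },
        },
    }


print(json.dumps(scrape_request("https://example.com"), indent=2))
```

In practice the LLM client builds this request itself; the sketch only shows which knobs the tool call carries.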

2. firecrawl_batch_scrape

Batch crawling of multiple URLs:

  • Parallel processing for increased efficiency
  • Built-in rate limiting protection
  • One set of configuration options applied to every URL in the batch
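
The idea of parallel processing with a safety limit can be sketched with bounded concurrency in Python's `asyncio`. This is a generic illustration, not the server's implementation: `fake_scrape` is a hypothetical stand-in for a real scrape call, and a concurrency cap is only one simple form of throttling.

```python
import asyncio


async def fake_scrape(url: str) -> str:
    """Hypothetical stand-in for a real scrape call; sleeps, then echoes the URL."""
    await asyncio.sleep(0.01)
    return f"content of {url}"


async def batch_scrape(urls: list[str], max_concurrent: int = 2) -> dict[str, str]:
    """Scrape URLs in parallel while capping the number of in-flight requests."""
    limiter = asyncio.Semaphore(max_concurrent)

    async def worker(url: str) -> tuple[str, str]:
        async with limiter:  # at most max_concurrent workers run past this point
            return url, await fake_scrape(url)

    pairs = await asyncio.gather(*(worker(u) for u in urls))
    return dict(pairs)


results = asyncio.run(
    batch_scrape(["https://example.com/a", "https://example.com/b", "https://example.com/c"])
)
```

Raising `max_concurrent` increases throughput at the cost of more simultaneous requests against the target site or API.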

3. firecrawl_search

Web search and content extraction:

  • Multi-language and region support
  • Automatic extraction of search result content
  • Configurable result quantity limit

4. firecrawl_crawl

Website deep crawling:

  • Recursive crawling of multiple layers of pages
  • Intelligent URL deduplication
  • External link control
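
A minimal sketch of how recursive crawling, deduplication, and external-link control fit together is a breadth-first traversal with a depth limit. The link graph `LINKS` and the names `normalize` and `crawl` are hypothetical, and real deduplication is likely more elaborate than dropping URL fragments.

```python
from collections import deque
from urllib.parse import urldefrag, urlparse

# Hypothetical link graph standing in for fetched pages.
LINKS = {
    "https://example.com/": [
        "https://example.com/docs",
        "https://example.com/#top",
        "https://other.org/",
    ],
    "https://example.com/docs": ["https://example.com/", "https://example.com/docs/api"],
    "https://example.com/docs/api": [],
    "https://other.org/": [],
}


def normalize(url: str) -> str:
    """Drop the #fragment so anchors on the same page count as one URL."""
    return urldefrag(url).url


def crawl(start: str, max_depth: int, allow_external: bool = False) -> list[str]:
    """Breadth-first crawl with a depth limit, deduplication, and external-link control."""
    home = urlparse(start).netloc
    seen = {normalize(start)}
    queue = deque([(normalize(start), 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for link in LINKS.get(url, []):
            link = normalize(link)
            if link in seen:
                continue  # already queued or visited
            if not allow_external and urlparse(link).netloc != home:
                continue  # skip links that leave the starting site
            seen.add(link)
            queue.append((link, depth + 1))
    return visited
```

Calling `crawl("https://example.com/", max_depth=2)` visits the three `example.com` pages once each and never leaves the site; passing `allow_external=True` lets the traversal follow the link to `other.org` as well.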

5. firecrawl_extract

AI-driven structured data extraction:

  • Custom JSON Schema
  • LLM-based analysis of page content
  • Batch data processing
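
As an illustration of schema-driven extraction, the sketch below defines a small JSON Schema, wraps it in a set of tool arguments, and checks a returned record against it. The argument names (`urls`, `prompt`, `schema`) and the helper `missing_or_mistyped` are assumptions for illustration, and the check covers only two JSON types; a full validator would handle the whole JSON Schema vocabulary.

```python
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name", "price"],
}

extract_arguments = {  # argument names are assumed for illustration
    "urls": ["https://example.com/product"],
    "prompt": "Extract the product name and price.",
    "schema": PRODUCT_SCHEMA,
}

# Python types accepted for the two JSON types this sketch understands.
_PY_TYPES = {"string": str, "number": (int, float)}


def missing_or_mistyped(record: dict, schema: dict) -> list[str]:
    """Return fields that are required but absent, or present with the wrong type."""
    problems = [key for key in schema.get("required", []) if key not in record]
    for key, spec in schema.get("properties", {}).items():
        expected = _PY_TYPES.get(spec.get("type"))
        if key in record and expected and not isinstance(record[key], expected):
            problems.append(key)
    return problems
```

Checking the extracted record on the client side is a design choice worth considering, since LLM-based extraction can occasionally omit a field or return it in an unexpected type.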

6. firecrawl_deep_research

Comprehensive research analysis:

  • Multi-source information aggregation
  • Time and depth limits
  • AI-generated research reports

7. firecrawl_generate_llmstxt

Standardized file generation:

  • llms.txt files that describe how LLMs should interact with a website
  • Automated documentation generation
  • Full and simplified version support

Configuration and Deployment

Environment Variable Configuration

# Required configuration (Cloud API)
FIRECRAWL_API_KEY=your-api-key

# Optional configuration (Self-hosted)
FIRECRAWL_API_URL=https://firecrawl.your-domain.com

# Retry mechanism configuration
FIRECRAWL_RETRY_MAX_ATTEMPTS=3
FIRECRAWL_RETRY_INITIAL_DELAY=1000
FIRECRAWL_RETRY_MAX_DELAY=10000
FIRECRAWL_RETRY_BACKOFF_FACTOR=2

# Credit monitoring configuration
FIRECRAWL_CREDIT_WARNING_THRESHOLD=1000
FIRECRAWL_CREDIT_CRITICAL_THRESHOLD=100

Quick Start

# Run directly using npx
env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

# Global installation
npm install -g firecrawl-mcp

# Start in SSE mode
env SSE_LOCAL=true FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

Advanced Features

Intelligent Retry Mechanism

  • Exponential Backoff Algorithm: Automatically adjusts retry intervals
  • Maximum Retry Attempts: Configurable failure handling strategy
  • Intelligent Error Identification: Distinguishes between transient and permanent errors
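
The retry behavior can be sketched in a few lines of Python using the values from the Environment Variable Configuration section (3 attempts, 1000 ms initial delay, 10000 ms cap, factor 2). The formula `initial * factor**n` capped at the maximum is the usual form of exponential backoff and is assumed here; the set of retryable status codes and the function names are likewise illustrative rather than taken from the server's source.

```python
def backoff_delays(max_attempts=3, initial_delay_ms=1000, max_delay_ms=10000, factor=2):
    """Delay before each retry: initial * factor**n, capped at max_delay_ms."""
    return [min(initial_delay_ms * factor ** n, max_delay_ms) for n in range(max_attempts)]


# Assumed set of HTTP status codes treated as transient (worth retrying).
TRANSIENT_STATUS = {429, 500, 502, 503, 504}


def should_retry(status: int, retries_done: int, max_attempts: int = 3) -> bool:
    """Retry only transient failures, and only while attempts remain."""
    return status in TRANSIENT_STATUS and retries_done < max_attempts


print(backoff_delays())                # [1000, 2000, 4000]
print(backoff_delays(max_attempts=6))  # [1000, 2000, 4000, 8000, 10000, 10000]
```

With the listed settings the cap never takes effect; it only matters once the attempt count or backoff factor is raised.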

Performance Optimization

  • Parallel Processing: Simultaneous processing of multiple URLs for increased efficiency
  • Intelligent Queue: Request prioritization and load balancing
  • Memory Management: Resource optimization for large batch tasks

Monitoring and Logging

  • Detailed Logs: Operation status, performance metrics, error tracking
  • Credit Monitoring: Real-time usage tracking and alerts
  • Rate Monitoring: API call frequency and limit status
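
Credit alerts can be pictured as a simple threshold check. The sketch below assumes that the two thresholds from the Environment Variable Configuration section (1000 and 100) refer to remaining credits and that the comparison is inclusive; both points, along with the function name `credit_level`, are assumptions rather than documented behavior.

```python
def credit_level(remaining: int, warning: int = 1000, critical: int = 100) -> str:
    """Classify the remaining credit balance against the two thresholds."""
    if remaining <= critical:
        return "critical"
    if remaining <= warning:
        return "warning"
    return "ok"


for balance in (5000, 800, 50):
    print(balance, credit_level(balance))
```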

Application Scenarios

Content Research and Analysis

  • Competitive analysis and market research
  • News and information aggregation
  • Academic research material collection
  • Trend analysis and data mining

Data Extraction and Organization

  • Batch extraction of product information
  • Contact information and directory organization
  • Price monitoring and comparison
  • Structured data generation

AI Assistant Enhancement

  • Real-time information query capabilities
  • Webpage content understanding and summarization
  • Multi-source information integration and analysis
  • Automated research report generation

Development and Integration

  • API data source supplementation
  • Content management system integration
  • Automated test data preparation
  • Documentation and knowledge base construction

Technical Advantages

Reliability

  • Fault Tolerance: Multi-level error handling and recovery
  • Stability Guarantee: Verified in large-scale production environments
  • Compatibility: Supports various deployment environments and configurations

Scalability

  • Modular Design: Functional components can be configured and used independently
  • API Compatibility: Supports both cloud and self-hosted modes
  • Plugin Architecture: Easy to extend and customize

Performance

  • High Concurrency: Optimized asynchronous processing architecture
  • Low Latency: Intelligent caching and pre-processing mechanisms
  • Resource Efficiency: Optimized use of memory and network resources

Community and Support

Open Source Community

  • MIT License: Fully open source, commercially friendly
  • Active Maintenance: Continuous updates and support from the official team
  • Community Contributions: Developers are welcome to participate in improvements

Technical Support

  • Detailed Documentation: Complete installation and usage guide
  • Sample Code: Rich usage examples and best practices
  • Issue Reporting: Quick response mechanism via GitHub Issues

Summary

Firecrawl MCP Server is a web crawling solution built for AI-assisted development. Beyond the basic functions of a traditional crawler, it integrates with LLM clients through MCP, so AI assistants can fetch and understand web content in real time.

Core Value:

  • Lowering the Barrier: Simplifies the complexity of web data acquisition in AI applications
  • Improving Efficiency: Intelligent batch processing and error handling mechanisms
  • Ensuring Quality: Enterprise-grade stability and reliability design
  • Promoting Innovation: Provides powerful data acquisition capabilities for AI application development

For individual developers and enterprise teams alike, and for tasks ranging from simple content extraction to complex data research, Firecrawl MCP Server offers an efficient and reliable way to bring web data into AI applications.