microsoft/playwright-mcp

Playwright-based Model Context Protocol (MCP) server providing browser automation capabilities for LLMs

Apache-2.0 | TypeScript | 11.8k | microsoft | Last Updated: 2025-06-13
https://github.com/microsoft/playwright-mcp

Microsoft Playwright MCP Project Details

Overview

Microsoft Playwright MCP is a Model Context Protocol (MCP) server that uses Playwright to give Large Language Models (LLMs) browser automation capabilities. Its central idea is to interact with web pages through structured accessibility snapshots, so it needs neither screenshots nor vision-tuned models.

This lets LLMs drive a browser and carry out multi-step web automation tasks using structured text rather than pixels, which makes each action cheaper to process and less ambiguous.

Core Features and Characteristics

🚀 Core Technical Advantages

  • Fast and Lightweight: Uses Playwright's accessibility tree structure instead of pixel-based input.
  • LLM-Friendly: Operates purely on structured data, eliminating the need for visual models.
  • Deterministic Tool Application: Avoids the ambiguity commonly associated with screenshot-based methods.
  • High Reliability: Provides stable and predictable automation results.

📋 Key Application Scenarios

  1. Web Navigation and Form Filling

    • Automated web browsing
    • Intelligent form data population
    • Multi-step operation workflows
  2. Structured Content Data Extraction

    • Web data scraping
    • Content parsing and extraction
    • Data structuring
  3. LLM-Driven Automated Testing

    • Intelligent test case generation
    • Automated regression testing
    • User behavior simulation
  4. General Browser Interaction for Agents

    • AI agent web page operation
    • Automated workflows
    • Intelligent web page assistant

Installation and Configuration

VS Code Integration Installation

Method 1: Configuration File

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp@latest"
      ]
    }
  }
}

Method 2: Command Line Installation

# VS Code
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'

# VS Code Insiders
code-insiders --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'

Running Modes

Headed Mode (Default)

Standard browser mode with a graphical interface, suitable for development and debugging:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

Headless Mode

Suitable for background or batch processing:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp@latest",
        "--headless"
      ]
    }
  }
}

SSE Transport Mode

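For systems without a display, or when the server is launched from an IDE worker process, start the server with the --port flag to enable SSE transport: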

npx @playwright/mcp@latest --port 8931

Then point the MCP client configuration at the SSE endpoint:

{
  "mcpServers": {
    "playwright": {
      "url": "http://localhost:8931/sse"
    }
  }
}
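
The link between the --port flag and the client URL can be sketched in TypeScript. This is only an illustration under stated assumptions: `sseClientConfig` is a hypothetical helper that is not part of the project, and the `localhost` host and `/sse` path are simply taken from the example above.

```typescript
// Hypothetical helper (not part of @playwright/mcp): builds the client-side
// entry for a server that was started with `--port <port>`.
interface SseServerEntry {
  url: string;
}

function sseClientConfig(
  port: number,
  host: string = "localhost"
): { mcpServers: Record<string, SseServerEntry> } {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid port: ${port}`);
  }
  // The /sse path mirrors the configuration example in this section.
  return { mcpServers: { playwright: { url: `http://${host}:${port}/sse` } } };
}

console.log(JSON.stringify(sseClientConfig(8931), null, 2));
```

The port passed to the helper has to match the one given to the server on the command line, otherwise the client has nothing to connect to.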

Detailed Interaction Modes

Snapshot Mode (Default, Recommended)

  • Uses accessibility snapshots
  • Better performance and reliability
  • Structured data interaction

Vision Mode

  • Uses screenshots for visual interaction
  • Suitable for operations requiring visual input
  • Requires models that support computer vision

Enable Vision Mode:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp@latest",
        "--vision"
      ]
    }
  }
}
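
The stdio configurations in this document differ only in the flags appended to `args`. A minimal TypeScript sketch of that pattern follows; `playwrightMcpConfig` is a hypothetical helper, and it only knows the two flags shown in this document (`--headless` and `--vision`).

```typescript
// Hypothetical helper: generates the mcpServers entry for a given set of flags.
type ModeFlag = "--headless" | "--vision";

interface StdioServerEntry {
  command: string;
  args: string[];
}

function playwrightMcpConfig(
  flags: ModeFlag[] = []
): { mcpServers: { playwright: StdioServerEntry } } {
  // Drop duplicate flags while preserving their order.
  const unique = flags.filter((flag, i) => flags.indexOf(flag) === i);
  return {
    mcpServers: {
      playwright: {
        command: "npx",
        args: ["@playwright/mcp@latest", ...unique],
      },
    },
  };
}

// Headed snapshot mode (no flags), headless mode, and vision mode.
console.log(JSON.stringify(playwrightMcpConfig(), null, 2));
console.log(JSON.stringify(playwrightMcpConfig(["--headless"]), null, 2));
console.log(JSON.stringify(playwrightMcpConfig(["--vision"]), null, 2));
```

Generating the entry from one function keeps the package name and command in a single place, so switching modes is a one-argument change.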

Available Tool APIs

Basic Interaction Tools

Page Operations

  • browser_click - Performs a click operation
  • browser_hover - Hovers over an element
  • browser_drag - Performs a drag operation
  • browser_type - Enters text
  • browser_select_option - Selects an option from a dropdown

Navigation Control

  • browser_navigate - Navigates to a URL
  • browser_navigate_back - Goes back
  • browser_navigate_forward - Goes forward

Tab Management

  • browser_tab_list - Lists all tabs
  • browser_tab_new - Creates a new tab
  • browser_tab_select - Selects a tab
  • browser_tab_close - Closes a tab

Advanced Functionality Tools

Content Capture

  • browser_snapshot - Accessibility snapshot (recommended)
  • browser_take_screenshot - Page screenshot
  • browser_screen_capture - Screen capture

File Operations

  • browser_file_upload - File upload
  • browser_pdf_save - Save as PDF

System Interaction

  • browser_press_key - Key press operation
  • browser_handle_dialog - Handles browser dialogs
  • browser_resize - Resizes the window
  • browser_wait - Waits for a specified time

Screen Coordinate Operations (Vision Mode)

  • browser_screen_move_mouse - Moves the mouse
  • browser_screen_click - Clicks at coordinates
  • browser_screen_drag - Drags on the screen
  • browser_screen_type - Types on the screen

Debugging Tools

  • browser_console_messages - Gets console messages
  • browser_install - Installs the browser

User Data Management

Playwright MCP creates browser profiles in the following locations:

  • Windows: %USERPROFILE%\AppData\Local\ms-playwright\mcp-chrome-profile
  • macOS: ~/Library/Caches/ms-playwright/mcp-chrome-profile
  • Linux: ~/.cache/ms-playwright/mcp-chrome-profile

All login information is stored in this profile. Delete the profile directory between sessions if you want to clear the stored state.
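
To locate the profile from a script, the three paths above can be expressed as one function. This is a sketch only: `mcpProfileDir` is a hypothetical helper, and the home directory is passed in explicitly so that the example has no dependencies.

```typescript
type Platform = "win32" | "darwin" | "linux";

// Path segments below the home directory, as listed in this document.
const PROFILE_SEGMENTS: Record<Platform, string[]> = {
  win32: ["AppData", "Local", "ms-playwright", "mcp-chrome-profile"],
  darwin: ["Library", "Caches", "ms-playwright", "mcp-chrome-profile"],
  linux: [".cache", "ms-playwright", "mcp-chrome-profile"],
};

// homeDir is the user's home directory (%USERPROFILE% on Windows, ~ elsewhere).
function mcpProfileDir(platform: Platform, homeDir: string): string {
  const separator = platform === "win32" ? "\\" : "/";
  return [homeDir, ...PROFILE_SEGMENTS[platform]].join(separator);
}

console.log(mcpProfileDir("linux", "/home/alice"));
console.log(mcpProfileDir("win32", "C:\\Users\\alice"));
```

Removing the returned directory (for example with Node's `fs.rm` and the `recursive` option) discards the saved logins, so the next session starts from a clean profile.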

Programming Integration

For scenarios requiring programmatic integration, you can use the following method:

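import http from 'http';
import { createServer } from '@playwright/mcp';
import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';

// Each incoming request supplies the response object (res) that carries the SSE stream.
http.createServer(async (req, res) => {
  // Create a headless Playwright MCP server.
  const server = await createServer({
    launchOptions: { headless: true }
  });

  // Clients post their messages to /messages; routing those POSTs is omitted here.
  const transport = new SSEServerTransport('/messages', res);
  await server.connect(transport);
}).listen(8931); // Example port, matching the SSE section of this document.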

Summary

Microsoft Playwright MCP changes how LLMs interact with the web: instead of interpreting pixels, the model works on a structured description of the page. Its main points are summarized below.

🎯 Technical Innovation Points

  1. Structured Interaction Paradigm: Replaces the screenshot-plus-visual-recognition approach with the accessibility tree, which gives the model precise element references to act on.

  2. LLM Native Design: Works with text-only Large Language Models, so no additional visual processing is required, which reduces system complexity and resource consumption.

  3. Microsoft Official Support: The project is published under the microsoft GitHub organization and released under the Apache-2.0 license.

🌟 Application Value

  • Improved Development Efficiency: Provides developers with powerful automated testing and web operation tools.
  • Enhanced AI Agents: Enables AI agents to have truly practical web operation capabilities.
  • Optimized Cost-Effectiveness: Reduces computing resource requirements through structured methods.

🚀 Future Prospects

As the MCP ecosystem matures, the project could be applied in areas such as:

  • Intelligent customer service and user support automation
  • Intelligent processing of complex business processes
  • Large-scale Web data collection and analysis
  • Automated testing of cross-platform applications

Microsoft Playwright MCP is both a practical tool and a step toward more capable AI agents: it gives them a structured, text-based way to operate a browser.