Playwright MCP Server — Cybernauten

Microsoft's official Playwright MCP Server gives AI agents full browser control via accessibility tree snapshots — no vision model needed, 10-100x faster than screenshot-based approaches.

stable · browser automation · updated 2026-01

install

npx @playwright/mcp@latest

npm: @playwright/mcp


capabilities

✓ Navigate to URLs and interact with web pages
✓ Click, hover, drag, fill forms, select dropdowns
✓ Structured accessibility tree snapshots (not screenshots)
✓ Screenshot capture when needed
✓ Browser tab management (open, close, resize)
✓ Console log and network request monitoring
✓ Wait for content and page load states

compatible with

Claude Desktop, Claude Code, Cursor, Windsurf, VS Code Copilot, any MCP-compatible client

The Fetch MCP Server handles static web content. The Playwright MCP Server handles everything else — JavaScript-rendered pages, login flows, form submissions, multi-step interactions, and live web applications.

It’s Microsoft’s official MCP implementation, using the same Playwright engine that powers automated browser testing for thousands of companies. The key architectural choice that separates it from other browser automation tools: it uses the accessibility tree, not screenshots.

Accessibility Tree vs Screenshots

Most browser automation tools for AI agents work by taking screenshots and feeding them to vision models. This has two problems: cost and speed. Screenshots are 500KB–2MB of image data per interaction. Vision models need to parse pixel coordinates, infer element positions, and reason about layout.

Playwright MCP uses structured accessibility snapshots instead — just 2–5KB of structured text describing every interactive element on the page. Claude gets element IDs, labels, roles, and states. No vision model needed. The result is 10–100x faster interactions and significantly lower token costs per browser action.
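To put the size figures above in proportion, here is a small worked calculation. It only compares payload sizes using the ranges quoted in the text; it says nothing about latency, so it should not be read as a derivation of the 10–100x speed figure. Treating 1 MB as 1000 KB is an assumption made for round numbers.

```typescript
// Per-interaction payload sizes quoted in the text, in kilobytes.
// Assumption: 1 MB is treated as 1000 KB to keep the numbers round.
const screenshotKB = { min: 500, max: 2000 };
const snapshotKB = { min: 2, max: 5 };

// Smallest ratio: smallest screenshot against largest snapshot.
const minRatio = screenshotKB.min / snapshotKB.max;
// Largest ratio: largest screenshot against smallest snapshot.
const maxRatio = screenshotKB.max / snapshotKB.min;

console.log(`A snapshot is roughly ${minRatio}x to ${maxRatio}x smaller than a screenshot`);
```

On those figures a snapshot is about two to three orders of magnitude smaller than a screenshot, which is where the token savings come from.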

When you do need a visual — verifying a page rendered correctly, capturing a chart — the browser_take_screenshot tool is available. But most workflows don’t need it.

Tools

The server exposes 25+ tools grouped by function:

Navigation

browser_navigate — Go to a URL
browser_navigate_back / browser_navigate_forward — History navigation
browser_wait_for — Wait for text to appear or disappear, or for a set time to pass

Interaction

browser_click — Click any element by reference
browser_type — Type text into inputs and textareas (browser_fill_form fills several fields in one call)
browser_select_option — Select dropdown values
browser_hover — Hover over elements
browser_drag — Drag and drop
browser_press_key — Keyboard input

Content

browser_snapshot — Capture accessibility tree (primary tool)
browser_take_screenshot — Pixel screenshot when needed
browser_console_messages — Retrieve JS console output
browser_network_requests — Inspect network activity

Tab management

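browser_tabs — List, open, select, and close tabs (earlier releases split this into browser_tab_list, browser_tab_new, browser_tab_select, and browser_tab_close)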
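On the wire, each of these tools is invoked through an MCP "tools/call" request in JSON-RPC 2.0 form. The sketch below builds three such requests by hand to show the shape; in practice the MCP client does this for you. The argument names (url, element, ref) and the ref value "e12" are assumptions for illustration, so check the input schemas the server reports through tools/list before relying on them.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build one request with an incrementing id.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Assumed argument names and a made-up element ref, for illustration only.
const steps: ToolCallRequest[] = [
  toolCall("browser_navigate", { url: "https://example.com" }),
  toolCall("browser_snapshot"),
  toolCall("browser_click", { element: "More information link", ref: "e12" }),
];

for (const step of steps) {
  console.log(JSON.stringify(step));
}
```

The usual rhythm is navigate, snapshot, then act on an element reference taken from that snapshot, which is why browser_snapshot is listed as the primary tool.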

Installation

Node.js 18+ required. No global installation needed — run directly with npx:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp@latest"]
    }
  }
}

Add to Claude Desktop config at:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
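If the config file already lists other servers, the playwright entry has to be merged in rather than pasted over the whole file. A minimal sketch of that merge follows; the function name addPlaywrightServer, the type names, and the pre-existing "fetch" entry are hypothetical and used only for illustration.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other settings are passed through untouched
}

// Return a copy of the config with the playwright entry added or replaced.
function addPlaywrightServer(config: ClaudeDesktopConfig): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      playwright: { command: "npx", args: ["-y", "@playwright/mcp@latest"] },
    },
  };
}

// Hypothetical existing config with one other server already registered.
const existing: ClaudeDesktopConfig = {
  mcpServers: { fetch: { command: "uvx", args: ["mcp-server-fetch"] } },
};

const updated = addPlaywrightServer(existing);
console.log(JSON.stringify(updated, null, 2));
```

The function returns a new object and leaves its input alone, so you can inspect the result before writing it back to disk.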

Docker (isolated, multi-arch amd64/arm64):

{
  "mcpServers": {
    "playwright": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
    }
  }
}

Claude Agent SDK (as shown in Anthropic’s official docs):

options: {
  mcpServers: {
    playwright: { command: "npx", args: ["@playwright/mcp@latest"] }
  }
}

What It’s Good For

Scraping JavaScript-heavy pages: Single-page apps, React/Vue sites, dynamically-loaded content — all pages that the Fetch MCP Server can’t reach.

Automated testing: Run end-to-end tests described in plain English. Claude navigates, interacts, verifies — using the same Playwright infrastructure your QA team uses.

Web research with login requirements: Navigate a login flow, then access paywalled or authenticated content. Session state persists across tool calls within a conversation.

Form automation: Fill and submit multi-step forms, handle CAPTCHAs where accessible (note: some CAPTCHA types intentionally block automation), extract results.

API exploration via browser: Some APIs are only accessible through web UIs. Playwright MCP can interact with them the same way a human would.
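The use cases above share one trait: they need a real browser. An agent that has both servers available could encode that as a simple routing rule. The sketch below is one hypothetical way to do it; the PageNeeds fields and the chooseServer function are illustrative names, not part of either server.

```typescript
type WebServer = "fetch" | "playwright";

// Hypothetical description of what a task needs from a page.
interface PageNeeds {
  javascriptRendered?: boolean; // content appears only after client-side rendering
  requiresLogin?: boolean;      // content sits behind an authenticated session
  needsInteraction?: boolean;   // clicks, form fills, multi-step flows
}

// Use the browser only when something on the page requires it.
function chooseServer(needs: PageNeeds): WebServer {
  const needsBrowser = Boolean(
    needs.javascriptRendered || needs.requiresLogin || needs.needsInteraction
  );
  return needsBrowser ? "playwright" : "fetch";
}

console.log(chooseServer({}));
console.log(chooseServer({ javascriptRendered: true }));
```

Defaulting to the cheaper static fetch and escalating only when needed keeps browser sessions for the tasks that actually require them.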

Headless vs Headed Mode

By default the server runs headed, with a visible browser window, which helps when debugging. To run without a window (for example on a server or in CI), pass --headless:

"args": ["-y", "@playwright/mcp@latest", "--headless"]

The --extension option connects to a running Chrome or Edge instance through the Playwright MCP Bridge browser extension, which is useful when you need to reuse an authenticated session from your normal browser.

Our Take

The Playwright MCP Server is the complement to the Fetch MCP Server — together they cover essentially all web content access scenarios. Fetch handles static pages quickly and cheaply; Playwright handles everything that needs a real browser.

The accessibility-tree-first architecture is a meaningful technical improvement over screenshot-based approaches. For agents doing significant amounts of browser work, the token cost difference compounds quickly.

The only friction: first-run setup downloads Playwright’s browser binaries (~100MB). Subsequent runs are fast.

Rating: 9.0/10

By Cybernauten · Mar 3, 2026