
Playwright: From Test Runner to AI Agent Interface

11 min read
Bart Waardenburg

AI Agent Readiness Expert & Founder

In 2017, Google built Puppeteer to automate Chrome. In 2020, the two lead developers left Google, joined Microsoft, and built Playwright. Same idea, but cross-browser from day one. By April 2025, Playwright had overtaken Cypress in npm downloads. Just another testing tool winning the framework wars. Or so it seemed.

Then in March 2025, Microsoft released Playwright MCP, an MCP server that lets AI agents control a browser using the same API you write tests with. Within a year: 25,000+ GitHub stars, built into Claude Code, Cursor, VS Code, and GitHub Copilot. The testing tool quietly became the standard interface between AI agents and the web.

Why? Because with data-testid you're testing whether your own label still matches. With getByRole you're testing whether your UI actually works, for users, for screen readers, and now for AI agents. Three birds, one selector. And your tests are just as stable.

Playwright by the Numbers

What started as a Puppeteer alternative has become the dominant browser automation framework. The numbers are hard to argue with:

WEEKLY NPM DOWNLOADS: 33M
GITHUB STARS: 80,000+

Weekly npm downloads in millions. Source: npm registry. Playwright went from 56K weekly downloads in 2020 to 34M+ in 2026. More than 12,400 companies use it, including Amazon, Apple, Microsoft, and NVIDIA.

But the number that matters most isn't on npm. It's the 25,000+ stars on the Playwright MCP repository, a separate project that turned Playwright from a testing tool into an AI agent runtime.

What Is Playwright MCP?

MCP (Model Context Protocol) is the open standard for connecting AI assistants to external tools. Playwright MCP is Microsoft's official MCP server that gives AI agents a browser they can control. Released March 2025, open source, and now the default way AI agents interact with websites.

It exposes 70+ tools organized into capability groups: navigation, clicking, typing, form filling, tab management, PDF generation, and more. But the most important tool is browser_snapshot.

browser_snapshot: How Agents Read a Page

When an AI agent needs to understand what's on a page, it doesn't take a screenshot. It calls browser_snapshot, which returns the browser's accessibility tree: a YAML-serialized representation of the page showing only semantically meaningful elements.

# browser_snapshot output for a login form
- heading "Welcome back" [level=2]
- textbox "Email" [required] (ref=1)
- textbox "Password" [required] (ref=2)
- button "Sign in" (ref=3)
- link "Forgot password?" (ref=4)
- link "Create account" (ref=5)

That's it. No CSS classes, no layout divs, no inline styles. Just what the page contains and what you can do with it. Each interactive element gets a reference number. The agent says "click ref 3" to sign in. Clean, predictable, cheap in tokens.
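The interaction itself is just as compact. A sketch of what the agent's tool call looks like on the wire — the tool name and parameter names follow Playwright MCP's published tool schema, and the ref value mirrors the snapshot above (the exact ref format may differ between versions):

```json
{
  "tool": "browser_click",
  "arguments": {
    "element": "Sign in button",
    "ref": "3"
  }
}
```

The element field is a human-readable description used for logging and permission prompts; the ref is what actually targets the node in the last snapshot.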

Playwright MCP runs in two modes:

Snapshot Mode (Default)

Uses the accessibility tree for all interactions. No vision models needed. Fast, reliable, and token-efficient. This is how most AI agents interact with websites today.

Vision Mode (Opt-in)

Falls back to screenshots for visually complex pages. Adds mouse coordinate tools (click_xy, move_xy, drag_xy). Used when the accessibility tree doesn't capture enough context.

The default is snapshot mode. Not vision. Apparently screenshots of websites are not the best way for AI to understand them. Who knew.

Who's Using Playwright MCP?

Playwright MCP isn't a niche experiment. It's integrated into the tools most developers use daily:

Claude Code

One command to add: claude mcp add playwright -- npx @playwright/mcp@latest. Opens a visible Chrome window. You can log in yourself while the agent watches, and cookies persist for the session.

Cursor

Global MCP config in ~/.cursor/mcp.json. Cursor uses Playwright MCP to preview changes, verify UI behavior, and navigate documentation sites.
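A minimal sketch of that config file, using the standard mcpServers shape that Cursor and most MCP clients share:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```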

VS Code Copilot

MCP support added in VS Code 1.99 (March 2025). Agent Mode uses Playwright MCP to verify code changes directly in the browser.

GitHub Copilot Coding Agent

Built-in, zero configuration. The coding agent uses Playwright MCP to open your app, navigate, and verify its own changes. Uses the exact same selectors as your test suite.

Every major AI coding tool now has browser access via Playwright MCP. Your website isn't just visited by users and crawlers anymore. It's navigated by AI agents using the same tool you test with.

data-testid Is a Code Smell (You Just Can't Smell It Yet)

With data-testid you're testing whether your own label still matches. The test passes, sure. But you haven't verified that a user can actually find that button, that a screen reader can announce it, or that an AI agent knows it exists. You're testing your own bookkeeping.

data-testid

Tests whether your own label still matches. Stable, but doesn't validate that users, screen readers, or AI agents can actually find the element.

getByRole / getByLabel

Tests whether the element actually works. If getByRole can find it, so can a screen reader and an AI agent. Three birds, one selector.

With getByRole your test fails when the button isn't a real button. That forces you to fix the markup. And that fixed markup shows up in the accessibility tree, the same tree that AI agents read via Playwright MCP's browser_snapshot.

The accessibility tree doesn't include test attributes. When Playwright MCP calls browser_snapshot, your carefully placed data-testid="submit-btn" simply doesn't exist. What does exist are roles, labels, and names, exactly what getByRole and getByLabel query.

// Your Playwright test
await page.getByRole('button', { name: 'Add to cart' }).click();
await page.getByLabel('Email').fill('user@example.com');
await page.getByRole('link', { name: 'Checkout' }).click();

// What an AI agent sees via Playwright MCP browser_snapshot
// - button "Add to cart" (ref=12)
// - textbox "Email" (ref=15)
// - link "Checkout" (ref=18)

// Same selectors. Same elements. Same accessibility tree.

The test selector and the agent selector are the same thing. With getByRole you're simultaneously testing your UI, validating accessibility, and verifying that AI agents can navigate your site. With data-testid you're testing a string you made up yourself.

Playwright's own documentation has been nudging teams in this direction for years. Their locator priority guide explicitly recommends getByRole as the first choice and data-testid as a last resort. At the time, the reasoning was "test like a user." Now the reasoning extends to "build for agents."

The Flywheel: Testing, Accessibility, Agent Readiness

Here's what happens when you use Playwright with semantic selectors:

  1. You write Playwright tests with accessibility-first selectors like getByRole, getByLabel, and getByText
  2. This forces accessible HTML. If your getByRole selector can't find the element, it's not a proper button. Your test fails, so you fix the markup
  3. Accessible HTML produces a rich accessibility tree, the same tree that AI agents consume via Playwright MCP
  4. AI agents can navigate your site. They find buttons, fill forms, follow links, and complete tasks
  5. Your tests validate what agents see. If your test suite passes with getByRole selectors, you've verified the accessibility tree is correct

Teams with a solid Playwright test suite using semantic selectors already have two things they didn't know they had: accessible interfaces and AI-agent-ready websites. Teams that skipped testing, or relied on visual regression tests and test IDs? Neither.


ARIA Snapshots: Testing the Tree Itself

Playwright took this connection even further with ARIA snapshot testing, a feature that lets you write assertions directly against the accessibility tree. Instead of checking if an element exists in the DOM, you verify the semantic structure that AI agents and screen readers actually see.

// ARIA snapshot test — validates the accessibility tree directly
await expect(page.getByRole('navigation')).toMatchAriaSnapshot(`
  - link "Home"
  - link "Products"
  - link "Pricing"
  - link "Contact"
`);

// This validates:
// 1. A navigation landmark exists
// 2. It contains four links with these exact names
// 3. The accessibility tree is correct for both screen readers and AI agents

This is the testing equivalent of "what you see is what agents get." If your ARIA snapshot test passes, you know exactly what an AI agent sees when it calls browser_snapshot via Playwright MCP. The test output format and the agent input format are the same YAML structure.

Playwright also introduced Test Agents, built-in AI that uses the accessibility tree to automatically plan, generate, and heal tests. The planner explores your app via accessibility snapshots, the generator writes tests using getByRole selectors, and the healer fixes broken tests when the UI changes by re-reading the accessibility tree. It's agents all the way down.

Beyond Playwright MCP: The Browser Agent Landscape

Playwright MCP isn't the only way AI agents browse the web, but it set the pattern that others follow. Here's how the broader landscape looks:

Agent / Framework      | Approach                          | Uses Accessibility Tree | Built on Playwright
Playwright MCP         | Accessibility snapshots (YAML)    | Yes (primary)           | Yes
browser-use            | Snapshot + numbered refs          | Yes (primary)           | Yes
Amazon Nova Act        | Direct Playwright integration     | Yes                     | Yes
Vercel Agent Browser   | Streamlined accessibility refs    | Yes (primary)           | Yes
OpenAI CUA / Operator  | Hybrid (screenshots + tree)       | Yes (supplementary)     | Partial
Claude Computer Use    | Hybrid (screenshots + read_page)  | Yes (supplementary)     | No

Four out of six are built directly on Playwright. All six use the accessibility tree. The framework you chose for testing has become the runtime for AI agents.

What This Means for Your Team

If you're already using Playwright, you're closer to AI agent readiness than you think. But the specifics of how you use it matter:

You're ready if...

Your tests use getByRole, getByLabel, and getByText as primary locators. Your UI uses semantic HTML. Form fields have proper labels. Buttons have accessible names.

You have work to do if...

Your tests rely on data-testid, CSS selectors, or XPath. Your UI is div-soup with ARIA band-aids. Form fields use placeholder text instead of labels.
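A minimal illustration of the div-soup problem — the first element is invisible to getByRole('button') and absent from browser_snapshot, while the second appears in the accessibility tree as button "Add to cart" (addToCart is a hypothetical handler):

```html
<!-- Not in the accessibility tree: no role, no accessible name -->
<div class="btn" onclick="addToCart()">Add to cart</div>

<!-- In the accessibility tree as: button "Add to cart" -->
<button onclick="addToCart()">Add to cart</button>
```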

Practical Steps

  1. Audit your locator strategy. Grep your test suite for data-testid vs getByRole. The ratio tells you how agent-ready your tests (and your UI) are
  2. Migrate critical paths first. Start with your most important user flows: login, search, checkout, signup. Rewrite those tests to use getByRole selectors. Fix the HTML where tests fail
  3. Add ARIA snapshot tests. For key pages, add snapshot tests that validate the accessibility tree structure. These serve as a contract: if the snapshot passes, agents can navigate the page
  4. Try Playwright MCP yourself. Install it in Claude Code or Cursor and point an agent at your own site. Watch what it sees. The gaps become obvious immediately
  5. Scan for AI agent readiness. Use our free scanner to check semantic HTML, form labels, heading hierarchy, and interactive surface coverage across 47 checkpoints
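The audit in step 1 can be sketched as two greps. The tests/ path and the sample file here are assumptions to make the snippet self-contained; in your repo you'd point the greps at your actual test directory:

```shell
# Locator audit sketch: count semantic vs. test-id locators.
# A tiny sample test file is created so the commands run standalone.
mkdir -p /tmp/locator-audit/tests
cat > /tmp/locator-audit/tests/example.spec.ts <<'EOF'
await page.getByRole('button', { name: 'Add to cart' }).click();
await page.getByLabel('Email').fill('user@example.com');
await page.getByTestId('submit-btn').click();
EOF

# Each -o match is one locator usage; wc -l totals them.
semantic=$(grep -rEo "getBy(Role|Label|Text)\(" /tmp/locator-audit/tests | wc -l)
testid=$(grep -rEo "getByTestId\(|data-testid" /tmp/locator-audit/tests | wc -l)
echo "semantic=$semantic testid=$testid"
```

A high testid count relative to semantic locators is a rough proxy for how much of your UI is invisible to browser_snapshot.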

The Bigger Picture: Testing as an AI Agent Contract

Playwright tests written with accessibility selectors are, functionally, a contract that defines what AI agents can do on your site. If your test says "click the Add to cart button," that's a capability you're exposing to agents.

Compare this to WebMCP, the W3C proposal where websites explicitly register tools for AI agents. WebMCP does it with tool definitions. Playwright MCP does it implicitly through the accessibility tree. Different mechanism, same result: your site becomes something machines can actually use.

Organizations that invested in Playwright testing with semantic selectors, whether for accessibility compliance, testing best practices, or just because the docs said so, accidentally built the foundation for AI agent interaction. The same interface that makes software testable makes it agent-ready. Not by design. Just by doing the work properly.

Conclusion

Playwright went from Puppeteer alternative to the most popular testing framework to the default browser runtime for AI agents. Each step followed logically from the last: cross-browser automation, accessibility-first selectors, and now MCP-based agent control.

With data-testid you test whether your own label still matches. With getByRole you test whether your UI actually works, for users, for screen readers, and for AI agents. Three birds, one selector. And your tests are just as stable.

Test like a user, build for an agent. Same selector.



