How AI Agents See Your Website: The Accessibility Tree Explained
51% of all web traffic is now automated (Imperva Bad Bot Report 2025): bots, crawlers, and, increasingly, AI agents navigating the web on behalf of users. These agents don't see your website the way you do. They can't admire your hero image or intuitively scan your navigation. Instead, they rely on a structured representation of your page that was originally built for a completely different audience: people using screen readers.
That structure is the accessibility tree, and it has quietly become the most important interface between your website and the AI agents that want to use it. Understanding how this tree works, and how to optimize for it, is now a key part of AI agent readiness.
How AI Agents Actually See Your Website
AI agents that browse the web use one of three approaches (or a hybrid of them) to understand what's on a page. Each has real tradeoffs in accuracy, speed, and token cost:
Vision-Based (Screenshots)
Takes a screenshot and uses multimodal AI to interpret it visually. Used by Anthropic's Computer Use and Google's Project Mariner (83.5% on WebVoyager). Expensive in tokens and prone to misreading dense layouts.
DOM Parsing (Raw HTML)
Reads the full Document Object Model — every div, span, script, and style tag. A typical page produces 15,000+ tokens of noise. The signal-to-noise ratio makes this impractical for complex agent tasks.
Accessibility Tree
A browser-generated simplified view showing only semantically meaningful elements — roles, names, states, and descriptions. Strips away visual noise. Typically reduces a page to ~200-400 tokens.
The industry is converging on the accessibility tree as the primary interface. The full DOM contains every div, span, style, and script tag. The accessibility tree strips away the noise and presents only what matters: interactive elements, their labels, their states, and their relationships. Think of it as the difference between reading a novel and reading its table of contents. Same content, wildly different efficiency for navigation.
What Is the Accessibility Tree? A Technical Breakdown
The accessibility tree is a simplified representation of the DOM that browsers automatically generate from your HTML. Originally designed to expose page structure to assistive technologies like screen readers, it captures exactly the information AI agents need to understand and interact with a page.
Every node in the accessibility tree has four core properties:
Role
What the element is: button, link, heading, textbox, navigation, form. Derived from HTML semantics or explicit ARIA roles.
Name
The accessible label: button text, alt text, aria-label, or associated label element. This is what the agent 'reads' to understand each element.
State
Current condition: checked, expanded, disabled, selected, required. Tells the agent what can be interacted with and how.
Description
Additional context from aria-describedby or title attributes. Provides supplementary information beyond the name.
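These four properties amount to a tiny data model. A hypothetical sketch (the class and field names here are illustrative, not a real browser API):

```python
from dataclasses import dataclass, field

# Illustrative model of an accessibility-tree node: role, name,
# state flags, and description. Not a browser API; just a sketch.
@dataclass
class AXNode:
    role: str                                # what the element is: "button", "textbox", ...
    name: str = ""                           # accessible label the agent "reads"
    state: set = field(default_factory=set)  # e.g. {"required", "disabled"}
    description: str = ""                    # extra context from aria-describedby/title

    def render(self) -> str:
        # Mirror the "role \"name\" states" shape used in snapshot dumps
        parts = [self.role, f'"{self.name}"'] + sorted(self.state)
        return " ".join(parts)

email = AXNode(role="textbox", name="Email", state={"required"})
print(email.render())  # textbox "Email" required
```

Every node an agent sees reduces to roughly this record, which is why the tree is so much cheaper to process than the full DOM.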
To see this in practice, here's how the DOM and the accessibility tree represent the same login form:
DOM vs Accessibility Tree: A Comparison
<!-- What the DOM sees: 47 nodes, deeply nested -->
<div class="flex flex-col gap-4 p-6 bg-white rounded-xl shadow-lg
border border-gray-200 max-w-md mx-auto">
<div class="text-center mb-2">
<h2 class="text-2xl font-bold text-gray-900">Welcome back</h2>
<p class="text-sm text-gray-500 mt-1">Sign in to your account</p>
</div>
<div class="space-y-4">
<div class="relative">
<label class="block text-sm font-medium text-gray-700 mb-1"
for="email">Email</label>
<input type="email" id="email" name="email" required
class="w-full px-3 py-2 border border-gray-300 rounded-lg
focus:ring-2 focus:ring-blue-500 focus:border-blue-500
placeholder-gray-400"
placeholder="you@example.com" />
</div>
<div class="relative">
<label class="block text-sm font-medium text-gray-700 mb-1"
for="password">Password</label>
<input type="password" id="password" name="password" required
class="w-full px-3 py-2 border border-gray-300 rounded-lg" />
</div>
<button type="submit"
class="w-full py-2 px-4 bg-blue-600 hover:bg-blue-700
text-white font-medium rounded-lg transition-colors">
Sign in
</button>
</div>
</div>
# What the accessibility tree sees: 5 meaningful nodes
- heading "Welcome back" level=2
- text "Sign in to your account"
- textbox "Email" required
- textbox "Password" required
- button "Sign in"
47 nodes of styling classes, layout divs, and visual attributes in the DOM. Five nodes in the accessibility tree. That's the information an agent needs to fill in credentials and submit the form. This is why accessibility tree parsing is 93% more token-efficient than raw DOM parsing, and why every major AI agent framework is converging on this approach. It's also why AI systems like ChatGPT and Claude prefer to cite websites with clean semantic structure.
You can inspect the accessibility tree yourself in Chrome DevTools: open the Elements panel, find the "Accessibility" tab in the sidebar, and you'll see the computed accessibility properties for any element. This is exactly what AI agents see when they interact with your page.
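You can also approximate the browser's distillation step yourself. The toy walker below is a rough sketch (real browsers follow the much more involved HTML-AAM and accessible-name specs): it maps a handful of tags to roles and drops everything else, which is exactly why the divs disappear.

```python
from html.parser import HTMLParser

# Toy sketch of how a browser distills HTML into accessibility-tree nodes.
# Real browsers implement the full HTML-AAM / AccName algorithms; this only
# maps a few tags to roles and ignores styling wrappers entirely.
TAG_ROLES = {"a": "link", "button": "button", "nav": "navigation",
             "h1": "heading", "h2": "heading", "input": "textbox"}
VOID = {"input", "img"}  # void elements never get a closing tag

class ToyAXTree(HTMLParser):
    def __init__(self):
        super().__init__()
        self.nodes, self._stack = [], []

    def handle_starttag(self, tag, attrs):
        if tag in TAG_ROLES:
            attrs = dict(attrs)
            # aria-label wins over text content, as in real name computation
            node = {"role": TAG_ROLES[tag], "name": attrs.get("aria-label", "")}
            self.nodes.append(node)
            if tag not in VOID:
                self._stack.append(node)
        # <div>, <span>, etc. produce no node at all: they are visual noise

    def handle_endtag(self, tag):
        if tag in TAG_ROLES and tag not in VOID and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # First non-empty text inside an unnamed node becomes its name
        if self._stack and not self._stack[-1]["name"] and data.strip():
            self._stack[-1]["name"] = data.strip()

tree = ToyAXTree()
tree.feed('<div class="wrap"><nav aria-label="Main"><a href="/">Home</a></nav></div>')
print(tree.nodes)
# [{'role': 'navigation', 'name': 'Main'}, {'role': 'link', 'name': 'Home'}]
```

Feed it the login-form markup above and the Tailwind classes, wrapper divs, and placeholders all vanish, leaving only role/name pairs.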
Which AI Agent Frameworks Use the Accessibility Tree?
The shift to accessibility-tree-first browsing isn't theoretical. The most widely used AI agent frameworks already use it as their primary page understanding mechanism:
Playwright MCP (Microsoft)
Uses accessibility snapshots as the default page representation. The browser_snapshot tool returns a YAML-formatted accessibility tree. All element interactions use getByRole, getByLabel, and getByText — never CSS selectors.
browser-use
Achieves 89.1% on WebVoyager using a Snapshot+Refs system. Generates numbered accessibility snapshots where each interactive element gets a unique ref ID. The agent refers to elements by number, not by CSS path.
OpenAI CUA / Atlas
Uses a hybrid approach combining screenshots with accessibility tree data. The Computer-Using Agent extracts semantic roles and labels from the accessibility tree to ground its visual understanding.
Claude Computer Use
Anthropic's computer use tool includes a read_page action that extracts the accessibility tree. This provides the structured page representation Claude uses for web navigation tasks.
All four converge on the same insight: semantic structure matters more than visual appearance. For a deeper look at how these agent protocols work, see our guide on the Model Context Protocol (MCP).
Here's what a Playwright MCP accessibility snapshot actually looks like for a navigation menu:
# Playwright MCP browser_snapshot output
- navigation "Main":
- link "Home" (ref=1)
- link "Products" (ref=2)
- link "Enterprise" (ref=3)
- link "Startup" (ref=4)
- link "Pricing" (ref=5)
- link "Documentation" (ref=6)
- link "Contact Sales" (ref=7)
- main:
- heading "Build faster with AI" level=1
- text "Deploy intelligent agents that understand your codebase"
- link "Get Started Free" (ref=8)
- link "View Demo" (ref=9)
Each element has a semantic role, a human-readable name, and a reference number. The agent says "click ref 8" to click the "Get Started Free" link. No CSS selectors, no pixel coordinates, no visual parsing needed. Clean, efficient, and reliable.
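The ref-based lookup is easy to see in code. Here is a sketch that indexes a snapshot in this YAML-ish shape into a ref-to-element map (the format is inferred from the example above; this is not an official Playwright MCP parser):

```python
import re

# Hedged sketch: index a Playwright-MCP-style snapshot (format inferred from
# the example above) into a ref -> (role, name) lookup, so an instruction
# like "click ref 8" can be resolved to a concrete element.
SNAPSHOT = """\
- navigation "Main":
  - link "Home" (ref=1)
  - link "Products" (ref=2)
- main:
  - heading "Build faster with AI" level=1
  - link "Get Started Free" (ref=8)
"""

def index_refs(snapshot: str) -> dict[int, tuple[str, str]]:
    # Match lines like: - link "Get Started Free" (ref=8)
    pattern = re.compile(r'-\s+(\w+)\s+"([^"]*)".*\(ref=(\d+)\)')
    refs = {}
    for line in snapshot.splitlines():
        if (m := pattern.search(line)):
            refs[int(m.group(3))] = (m.group(1), m.group(2))
    return refs

refs = index_refs(SNAPSHOT)
print(refs[8])  # ('link', 'Get Started Free')
```

Container lines like `- main:` carry no ref and are skipped; only interactive elements need stable handles.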
The Research: Accessible = Better AI Performance
Academic research confirms what the frameworks are showing us. Websites that follow accessibility best practices produce dramatically better results for AI agents. The most comprehensive evidence comes from the CHI 2026 study "Is the Web Accessible for AI Agents?", which systematically tested how accessibility barriers affect agent performance.
When AI agents have access to a fully accessible page with proper semantic structure, they succeed 78.33% of the time. Force them through accessibility barriers, such as keyboard-only navigation without visible focus indicators or magnified views that hide context, and performance drops by roughly half or more. The same barriers that prevent disabled users from using your site also prevent AI agents from completing tasks.
The broader benchmark landscape tells a similar story. AI agent performance on web tasks has improved rapidly, but the gains concentrate on well-structured, accessible websites:
| Benchmark | Year | Best Score | Key Insight |
|---|---|---|---|
| WebArena | 2023 | 14.41% | Early baseline — agents struggled with real websites |
| WebArena | 2025 | 62%+ | 4x improvement in 2 years, driven by better tree parsing |
| WebVoyager | 2025 | 89.1% | browser-use with accessibility snapshots leads the board |
| WebVoyager | 2025 | 83.5% | Google Project Mariner using hybrid vision+tree approach |
The CHI 2026 researchers identified three categories of failure when agents encounter inaccessible pages:
Perception Gaps
Missing alt text, unlabeled buttons, and absent ARIA labels mean agents literally can't 'see' interactive elements. If it's not in the accessibility tree, it doesn't exist.
Cognitive Gaps
Poor heading hierarchy, missing landmarks, and unclear form labels force agents to guess at page structure. They waste tokens navigating aimlessly instead of completing tasks.
Action Gaps
Custom widgets without keyboard support, click-only interactions, and missing focus management prevent agents from actually performing actions — even when they understand the page.
The WCAG–AI Agent Readiness Overlap
The connection between web accessibility (WCAG) and AI agent readiness isn't a coincidence. It's structural. Both screen readers and AI agents consume the same accessibility tree, which means every WCAG best practice directly improves AI agent performance. Here's how the mapping works:
| Accessibility Practice | WCAG Criterion | AI Agent Benefit | Scanner Checkpoint |
|---|---|---|---|
| Semantic HTML elements | 1.3.1 Info & Relationships | Agents identify page regions (nav, main, footer) without guessing | 3.3 Semantic HTML |
| Heading hierarchy (h1-h6) | 1.3.1 / 2.4.6 Headings | Agents build mental model of content structure and topic flow | 3.2 Heading Hierarchy |
| Image alt text | 1.1.1 Non-text Content | Agents understand image context without vision models | 3.5 Alt Text |
| ARIA landmarks & labels | 1.3.1 / 4.1.2 Name, Role | Custom widgets become discoverable and operable | 3.4 ARIA Usage |
| Language attribute | 3.1.1 Language of Page | Agents select correct language model for content processing | 3.6 Language Attribute |
| Form labels | 1.3.1 / 3.3.2 Labels | Agents know what data goes in which field | 4.6 Form Quality |
| Keyboard navigation | 2.1.1 Keyboard | Agents can operate all controls without mouse simulation | 4.8 Interactive Surfaces |
| Server-side rendering | N/A (best practice) | Content is available before JavaScript execution | 3.1 SSR Detection |
| Descriptive link text | 2.4.4 Link Purpose | Agents understand where links lead without following them | 3.7 Descriptive Links |
As Jason Taylor, Chief Accessibility Innovation Strategist at UsableNet, puts it: "Optimizing for accessibility is like optimizing for AI agents — same tree, same rules." The accessibility tree is the shared interface that both screen readers and AI agents depend on. When you make your site WCAG-compliant, you're simultaneously making it AI-ready.
Our scanner measures exactly these overlapping signals. Every checkpoint in our Content & Semantics category maps directly to a WCAG success criterion. The same structural quality that enables accessibility enables AI comprehension.
The ARIA Controversy: Use With Caution
If the accessibility tree is the interface, you might think the answer is to add ARIA attributes everywhere. Several prominent AI voices, including OpenAI, have suggested that developers should add aria-label attributes to improve agent understanding. The accessibility community has a strong counter-message: most ARIA usage on the web today is harmful, not helpful.
According to WebAIM's annual Million analysis, pages that use ARIA attributes average 57 accessibility errors, compared to just 27 errors on pages without any ARIA. More than double. ARIA is overwhelmingly misused, added as a band-aid over non-semantic markup rather than as a genuine enhancement to well-structured HTML.
Accessibility expert Adrian Roselli has been particularly vocal about this: OpenAI's own guidance encouraging developers to add ARIA labels contradicts the first rule of ARIA, which states: "If you can use a native HTML element or attribute with the semantics and behavior you require already built in, instead of re-purposing an element and adding an ARIA role, state or property to make it accessible, then do so."
The right approach for both accessibility and AI agent readiness:
- Use semantic HTML first: <button> instead of <div onclick>, <nav> instead of <div class="nav">
- Add ARIA only when HTML falls short: for custom widgets, dynamic content regions, or complex interaction patterns
- Test your accessibility tree: verify that what the browser generates matches what you intend agents and screen readers to see
- Never use ARIA to override correct HTML semantics: a <button role="link"> confuses both screen readers and agents
Here's a concrete example of the right and wrong approach:
<!-- WRONG: div soup with ARIA band-aids -->
<div role="navigation" aria-label="Main navigation">
<div role="list">
<div role="listitem">
<div role="link" tabindex="0" aria-label="Home"
onclick="navigate('/')">Home</div>
</div>
<div role="listitem">
<div role="link" tabindex="0" aria-label="Products"
onclick="navigate('/products')">Products</div>
</div>
</div>
</div>
<!-- RIGHT: semantic HTML, one aria-label to name the landmark -->
<nav aria-label="Main navigation">
<ul>
<li><a href="/">Home</a></li>
<li><a href="/products">Products</a></li>
</ul>
</nav>
The semantic version is shorter, produces a cleaner accessibility tree, works with keyboard navigation by default, and needs only a single aria-label to name the landmark. AI agents can parse it instantly. The div-soup version piles on nine ARIA attributes and two tabindex hacks to approximate the same result, and still does it poorly.
The Business Case: 95% of Websites Are Failing
If the accessibility tree is the AI agent interface, then the state of web accessibility is also the state of AI agent readiness. And the numbers are sobering.
WebAIM's 2025 Million analysis found that 94.8% of the top 1,000,000 websites have detectable WCAG failures. The average page has 51 errors. But here's where it gets interesting: 96% of all detected errors fall into just six categories:
- Low contrast text (81% of pages) — doesn't directly affect AI agents but signals poor attention to standards
- Missing alt text (54.5%) — agents can't understand images without it
- Missing form labels (45.9%) — agents can't fill in forms without knowing what each field expects
- Empty links (44.6%) — agents can't navigate when links have no accessible name
- Empty buttons (28.1%) — agents can't click buttons they can't identify
- Missing document language (17.1%) — agents may process content with the wrong language model
Five of these six issues directly degrade the accessibility tree that AI agents depend on. Fixing them doesn't require a redesign. Add alt text, label forms, name links and buttons, set a language attribute. Quick, measurable improvements that simultaneously fix accessibility compliance and AI agent readiness.
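Four of these six failures can even be caught with nothing more than an HTML parse. A rough sketch (a real auditor such as axe-core checks far more; this flags only the obvious cases of missing alt text, empty links, unlabeled inputs, and a missing lang attribute):

```python
from html.parser import HTMLParser

# Rough audit sketch for four of the six most common failures above.
# Not a substitute for a real accessibility checker.
class QuickAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.issues = []
        self.has_lang = False
        self.labeled_ids = set()   # ids referenced by <label for="...">
        self.input_ids = []        # ids of <input> elements seen
        self._open_link = None     # tracks text/label of the current <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            self.has_lang = "lang" in attrs
        elif tag == "img" and "alt" not in attrs:
            self.issues.append("img missing alt")
        elif tag == "label" and "for" in attrs:
            self.labeled_ids.add(attrs["for"])
        elif tag == "input":
            self.input_ids.append(attrs.get("id"))
        elif tag == "a":
            self._open_link = {"text": False, "label": "aria-label" in attrs}

    def handle_data(self, data):
        if self._open_link is not None and data.strip():
            self._open_link["text"] = True

    def handle_endtag(self, tag):
        if tag == "a" and self._open_link is not None:
            if not (self._open_link["text"] or self._open_link["label"]):
                self.issues.append("empty link")
            self._open_link = None

    def close(self):
        super().close()
        if not self.has_lang:
            self.issues.append("missing document language")
        for input_id in self.input_ids:
            if input_id not in self.labeled_ids:
                self.issues.append("input without label")

audit = QuickAudit()
audit.feed('<html><body><img src="x.png"><a href="/"></a>'
           '<input id="q"></body></html>')
audit.close()
print(audit.issues)
# ['img missing alt', 'empty link', 'missing document language', 'input without label']
```

Each flagged issue corresponds to a node that either disappears from the accessibility tree or appears without a usable name.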
The opportunity is clear. While 95% of websites are failing at accessibility (and therefore at AI agent readiness), the 5% that get it right will be the ones AI agents can actually use, recommend, and transact with. As AI agents become more prevalent in web commerce and information discovery, being in that 5% becomes a real competitive advantage.
Practical Checklist: Making Your Site AI-Agent-Accessible
Based on the research, framework requirements, and accessibility data, here's a three-tier approach to improving how AI agents see your site:
Tier 1: Quick Wins (Fix the Six Most Common Issues)
Add Alt Text to All Images
Every meaningful image needs descriptive alt text. Decorative images get alt='' (empty). This alone fixes issues on 54.5% of failing pages.
Label All Form Fields
Every input needs a visible <label> with a matching for/id pair. Placeholder text is not a label — agents need explicit associations.
Name All Links and Buttons
No empty links, no icon-only buttons without aria-label. If a screen reader can't announce it, an AI agent can't click it.
Set the Language Attribute
Add lang='en' (or appropriate code) to your <html> element. Agents use this to select the right processing model.
Tier 2: Structural Improvements
Fix Heading Hierarchy
Use a single h1, then h2 for sections, h3 for subsections. Never skip levels. Agents use headings to build a mental model of your content.
Use Semantic HTML Landmarks
Replace div-based layouts with <header>, <nav>, <main>, <aside>, <footer>. These create navigation landmarks in the accessibility tree.
Enable Server-Side Rendering
AI crawlers and agents often don't execute JavaScript. SSR ensures your content appears in the initial HTML response.
Use Descriptive Link Text
Replace 'click here' and 'read more' with descriptive text. Agents should understand a link's destination from its text alone.
Tier 3: Advanced AI Agent Optimization
Add Structured Data (JSON-LD)
Schema.org markup gives agents explicit entity information. Organization, Product, FAQPage, and BreadcrumbList schemas are particularly valuable.
Implement WebMCP
Expose your site's capabilities as structured tool definitions that agents can discover and invoke. See our WebMCP guide for details.
Create llms.txt
A plain-text file at /llms.txt that tells AI systems what your site offers, in a format optimized for language models.
Publish agents.json
Advertise your AI agent capabilities so other agents can discover and interact with your services programmatically.
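The structured-data item in Tier 3 needs no special tooling. A sketch that emits an Organization JSON-LD script tag (the name and URL are placeholders; the "@context"/"@type" vocabulary comes from schema.org):

```python
import json

# Minimal Organization JSON-LD, the kind of structured data described in
# Tier 3. Name and URL are placeholders for illustration.
org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",          # placeholder
    "url": "https://example.com",  # placeholder
}

# Wrap it in the script tag that goes in your page <head>
snippet = ('<script type="application/ld+json">\n'
           + json.dumps(org, indent=2)
           + '\n</script>')
print(snippet)
```

Because JSON-LD sits outside the visible DOM, it gives agents explicit entity facts without touching layout or the accessibility tree.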
For detailed implementation guides on the advanced tier, see our posts on WebMCP , agents.json , and the Google A2A protocol .
Conclusion: One Tree, Two Audiences
The accessibility tree was designed to make the web usable for people with disabilities. Nobody intended it to become the primary interface between AI agents and websites. But that's exactly what happened, and it makes perfect sense. Both screen readers and AI agents need the same thing: a structured, semantic representation of what a page contains and what actions are possible.
This convergence means that every investment in web accessibility is also an investment in AI agent readiness. Semantic HTML, proper heading hierarchy, labeled forms, descriptive links, ARIA where needed. These aren't compliance checkboxes. They're the building blocks of how AI agents discover, understand, and interact with your website.
With 94.8% of websites currently failing at basic accessibility, the opportunity is right there. The sites that fix their accessibility tree today will be the ones AI agents can use tomorrow, and the ones that get recommended, cited, and transacted with in the emerging agent economy.
Sources
- Imperva 2025 Bad Bot Report — 51% of web traffic is now automated
- "Is the Web Accessible for AI Agents?" — CHI 2026 Study — Systematic testing of accessibility barriers on AI agent performance
- WebAIM Million 2025 — Annual Accessibility Analysis — 94.8% of top 1M sites have WCAG failures, 51 average errors per page
- browser-use — AI Agent Browser Automation Framework — 89.1% WebVoyager score with accessibility snapshot approach
- Anthropic Computer Use — Claude Web Navigation — read_page tool using accessibility tree extraction
- Playwright MCP — Microsoft's Accessibility-First Browser Automation — YAML accessibility snapshots as default page representation
- Google Project Mariner — AI Web Agent — 83.5% WebVoyager using hybrid vision + accessibility tree
- WebArena: A Realistic Web Environment for Building Autonomous Agents — Benchmark showing progression from 14% to 62%+ success
- Adrian Roselli — OpenAI and Accessibility — Criticism of ARIA overuse guidance from AI companies
- W3C: Using ARIA — First Rule of ARIA — "If you can use a native HTML element… then do so"
- IsAgentReady: What Is AI Agent Readiness?
- IsAgentReady: SEO vs AEO — How Traditional Search Differs from AI Optimization
- IsAgentReady: What Is MCP? The Model Context Protocol for AI Agents
- IsAgentReady: What Is WebMCP and Why Your Website Needs It
- IsAgentReady: What Is agents.json? Advertising AI Agent Capabilities
- IsAgentReady: What Is Google A2A? Agent-to-Agent Communication Protocol