How Claude Search Selects Sources to Cite
Claude's web search differs from ChatGPT and Google AI Overviews in one important way: it relies almost entirely on a single search backend, Brave Search. One analysis found 86.7% alignment between Claude's citations and Brave's organic results. Understand that relationship and you understand how to get cited by Claude.
I dug through Anthropic's crawler documentation, API specs, third-party studies, and technical disclosures to figure out how Claude picks its sources and what makes it different from the rest.
The Brave Search Backbone
In March 2025, TechCrunch confirmed that Claude's web search runs on Brave Search. That changes quite a bit about how you optimize for Claude visibility.
The BrightEdge analysis put numbers on it: 86.7% of Claude's cited results overlap with Brave's top non-sponsored organic results. For comparison, ChatGPT shows only 26.7% alignment with Bing's top results. Claude trusts its search backend way more than ChatGPT trusts Bing.
Bottom line: ranking well in Brave Search is ranking well in Claude. But Brave's index has a threshold. Content needs visits from at least 20 unique Brave browser users with data-sharing enabled before becoming eligible for indexing. That gives established domains with diverse traffic an automatic head start.
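To make the alignment figure concrete, here is a minimal sketch that treats it as a set-overlap ratio: the share of cited URLs that also appear among the backend's top organic results. This is an assumed reading of the metric, not BrightEdge's published methodology, and the URLs are hypothetical placeholders.

```python
def alignment(cited_urls, backend_top_urls):
    """Share of cited URLs that also appear in the backend's top results.

    Assumed definition for illustration only; the study's exact
    methodology is not described in this article.
    """
    cited = set(cited_urls)
    if not cited:
        return 0.0
    return len(cited & set(backend_top_urls)) / len(cited)


# Hypothetical sample: 15 cited URLs, 13 of which are also top organic results.
cited = [f"https://example.com/page-{i}" for i in range(15)]
backend_top = cited[:13] + ["https://example.org/other"]

print(round(alignment(cited, backend_top) * 100, 1))
```

With these made-up inputs the ratio is 13 out of 15. The point is only that a high value means most citations could be predicted from the backend's ranking alone.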
Three Crawlers, Three Purposes
Anthropic operates three separate crawlers, each with its own purpose. The documentation was last updated February 20, 2026, when the newest crawler was added.
CLAUDE-SEARCHBOT
Indexes and evaluates content quality for search results. Blocking this reduces your visibility and accuracy in Claude-powered search.
CLAUDEBOT
Crawls content for AI model training data. Can be blocked independently from search without affecting visibility.
CLAUDE-USER
Fetches pages when users explicitly ask Claude to read a specific URL. Still honors robots.txt, unlike OpenAI's equivalent.
The big difference from OpenAI: all three of Anthropic's crawlers still honor robots.txt, including Claude-User. OpenAI stopped respecting robots.txt for their ChatGPT-User bot in December 2025. Anthropic still plays by the rules. They also support the non-standard Crawl-delay directive and don't try to bypass CAPTCHAs.
As Search Engine Journal reported, this three-bot system gives site owners finer-grained control than any other AI platform. You can allow search indexing while blocking training, or allow user browsing while restricting automated crawling.
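If you want to see which of the three bots actually visits your site, a rough sketch like the following can tally requests by purpose. It assumes you have already pulled user-agent strings out of your access logs and that each bot's name appears somewhere in its user-agent string; the sample strings are hypothetical, not Anthropic's exact headers.

```python
from collections import Counter

# Bot name (lowercased) -> purpose, as described in this article.
CRAWLER_PURPOSE = {
    "claude-searchbot": "search indexing",
    "claude-user": "user-requested fetch",
    "claudebot": "training data",
}


def classify_user_agent(user_agent):
    """Return the crawler's purpose, or None if it is not an Anthropic bot."""
    ua = user_agent.lower()
    for token, purpose in CRAWLER_PURPOSE.items():
        if token in ua:
            return purpose
    return None


def count_anthropic_hits(user_agents):
    """Tally requests per purpose, ignoring non-Anthropic traffic."""
    counts = Counter()
    for ua in user_agents:
        purpose = classify_user_agent(ua)
        if purpose is not None:
            counts[purpose] += 1
    return counts


# Hypothetical user-agent strings for illustration.
sample = [
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "Mozilla/5.0 (compatible; Claude-SearchBot/1.0)",
    "Mozilla/5.0 (compatible; Claude-SearchBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0",
]
print(count_anthropic_hits(sample))
```

User-agent strings can be spoofed, so treat the counts as a rough signal rather than proof of who is crawling.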
The recommended robots.txt for granular Anthropic control:
# Allow Claude search indexing
User-agent: Claude-SearchBot
Allow: /
# Allow Claude user-initiated browsing
User-agent: Claude-User
Allow: /
# Block AI model training (optional)
User-agent: ClaudeBot
Disallow: /
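Before deploying rules like these, it can help to check that they do what you expect. The sketch below uses Python's standard-library urllib.robotparser to evaluate the same rules for each bot. The example.com URL is a placeholder, and Python's parser is only an approximation of how any particular crawler interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# Allow Claude search indexing
User-agent: Claude-SearchBot
Allow: /

# Allow Claude user-initiated browsing
User-agent: Claude-User
Allow: /

# Block AI model training (optional)
User-agent: ClaudeBot
Disallow: /
"""


def crawler_access(robots_txt, url, agents):
    """Map each user agent to whether the rules allow it to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


access = crawler_access(
    ROBOTS_TXT,
    "https://example.com/blog/post",  # placeholder URL
    ["Claude-SearchBot", "Claude-User", "ClaudeBot"],
)
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Under these rules the search and user-browsing bots should be reported as allowed and the training crawler as blocked.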
How Claude Actually Selects Sources
Claude's source selection is a multi-step process, documented in Anthropic's API documentation:
- Decision to search: Claude autonomously decides whether to search based on three criteria - freshness (does the query need current info?), specificity (how targeted is the question?), and intent (what's the underlying purpose?)
- Query execution: The Brave Search API returns the top organic results
- Content evaluation: Claude filters and evaluates results based on relevance, clarity, and extractability
- Iteration: This cycle can repeat up to ten times in a single conversation turn, refining the search as Claude learns more
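The four steps above can be pictured as a bounded loop. The sketch below is a toy model of that control flow, not Anthropic's implementation: should_search, run_search, is_usable, and refine are hypothetical callbacks standing in for decisions the model makes internally, and the cap of ten rounds is taken from the list above.

```python
from dataclasses import dataclass

MAX_SEARCHES_PER_TURN = 10  # upper bound stated in the list above


@dataclass(frozen=True)
class Result:
    url: str
    title: str
    text: str


def select_sources(question, should_search, run_search, is_usable, refine,
                   max_rounds=MAX_SEARCHES_PER_TURN):
    """Toy search loop: decide, query, filter, then refine and repeat."""
    if not should_search(question):       # step 1: decision to search
        return []
    kept = []
    query = question
    for _ in range(max_rounds):           # step 4: bounded iteration
        for result in run_search(query):  # step 2: query execution
            # step 3: keep only results judged relevant and extractable
            if is_usable(question, result) and result not in kept:
                kept.append(result)
        query = refine(question, kept)    # None means "enough information"
        if query is None:
            break
    return kept
```

For site owners, the useful part of this picture is step 3: a page that ranks but is hard to extract a clear answer from can still be dropped before citation.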
An interesting detail from the Groundy analysis: Claude favors content that is "concise, current, and aligns closely with the user's phrasing and intent." Pages need to match conversational query patterns, so content written in a natural, question-answering style performs better than keyword-stuffed copy. This fits right into the shift from traditional SEO toward AI Engine Optimization (AEO), where writing for how people ask questions matters more than keyword density.
Dynamic Filtering: Why Clean HTML Matters
In February 2026, Anthropic shipped something that caught my attention: dynamic filtering. Claude can now write and execute Python code to post-process raw HTML before it reaches the context window.
In practice, Claude actively strips away:
- Navigation menus and sidebars
- Footer content and boilerplate
- Advertising and tracking markup
- Irrelevant metadata
Dynamic filtering is currently available only on Opus 4.6 and Sonnet 4.6 via the Claude API and Azure (not Vertex AI), and requires the code execution tool to be enabled alongside web search.
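To get a feel for what this kind of post-processing might look like, here is a minimal sketch using Python's standard-library HTML parser. It is not the code Claude generates (which is written on the fly per query); the set of tags to skip is an assumption based on the list above, and the sample page is made up.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as boilerplate (assumed list).
SKIP_TAGS = {"nav", "aside", "footer", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not nested inside any boilerplate tag."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html):
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    "<body><nav>Home | About</nav>"
    "<main><article><h1>Title</h1><p>Body text.</p></article></main>"
    "<footer>Copyright</footer></body>"
)
print(extract_main_text(page))  # prints "Title Body text."
```

The sketch only works because the sample page marks its boilerplate with semantic tags. On a page built from anonymous div elements, a filter has much less to go on.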
How Claude Cites Sources
Claude uses inline citations with clickable source links, similar to ChatGPT but different from Perplexity's footnote-heavy approach. Every web-sourced claim includes:
- URL: The source page URL
- Title: The source page title
- Cited text: Up to 150 characters of the specific content being cited
- Encrypted index: A reference for maintaining citations in multi-turn conversations
Nice detail from the API docs: citation metadata (cited_text, title, url) does not count toward input or output token usage. That saves money when building applications with Claude's web search.
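If you are building on the API, a small helper can gather these citation fields into a deduplicated source list. The sketch below assumes each text block in a response is a dictionary with an optional "citations" list whose entries carry the url, title, and cited_text fields named above. The sample response is hypothetical, so check the layout against the actual API response before relying on it.

```python
def collect_citations(content_blocks):
    """Return one entry per unique cited URL, in first-seen order."""
    seen = set()
    sources = []
    for block in content_blocks:
        for citation in block.get("citations") or []:
            url = citation.get("url")
            if url and url not in seen:
                seen.add(url)
                sources.append({
                    "url": url,
                    "title": citation.get("title", ""),
                    "cited_text": citation.get("cited_text", ""),
                })
    return sources


# Hypothetical response content for illustration.
blocks = [
    {"type": "text", "text": "Intro without sources."},
    {"type": "text", "text": "Claim one.", "citations": [
        {"url": "https://example.com/a", "title": "Page A", "cited_text": "Quote A"},
    ]},
    {"type": "text", "text": "Claim two.", "citations": [
        {"url": "https://example.com/a", "title": "Page A", "cited_text": "Quote A2"},
        {"url": "https://example.com/b", "title": "Page B", "cited_text": "Quote B"},
    ]},
]
for source in collect_citations(blocks):
    print(source["title"], source["url"])
```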
Claude "only cites what it can verify" and avoids hallucinated citations. Can't verify a claim against search results? It'll either omit the citation or qualify the statement. No making up plausible-looking references.
The Citations API: A Separate Feature
A distinction that's easy to miss. Claude's web search citations (discussed above) are different from the separate Citations API launched in January 2025. That API lets developers ground Claude's responses in user-provided documents (PDFs, plain text, custom content) with precise character-level references.
Internal evaluations show the Citations API increases recall accuracy by up to 15% compared to custom prompt-based implementations. But it's a developer tool for supplied documents. It has no impact on how your website gets discovered or cited in Claude's web search.
No Publisher Licensing Deals
OpenAI has formal licensing deals with AP, Conde Nast, Financial Times, News Corp, The Atlantic, Axel Springer, and Washington Post. Anthropic? No announced publisher licensing partnerships.
What Anthropic does have is a $1.5 billion copyright settlement (September 2025), the largest in U.S. history, covering roughly 500,000 copyrighted works. The settlement covers only past use (before August 25, 2025) and is explicitly not a licensing deal for future use.
For website owners, this means there's no "preferred publisher" list for Claude citations. Every site competes on equal footing through Brave Search rankings and content quality. That makes the technical optimization covered in this article all the more relevant.
LEVEL PLAYING FIELD
No publisher licensing deals means every website competes on equal terms. Your visibility in Claude depends entirely on Brave Search rankings and content quality, not on corporate partnerships.
How Claude Compares to ChatGPT and Google
| Dimension | Claude | ChatGPT | Google AI Overviews |
|---|---|---|---|
| Search backend | Brave Search | Bing (+ Google for paid) | Google Search |
| Backend alignment | 86.7% with Brave | 26.7% with Bing | Native integration |
| User-browsing bot | Honors robots.txt | Ignores robots.txt (since Dec 2025) | N/A |
| Content processing | Dynamic filtering (strips boilerplate) | Direct content ingestion | Full index processing |
| Publisher deals | None | AP, Conde Nast, FT, News Corp | Various licensing agreements |
| Citation style | Inline clickable links | Inline links in text | Source cards with URLs |
| Crawl-delay support | Yes | Not documented | No |
What You Can Do Today
Based on Claude's architecture, these are the things that make the biggest difference:
1. ALLOW CLAUDE-SEARCHBOT
This is the gate to Claude visibility. Blocking this bot reduces your presence in Claude's search answers.
2. OPTIMIZE FOR BRAVE
With 86.7% alignment, ranking in Brave Search is effectively ranking in Claude. Ensure Brave can index your content.
3. CLEAN YOUR HTML
Claude's dynamic filtering strips boilerplate. Clean semantic HTML with content-first structure gives you an edge.
4. WRITE CONVERSATIONALLY
Claude favors content that matches conversational query patterns. Write naturally, not keyword-stuffed.
- Use semantic HTML (<article>, <main>, <section>) to help Claude's filtering understand your content structure
- Server-side render your content. Claude's crawlers cannot execute client-side JavaScript
- Keep content concise and current. Claude filters for relevance and freshness
- Add structured data. While Claude relies on Brave, structured data improves Brave rankings, which flows through to Claude. Across platforms, schema markup shows a +73% selection rate in AI Overviews, and sites with FAQPage schema are 8× more likely to be cited by ChatGPT
- Maintain an XML sitemap. Aids content discovery for all crawlers including Claude-SearchBot
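As one way to act on the structured-data item, the sketch below builds FAQPage markup as JSON-LD using the schema.org vocabulary. The question and answer text are placeholders. How much this markup affects Brave rankings or Claude citations is this article's claim, not something the code demonstrates.

```python
import json


def faq_jsonld(pairs):
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


# Placeholder content for illustration.
markup = faq_jsonld([
    ("Which search backend does Claude use?", "Brave Search."),
])
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The printed script tag goes in the server-rendered HTML of the page, so crawlers that do not run JavaScript can still read it.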
Wrapping Up
Of the three major AI platforms, Claude's source selection is the most transparent, with 86.7% alignment with Brave Search. There's no mystery about how to get cited: rank well in Brave, allow Claude-SearchBot, and write clean, well-structured content that matches how people naturally ask questions.
The advantages of optimizing for Claude: Anthropic respects all robots.txt directives (including for user-initiated browsing), offers the most fine-grained crawler control, and has no preferred publisher list. The result is a level playing field where content quality and technical execution determine visibility.
For the full picture across AI platforms, read our analyses of how ChatGPT chooses which websites to cite and how Google AI Overviews selects sources. For broader trends, see key insights from Vercel's 2026 AEO report.
Sources
- Anthropic: Crawler Documentation - Three-crawler system and robots.txt guidance
- Anthropic: Web Search Tool API Documentation - Official search tool specifications
- Anthropic: Web Search Now Available Globally - Official launch announcement
- Anthropic: Improved Web Search with Dynamic Filtering - Dynamic filtering technical details
- Anthropic: Introducing Citations API - Document-level citations feature
- TechCrunch: Anthropic Uses Brave for Web Search - Brave Search backend confirmation
- BrightEdge: The Ultimate Guide to Claude Search - 86.7% Brave Search alignment analysis
- Groundy: Claude's Web Search Changes Everything - Content selection behavioral analysis
- Search Engine Journal: Anthropic's Claude Bots - Three-bot system analysis
- Search Engine Land: Anthropic Clarifies Claude Bots - Crawler documentation updates
- Ropes & Gray: Anthropic Copyright Settlement - $1.5B settlement analysis