How Google AI Overviews Selects Sources to Cite
Google AI Overviews now appear for roughly 30% of U.S. desktop searches, and their frequency on mobile is growing 475% year-over-year. These AI-generated summaries sit above traditional search results and cite specific websites, which changes what an organic ranking is worth.
But Google's official documentation says "there are no additional requirements" to appear in AI Overviews. Is that the whole story? I analyzed official sources, large-scale studies covering 500+ million keywords, and 16 months of longitudinal data to find out what actually gets you cited. This is part of our series on how AI platforms select sources, alongside our guides on how ChatGPT selects sources and how Claude selects sources.
The Query Fan-Out Technique
The most distinctive thing about Google AI Overviews is query fan-out. Instead of running a single search, the Gemini-powered system breaks your question into subtopics and runs multiple searches simultaneously.
From the official Google blog:
With the Gemini 3 upgrade, this technique became even more sophisticated. Google's latest model "more intelligently understands user intent" and "can find new content that it may have previously missed."
A study analyzing 173,902 URLs and 33,000 fan-out queries quantified the impact:
- Pages ranking for both the main query and at least one fan-out query accounted for 51% of AI Overview citations
- Pages ranking for fan-out queries are 49% more likely to earn a citation than pages ranking for the head term alone
- Spearman correlation of 0.77 between fan-out breadth and citation likelihood
What this means in practice: comprehensive content that covers related subtopics performs significantly better. A page that answers the main question and naturally covers related angles is far more likely to be cited than one narrowly focused on a single keyword. That is the exact opposite of the hyper-focused, single-keyword content that dominated traditional SEO.
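To make "fan-out breadth" concrete, here is a minimal sketch of how you might measure it for one of your own pages. The query lists are made up for illustration, and the real fan-out queries Google generates are not published, so in practice you would have to approximate them (for example from related searches or "People also ask" questions).

```python
def fan_out_coverage(ranking_queries: set[str], fan_out_queries: set[str]) -> float:
    """Return the share of fan-out queries that the page also ranks for."""
    fan_out = {q.lower() for q in fan_out_queries}
    if not fan_out:
        return 0.0
    ranking = {q.lower() for q in ranking_queries}
    return len(ranking & fan_out) / len(fan_out)


# Hypothetical fan-out queries for the head term "how do ai overviews pick sources"
fan_out_queries = {
    "ai overviews schema markup",
    "ai overviews vs featured snippets",
    "does blocking google-extended affect ai overviews",
    "ai overviews organic ranking correlation",
}

# Hypothetical queries the page currently ranks for
ranking_queries = {
    "how do ai overviews pick sources",
    "ai overviews schema markup",
    "ai overviews organic ranking correlation",
}

coverage = fan_out_coverage(ranking_queries, fan_out_queries)
print(f"Fan-out coverage: {coverage:.0%}")
```

A low score points at subtopics the page does not address yet, which is where the studies above suggest the extra citations come from.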
The Organic Ranking Connection
Multiple large-scale studies show a strong but nuanced relationship between traditional organic rankings and AI Overview citations.
What the Studies Found
seoClarity analyzed 500+ million keywords and found that 97% of AI Overviews cite at least one source from the top 20 organic results. The #1 position appears more than half the time.
Originality.ai found that 52% of AI Overview citations come from top-10 Google results. The top-ranked document alone has a 58% chance of being cited. By the top 30, nearly 90% of all citations are covered.
BrightEdge's 16-month longitudinal study across 9 industries found that AI Overview citations from organically-ranking pages grew from 32.3% to 54.5% (a 69% relative increase). Only 16.7% of citations came from top-10 results, meaning positions 21-100 drive most of the overlap growth.
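The "69% relative increase" is easy to confuse with a percentage-point change, so here is the arithmetic as a small sketch. The helper name is ours. The two input figures are the BrightEdge numbers quoted above.

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


overlap_start = 32.3  # % of citations from organically ranking pages at the start
overlap_end = 54.5    # % after 16 months

point_change = overlap_end - overlap_start               # percentage points
growth = relative_change(overlap_start, overlap_end)     # relative increase

print(f"{point_change:.1f} percentage points, {growth:.1f}% relative increase")
```

The overlap rose by about 22 percentage points, which is a relative increase of just under 69%.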
The surprising finding: 68% of cited pages didn't rank in the top 10 for either the main query or any fan-out query. AI Overviews give deeper-ranking, authoritative content a platform it never had in traditional search.
Industry Variation Is Significant
The BrightEdge study showed major differences by industry:
- Healthcare, Education, B2B Tech, Insurance: 68-75% overlap between AI Overview citations and organic rankings (trust-sensitive YMYL content)
- E-commerce: Only 22.9% overlap with virtually no change over 16 months
- Restaurants and Travel: Under 24% overlap
What Google Says vs. What the Data Shows
Google's official documentation keeps things deliberately vague about what it takes to appear in AI Overviews:
- "There are no additional requirements to appear in AI Overviews or AI Mode."
- "You don't need to create new machine readable files, AI text files, or markup."
- "There's also no special schema.org structured data that you need to add."
- To be eligible, a page must be indexed and eligible to be shown in Google Search with a snippet.
But third-party research keeps showing the same thing: certain factors dramatically improve your chances. There's a gap between what's required (nothing special) and what actually works (quite a lot):
Structured Data: +73% Selection Rate
Google says no special schema is needed. But Wellows' analysis found that schema markup correlates with a +73% selection rate for AI Overview citations. A Search Engine Land analysis found that "only the page with well-implemented schema appeared in an AI Overview and achieved the best organic ranking, suggesting that schema quality, not just its presence, may play a role."
So how do both things hold at once? Structured data is not a requirement, but it improves organic rankings (the primary pathway to AI Overview citations), helps Google understand entities, and makes content more machine-parseable. All of these improve citation likelihood indirectly. The pattern holds across platforms: ChatGPT research shows sites with FAQPage schema are 8x more likely to be cited than those without. More on why structured data matters for all AI platforms in our guide to AI agent readiness.
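For reference, here is a minimal sketch of generating Article and FAQPage JSON-LD. The schema.org types and properties are real, but every name, URL, and answer below is a placeholder, and nothing here is a Google requirement. Validate the output with a structured data testing tool before shipping it.

```python
import json


def build_json_ld(headline, author_name, author_url, publisher_name, faqs):
    """Build a JSON-LD graph with one Article and one FAQPage node."""
    article = {
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name, "url": author_url},
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    faq_page = {
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return {"@context": "https://schema.org", "@graph": [article, faq_page]}


def to_script_tag(data):
    """Wrap the JSON-LD in the script tag that goes into the page's HTML."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


# Placeholder values for illustration only
json_ld = build_json_ld(
    headline="How Google AI Overviews Selects Sources to Cite",
    author_name="Jane Example",
    author_url="https://example.com/authors/jane-example",
    publisher_name="Example Publisher",
    faqs=[
        (
            "Does blocking Google-Extended remove a site from AI Overviews?",
            "No. Google-Extended only controls use of content for AI model training.",
        )
    ],
)
print(to_script_tag(json_ld))
```

Generating the markup from the same data that renders the page keeps the two in sync, which matters because schema that contradicts the visible content is worse than no schema.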
E-E-A-T: 96% of Citations Come from Strong Signals
Wellows' research found that 96% of AI Overview citations come from sources with strong E-E-A-T signals (Experience, Expertise, Authoritativeness, Trustworthiness).
- Pages with expert authorship are 3.2x more likely to be cited than generic staff-written content (Relixir)
- Author Schema that connects content to real human experts in the Knowledge Graph strengthens these signals
- For YMYL topics (healthcare, finance), E-E-A-T verification is especially important. These industries show the highest overlap with organic rankings
Entity Richness: 4.8x Higher Selection
Pages with 15+ recognized entities show 4.8x higher AI Overview selection probability. Google's Knowledge Graph stores entities and their relationships: people, products, organizations, concepts. Content with lots of well-connected entities is simply easier for Google's AI to verify and cite.
Google's Livegraph system assigns confidence weights to every identified triple (subject-predicate-object). Pages without strong entity signals get filtered out early, even if they're well-written. The AI needs to connect content to verified entities in the Knowledge Graph.
Content Format: Answer Units of 134-167 Words
Content that matches how AI Overviews are structured performs best:
- Self-contained answer units of 134-167 words perform best
- Pages using lists, tables, or FAQs align with how AI summaries are structured
- 44.2% of citations come from the first 30% of text. Front-load key information
- Multi-modal content (text + images + video) shows +156% selection rate
- Content scoring 8.5/10+ on semantic completeness is 4.2x more likely to be cited
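A quick way to audit a draft against the 134-167 word range is to count words per section. This is a rough sketch: it splits on whitespace, the section bodies are dummy text, and the thresholds are simply the figures quoted above rather than anything Google has published.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def check_answer_units(sections: dict[str, str], low: int = 134, high: int = 167):
    """Map each heading to (word count, whether the count falls within [low, high])."""
    report = {}
    for heading, body in sections.items():
        words = count_words(body)
        report[heading] = (words, low <= words <= high)
    return report


# Dummy section bodies standing in for real copy
sections = {
    "What is query fan-out?": " ".join(["word"] * 150),
    "Does schema help?": " ".join(["word"] * 40),
}

report = check_answer_units(sections)
for heading, (words, in_range) in report.items():
    status = "ok" if in_range else "outside range"
    print(f"{heading}: {words} words ({status})")
```

Treat the result as a prompt to review a section, not as a pass/fail gate. A short section that fully answers its question does not need padding.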
The Google-Extended Confusion
Contrary to a common misconception, blocking Google-Extended in robots.txt does not affect AI Overviews.
- Google-Extended is a robots.txt user-agent token specifically for AI model training data collection
- AI Overviews use standard Googlebot for crawling, not Google-Extended
- Blocking Google-Extended has no impact on search rankings, indexation, or AI Overview visibility
- Only blocking Googlebot itself would remove you from search entirely
As confirmed by Playwire's analysis, you can safely block AI training crawlers without hurting your AI search visibility. The opt-out situation is still moving though. As of early 2026, Google is exploring ways to let sites opt out of AI Overviews specifically, separate from traditional search.
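If you want to sanity-check your own rules, Python's standard-library robots.txt parser can confirm that a Google-Extended block leaves Googlebot untouched. The robots.txt content and the URL below are examples, and the parser only models the robots.txt matching rules. It says nothing about how Google behaves beyond that.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: opt out of AI training, leave normal crawling alone
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/guides/ai-overviews"  # placeholder URL
googlebot_allowed = parser.can_fetch("Googlebot", url)
extended_allowed = parser.can_fetch("Google-Extended", url)

print(f"Googlebot allowed: {googlebot_allowed}")
print(f"Google-Extended allowed: {extended_allowed}")
```

With these rules, Googlebot remains allowed while Google-Extended is blocked, which is the configuration the section above describes as safe.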
Most Cited Domains in AI Overviews
The SurferSEO AI Citation Report (36 million AI Overviews, 46 million citations) shows that video content dominates:
- YouTube (~23.3%), the most cited domain across every vertical
- Wikipedia (~18.4%)
- Google.com (~16.4%)
- Reddit, LinkedIn, Facebook round out the top tier
Domain-specific experts like NIH, Shopify, and ScienceDirect show up as trusted names within their niches. AI Overviews distribute citations more evenly among niche sites than ChatGPT does.
The Semrush study (150,000+ citations) found that the top 20 domains account for 66% of all citations. Still concentrated, but leaving real room for specialized, authoritative content.
The Semrush AI Overviews study (10+ million keywords tracked from January-November 2025) found that 84% of AI Overviews appear for informational queries, 12.5% for transactional keywords (rising trend), and just 0.01% for local keywords.
When Do AI Overviews Appear?
AI Overviews don't appear for every search. Knowing the trigger patterns helps you focus your optimization:
- seoClarity: 30% of U.S. desktop keywords trigger AI Overviews (September 2025)
- Mobile AI Overview frequency grew 475% year-over-year
- AI Overviews peaked at ~25% in July 2025, then declined to 15.69% by November. Google is being more selective
- Average AI Overview text length dropped 70% (from ~5,300 to ~1,600 characters), producing shorter, more focused summaries
Google also claims positive engagement: according to the official Google blog, "when people click from search results pages with AI Overviews, these clicks are higher quality (meaning, users are more likely to spend more time on the site)."
What You Can Do Today
1. ADD QUALITY SCHEMA
Implement JSON-LD with Organization, Article, FAQPage, and Author schema. Quality matters more than quantity: schema with errors can hurt.
2. COVER SUBTOPICS
Write comprehensive content that covers related angles. The fan-out technique rewards breadth, not just depth on a single keyword.
3. BUILD E-E-A-T
Add author bios, expert quotes, and connect content to real entities. Expert authorship provides a 3.2x citation boost.
4. FORMAT FOR EXTRACTION
Use lists, tables, FAQs, and self-contained answer units of 134-167 words. These formats align with how AI Overviews are structured.
- Front-load key information. About 44% of citations come from the first 30% of the text
- Use specific entities. Pages with 15+ recognized entities show 4.8x higher selection
- Add images and video. Multi-modal content shows +156% selection rate
- Don't block Googlebot. Google-Extended is separate and safe to block
- Server-side render your content. Make sure important information is in the HTML source. See our deep dive on how AI agents see your website
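For the last point, one rough way to test whether a key sentence is really in the HTML source is to parse the raw markup and ignore anything inside script or style tags. The sketch below uses only the standard library. Both HTML snippets are invented examples, and in practice you would feed in the response body fetched without executing JavaScript.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = ("script", "style")

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data)


def phrase_in_html_source(html: str, phrase: str) -> bool:
    """True if `phrase` appears in the markup's text content outside scripts and styles."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return phrase.lower() in text.lower()


# Invented examples: the same sentence rendered on the server vs. injected by JavaScript
server_rendered = "<html><body><h1>Pricing</h1><p>Plans start at $10 per month.</p></body></html>"
client_rendered = (
    '<html><body><div id="root"></div>'
    '<script>render("Plans start at $10 per month.")</script></body></html>'
)

print(phrase_in_html_source(server_rendered, "Plans start at $10 per month"))
print(phrase_in_html_source(client_rendered, "Plans start at $10 per month"))
```

The first check passes and the second fails, because in the client-rendered page the sentence only exists inside a script until JavaScript runs.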
Wrapping Up
Google officially says "just do good SEO." The data says that's necessary but not sufficient. Structured data, E-E-A-T signals, entity richness, comprehensive content, and the right formatting all measurably improve citation likelihood, especially outside YMYL topics, where organic ranking overlap is lowest.
The biggest opportunity? 68% of cited pages don't rank in the top 10. AI Overviews are giving deeper-ranking authoritative content a platform. If you've been stuck on page two of Google, AI Overviews might be your way onto page one's equivalent. More on the shift from traditional SEO to AI optimization in our SEO vs AEO comparison.
Sources
- Google: AI Features and Your Website - Official documentation on AI Overviews requirements
- Google Blog: Expanding AI Overviews and AI Mode - Official description of the fan-out technique
- Google Blog: Gemini 3 in Search - Latest model upgrades for AI Overviews
- Google Blog: AI in Search Driving Higher Quality Clicks - Click quality from AI Overviews
- Search Engine Land: Fan-Out Rankings Study - 173,902 URLs analysis
- seoClarity: AI Overviews Impact Study - 500+ million keywords analysis
- Originality.ai: Google Ranking and AI Citations - Organic ranking correlation
- BrightEdge: 16 Months of AI Overviews - Longitudinal overlap study by industry
- Wellows: Google AI Overviews Ranking Factors - Schema, E-E-A-T, and entity correlation
- Relixir: How to Earn Citations in AI Overviews - Expert authorship impact
- SurferSEO: AI Citation Report - 36 million AI Overviews analyzed
- Semrush: Most Cited Domains by AI - 150,000+ citations analysis
- Semrush: AI Overviews Study - 10+ million keywords tracking
- Search Engine Land: Schema and AI Overviews Visibility - Schema quality impact
- DataDome: Google-Extended Explained - Training vs. search crawler distinction