Does Schema Markup Get You Cited by AI? What the Data Actually Shows

June 09, 2026 • 9 min read

Bart Waardenburg

AI Agent Readiness Expert & Founder

Open almost any guide to AI search optimization and the same advice sits near the top: add JSON-LD, mark up your FAQs, and the AI engines will cite you. It sounds intuitive. Structured data is machine-readable, AI is a machine, so structured data must help AI find and cite you. In 2026 that assumption finally got tested at scale, and the result is uncomfortable for anyone selling schema as a citation lever.

This post reconciles the data. We separate what structured data is actually proven to do from what it is merely correlated with, using a controlled study of 1,885 pages, a retrieval experiment across five AI systems, and the citation research everyone keeps quoting. If you want the broader picture first, start with the guide on SEO vs AEO.

The experiment: schema barely moved citations

In 2026, Ahrefs ran the test the industry had been avoiding. They tracked 1,885 pages that added JSON-LD schema between August 2025 and March 2026, matched them against roughly 4,000 control pages that did not, and measured the change in citations across Google AI Overviews, Google AI Mode, and ChatGPT. If schema were a citation lever, the pages that added it should have pulled ahead.

They did not.

Google AI Overviews: -4.6% (citations after adding schema, versus controls)
Google AI Mode: +2.4% (too small to separate from random noise)
ChatGPT: +2.2% (too small to separate from random noise)

AI Overviews actually dipped, and the small positive moves on AI Mode and ChatGPT were within the range of random variation. Adding schema produced no meaningful lift on any platform.
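To make "versus controls" concrete, here is a minimal sketch of one way such a comparison could be computed. It is not Ahrefs' published method: the function name lift_vs_control and the citation counts are hypothetical, invented purely to show why a control group matters.

```python
def lift_vs_control(treated_before: float, treated_after: float,
                    control_before: float, control_after: float) -> float:
    """Change in the treated group relative to the change in the control group.

    Returns 0.0 when both groups moved by the same factor, a positive value
    when the treated group outgrew the controls, and a negative value otherwise.
    """
    if min(treated_before, control_before, control_after) <= 0:
        raise ValueError("baseline and control counts must be positive")
    treated_ratio = treated_after / treated_before
    control_ratio = control_after / control_before
    return treated_ratio / control_ratio - 1.0


# Hypothetical counts (assumption, not study data): pages that added schema
# went from 200 to 230 citations, while control pages went from 400 to 460.
lift = lift_vs_control(200, 230, 400, 460)
print(f"{lift:+.1%}")  # +0.0%
```

In this toy case the treated pages gained 15%, but so did the controls, so the lift attributable to schema is zero. Looking at the treated pages alone would have suggested a win.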

Why correlation looked like causation

If schema does not cause citations, why does every case study show cited sites covered in structured data? Because the sites that get cited tend to be well-built, and well-built sites tend to have schema. The markup rides along with everything that actually earns the citation: authority, depth, freshness, clean structure. Three figures get quoted constantly, and all three are correlations, not levers.

FAQPage presence: 6.2% vs 0.8% among cited vs non-cited ChatGPT sites (Insightland), an 8x gap
Schema selection rate: +73% correlation in Google AI Overviews (Wellows)
Strong E-E-A-T: 96% share of AI Overview citations (Wellows)

Read these as descriptions of what cited pages look like, not as instructions that produce citations. The 8x FAQPage gap from Insightland means cited sites carry FAQPage schema far more often, not that bolting on FAQPage schema makes you 8x more likely to be cited. The Wellows +73% and 96% figures were measured in Google AI Overviews specifically, and E-E-A-T is a Google quality concept, not proof that any single markup signal causes a citation.
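For reference, the "8x" is simply the ratio of the two Insightland percentages, as the short calculation below shows. The variable names are ours, and the result describes how much more common the markup is on cited sites, nothing more.

```python
# Share of sites carrying FAQPage schema, as quoted from Insightland.
cited_rate = 0.062      # cited ChatGPT sites
non_cited_rate = 0.008  # non-cited ChatGPT sites

# Prevalence ratio: how much more common FAQPage schema is on cited sites.
prevalence_ratio = cited_rate / non_cited_rate
print(round(prevalence_ratio, 2))  # 7.75
```

A prevalence ratio of about 8 compares two groups that already differ in many other ways. It says nothing about what happens to a single page after the markup is added.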

What AI crawlers actually read

Here is the mechanism most guides skip. When an AI assistant fetches your page in real time to answer a question, it does not parse your JSON-LD. A 2025 searchVIU experiment tested ChatGPT, Claude, Perplexity, Gemini, and Google AI Mode, and during direct retrieval every one of them extracted only the visible HTML. JSON-LD, hidden Microdata, and hidden RDFa were all ignored.

So structured data does not reach the model through the page it fetches. Its path is indirect: schema feeds the search index (rich results in Google and Bing), and those indexes feed AI search products. A real pathway, but second-hand, which is exactly why adding schema to an already-indexed page does so little.
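The direct-retrieval behaviour described above can be illustrated with a small sketch. This is not how any particular AI system is implemented. It only assumes a fetcher that keeps visible text and discards script and style blocks, which is enough to show why JSON-LD never reaches the model this way. The class name, helper name, and sample page are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the page's visible text as one space-joined string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: the JSON-LD lives in a <script> tag, the answer in the body.
page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage", "name": "Hidden FAQ"}
</script>
</head><body>
<h2>Does schema get you cited?</h2>
<p>Not on its own.</p>
</body></html>
"""

print(visible_text(page))  # Does schema get you cited? Not on its own.
```

Anything you want a real-time fetcher to see has to be in the visible HTML, because under this assumption the contents of the script tag are dropped before the text is read.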

So is schema useless? No.

Dropping structured data would be the wrong lesson. It still does three concrete jobs, none of which is "directly cause AI citations."

Rich results

Schema earns rich results in Google and Bing, the indexes that feed AI search products. The benefit is indirect but real.

Agent parseability

When an agent does parse structured data, typed entities and Q&A pairs let it extract facts without guessing them out of prose.

Discovery

For pages AI systems have not seen yet, schema can help them get crawled, parsed, and indexed in the first place.
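As a rough illustration of the agent parseability point, the sketch below reads question and answer pairs out of a FAQPage JSON-LD block without touching the prose. The sample markup and the helper name extract_qa_pairs are assumptions for the example. The property names (mainEntity, acceptedAnswer, text) follow the schema.org FAQPage vocabulary.

```python
import json

# Hypothetical FAQPage markup, written for this example.
faq_jsonld = """
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Does schema markup cause AI citations?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "The controlled data shows no meaningful lift."
      }
    }
  ]
}
"""


def extract_qa_pairs(raw: str) -> list[tuple[str, str]]:
    """Return (question, answer) pairs from a FAQPage JSON-LD string."""
    data = json.loads(raw)
    if data.get("@type") != "FAQPage":
        return []
    entities = data.get("mainEntity", [])
    if isinstance(entities, dict):  # a single question may not be wrapped in a list
        entities = [entities]
    pairs = []
    for entity in entities:
        if entity.get("@type") != "Question":
            continue
        answer = entity.get("acceptedAnswer") or {}
        pairs.append((entity.get("name", ""), answer.get("text", "")))
    return pairs


for question, answer in extract_qa_pairs(faq_jsonld):
    print(f"Q: {question}\nA: {answer}")
```

The typed structure is what removes the guesswork: the agent does not need to infer where a question ends and its answer begins.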

What actually drives AI citations

If schema is a minor, indirect factor, where should your effort go? Toward the signals the citation research keeps surfacing. These are still mostly correlational, but they describe cited content far more reliably than markup does.

Authority and brand. Cited domains skew heavily toward established, frequently-referenced sites. This is the strongest pattern across every study.
Comprehensive content. Pages that answer the main question and naturally cover related angles are cited more often than narrow, single-keyword pages.
Question-led structure. In Kevin Indig / Growth Memo's analysis of 3M ChatGPT responses, among citations tied to a question, 78.4% came from a heading, and 44.2% of citations came from the first 30% of the page. AI tends to treat an H2 as a prompt and the text beneath it as the answer.
Freshness. Recently updated content is associated with substantially more citations. Treat it as a strong correlate, not a guaranteed multiplier.
Being in the index. Allow the right crawlers (OAI-SearchBot and friends) so you can be retrieved at all. This is the one true gate.
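The crawler gate in the last item is easy to check mechanically. The sketch below uses Python's standard urllib.robotparser to test whether a given robots.txt would let OAI-SearchBot fetch a URL. Both robots.txt samples and the example.com URL are placeholders, and the helper name can_crawl is ours.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks every crawler, AI search bots included.
blocking_robots = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only keeps crawlers out of /admin/.
open_robots = """\
User-agent: *
Disallow: /admin/
"""

url = "https://example.com/guide"  # placeholder URL
print(can_crawl(blocking_robots, "OAI-SearchBot", url))  # False
print(can_crawl(open_robots, "OAI-SearchBot", url))      # True
```

This only covers robots.txt rules. Firewalls, bot-management rules, and similar layers can block a crawler independently, so a passing check here is necessary but not sufficient.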

Where to invest your effort

A practical priority order that matches the evidence:

Make sure the right AI crawlers can reach you. No access means no citation, no debate.
Write comprehensive, well-structured content with question-led headings and the answer up top.
Keep important pages fresh, with clear dateModified signals.
Build authority the slow way: become the source that other sources reference.
Add structured data, but for the right reasons: rich results and agent parseability, not a citation multiplier.
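For the dateModified step, here is a minimal sketch of generating Article JSON-LD that carries both dates. The helper name article_jsonld and the modification date are hypothetical. datePublished and dateModified are standard schema.org properties, and whether a given engine acts on them is not something this example demonstrates.

```python
import json
from datetime import date


def article_jsonld(headline: str, published: date, modified: date) -> str:
    """Build Article JSON-LD with ISO 8601 datePublished and dateModified."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "Article",
            "headline": headline,
            "datePublished": published.isoformat(),
            "dateModified": modified.isoformat(),
        },
        indent=2,
    )


markup = article_jsonld(
    "Does Schema Markup Get You Cited by AI? What the Data Actually Shows",
    published=date(2026, 6, 9),
    modified=date(2026, 7, 1),  # hypothetical update date
)
print(markup)
```

Pair the markup with a visible "last updated" line on the page, since the retrieval experiment above found that real-time fetchers read the visible HTML rather than the JSON-LD.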

The bottom line

Schema markup is table stakes for machine-readability and a sensible investment for rich results and agent parseability. It is not a switch you flip to get cited by AI. The controlled data is clear: adding it to a page does not move citations. So keep your structured data clean, then spend the rest of your energy on the things that actually correlate with being cited: authority, comprehensiveness, structure, and freshness. Our scanner scores all of these, and it is honest about which are levers and which are correlates.

Curious how the individual AI systems choose sources? Read the companion posts on how ChatGPT chooses which websites to cite, how Google AI Overviews selects sources, and what AI agent readiness means.

Sources

Ahrefs: We Tracked 1,885 Pages Adding Schema - Controlled study of schema's effect on AI citations
searchVIU: What AI assistants actually parse - Retrieval test across five AI systems
Kevin Indig / Growth Memo: The science of how AI picks its sources - 3M responses, 30M citations, content-structure analysis
Insightland: Structured Data and AI Search - FAQPage schema visibility correlation
Wellows: AI Overview Ranking Factors - Schema and E-E-A-T correlations in Google AI Overviews
