In March 2025, Google and Microsoft made something official: they actively use schema markup during AI response generation. At SMX Munich 2025, Microsoft's Principal Product Manager stated directly that "schema markup helps Microsoft's LLMs understand your content."
This wasn't a surprise to anyone paying attention. But the public confirmation changed the conversation. Schema markup isn't just a nice-to-have for rich snippets anymore. It's a direct input into whether AI engines understand your content well enough to cite it.
What Is Schema Markup, and Why Does It Matter for AI?
Schema markup is structured data you add to your HTML, typically as a JSON-LD block in the <head>, that describes your content to machines using the schema.org vocabulary.
For traditional search, schema enabled rich results: star ratings, FAQ dropdowns, how-to steps, and recipe cards in the SERP. For AI search, the stakes are higher.
Here's the key difference: When a user asks ChatGPT or Perplexity a question, the AI doesn't just read your content the way a human does. It processes the page, extracts meaning, and decides whether it trusts the source enough to cite it. Schema markup compresses that process. It tells the engine:
- What type of content this is (article, FAQ, product, organization)
- Who wrote it and when
- What entity or business is behind it
- What the page is definitively about
Without schema, AI engines have to infer all of this from unstructured text, and they often infer it incorrectly or not at all.
Why JSON-LD Specifically
There are three formats for structured data: JSON-LD, Microdata, and RDFa. Google recommends JSON-LD. The AI ecosystem has aligned on it.
The reason is practical: JSON-LD lives in its own <script type="application/ld+json"> block, usually in the <head>, independent of your HTML structure. You can add, update, or remove it without touching your visual content. It's maintainable, easy to validate, and doesn't interfere with CSS or layout.
Microdata and RDFa both require embedding attributes directly in your HTML elements, which is messy, fragile, and hard to maintain at scale. For AI optimization, JSON-LD is the format. Use it.
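To make the "separate block" idea concrete, here is a minimal Python sketch that serializes a schema.org object into a JSON-LD script tag. The helper name render_json_ld and the example.com values are placeholders for illustration, not part of any library.

```python
import json


def render_json_ld(data: dict) -> str:
    """Serialize a schema.org object into a JSON-LD <script> tag."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "<" so a value containing "</script>" cannot close the tag early.
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values; replace with real page data.
page_schema = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "name": "Example page",
    "url": "https://example.com/",
}

print(render_json_ld(page_schema))
```

Because the output is a single self-contained tag, a template can drop it into the <head> without any change to the visible markup.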
The Schema Types That Matter Most for AI Citations
1. Organization
What it tells AI engines: Who you are as a business — your name, what you do, where to find you, and how to verify your identity across the web.
Why it matters: AI models need to establish that your brand is a real, credible entity. Organization schema with sameAs links (pointing to your LinkedIn, Twitter/X, Crunchbase, Wikipedia, and other authoritative profiles) gives AI engines a web of cross-references to verify against.
Minimum viable implementation: Organization schema with name, url, description, logo, sameAs (your social profiles), and contactPoint. Add it to every page of your site, starting with the homepage and About page.
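As a rough illustration of that minimum field set, the sketch below builds an Organization object in Python. The function name, the "#organization" identifier convention, and every example.com value are assumptions made for the example.

```python
import json


def organization_schema(name, url, description, logo, same_as, contact_email):
    """Build an Organization object with the fields listed above."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        # A stable @id lets other schema objects reference this entity.
        "@id": url.rstrip("/") + "/#organization",
        "name": name,
        "url": url,
        "description": description,
        "logo": logo,
        "sameAs": list(same_as),
        "contactPoint": {
            "@type": "ContactPoint",
            "contactType": "customer support",
            "email": contact_email,
        },
    }


# All values below are placeholders.
org = organization_schema(
    name="Example Co",
    url="https://example.com/",
    description="Example Co builds scheduling software for clinics.",
    logo="https://example.com/logo.png",
    same_as=[
        "https://www.linkedin.com/company/example-co",
        "https://x.com/exampleco",
    ],
    contact_email="support@example.com",
)

print(json.dumps(org, indent=2))
```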
2. FAQPage
What it tells AI engines: That this page contains question-answer pairs with direct, structured answers — exactly the format AI engines extract and cite.
Why it matters: Pages with FAQPage schema are 3.2× more likely to appear in Google AI Overviews, the highest citation multiplier of any schema type tested. That makes FAQPage the single highest-ROI schema type for GEO, because AI engines are built to extract and present Q&A content. When you provide it in structured form, you eliminate ambiguity.
Best practice: Keep answers between 50 and 150 words. Write each one so it answers the question fully on its own, not as a teaser for the rest of the article. AI engines often extract the answer text verbatim.
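The sketch below shows one way to build FAQPage markup in Python while enforcing the 50 to 150 word guideline. The function name and the sample question are hypothetical, and the word count is a simple whitespace split rather than anything an AI engine is known to use.

```python
import json


def faq_page_schema(pairs, min_words=50, max_words=150):
    """Build FAQPage markup, rejecting answers outside the word range."""
    entities = []
    for question, answer in pairs:
        word_count = len(answer.split())
        if not min_words <= word_count <= max_words:
            raise ValueError(
                f"Answer to {question!r} has {word_count} words; "
                f"expected {min_words} to {max_words}."
            )
        entities.append({
            "@type": "Question",
            "name": question,
            "acceptedAnswer": {"@type": "Answer", "text": answer},
        })
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": entities,
    }


# Sample content for illustration only.
sample_answer = (
    "Schema markup is structured data added to a page, usually as JSON-LD, "
    "that describes the content in a machine-readable way. It states what "
    "type of content the page holds, who wrote it, when it was published, "
    "and which organization is behind it, so that search and AI engines do "
    "not have to infer those facts from unstructured text."
)

faq = faq_page_schema([("What is schema markup?", sample_answer)])
print(json.dumps(faq, indent=2))
```

Raising an error for out-of-range answers is a design choice for this sketch; a gentler version could log a warning and still emit the markup.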
3. Article
What it tells AI engines: Who wrote this, when it was published, when it was last updated, and what it's about.
Why it matters: Freshness and authorship are primary trust signals for AI engines. Without Article schema, an AI crawler has to infer the publish date from visible text (unreliable) and the author from byline text (often inconsistent). Schema makes both machine-readable and authoritative.
Key fields: headline, author (linked to a Person schema), datePublished in ISO 8601 format, dateModified (update whenever you update the content), and publisher linked to your Organization schema.
Pro tip: When you update old content, updating dateModified in your schema signals freshness to AI crawlers even if the URL doesn't change. This is one of the lowest-effort ways to maintain citation eligibility for evergreen content.
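Here is a small Python sketch of the key Article fields, with both dates emitted in ISO 8601 format. The function name, the "@id" reference convention for author and publisher, and all URLs and dates are placeholders chosen for the example.

```python
import json
from datetime import datetime, timezone


def article_schema(headline, author_id, publisher_id, published, modified=None):
    """Build Article markup with ISO 8601 dates."""
    modified = modified or published
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        # References to Person and Organization objects defined elsewhere.
        "author": {"@id": author_id},
        "publisher": {"@id": publisher_id},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


# Placeholder values for illustration.
article = article_schema(
    headline="Schema markup for AI search",
    author_id="https://example.com/authors/jane-doe#person",
    publisher_id="https://example.com/#organization",
    published=datetime(2025, 3, 18, 9, 0, tzinfo=timezone.utc),
    modified=datetime(2025, 9, 2, 14, 30, tzinfo=timezone.utc),
)

print(json.dumps(article, indent=2))
```

When the content changes, only the modified argument needs a new value; the URL and datePublished stay the same.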
4. Person (Author Schema)
What it tells AI engines: That the content was written by a real, credentialed human being — not anonymously generated.
Why it matters: Perplexity rarely cites anonymous content. Google's E-E-A-T framework (which feeds directly into AI Overviews) treats author identity and demonstrated expertise as core trust signals. 96% of Google AI Overview citations come from sources with strong E-E-A-T.
Create an author profile page for each contributor. Point the author field of each Article schema at that contributor's Person schema. AI engines can follow these links to verify credibility.
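One common way to express that link is a shared "@id": the Person object declares it and the Article's author field references it. The Python sketch below assumes that convention; the helper names and every value are hypothetical.

```python
import json


def person_schema(name, profile_url, job_title, same_as):
    """Build a Person object identified by the author's profile page."""
    return {
        "@type": "Person",
        "@id": profile_url + "#person",
        "name": name,
        "url": profile_url,
        "jobTitle": job_title,
        "sameAs": list(same_as),
    }


def link_author(article, person):
    """Return a copy of the article whose author points at the person."""
    linked = dict(article)
    linked["author"] = {"@id": person["@id"]}
    return linked


# Placeholder author and article.
author = person_schema(
    name="Jane Doe",
    profile_url="https://example.com/authors/jane-doe",
    job_title="Head of Search",
    same_as=["https://www.linkedin.com/in/jane-doe-example"],
)
post = link_author({"@type": "Article", "headline": "Example post"}, author)

# A single @graph keeps both objects in one JSON-LD block.
graph = {"@context": "https://schema.org", "@graph": [author, post]}
print(json.dumps(graph, indent=2))
```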
5. HowTo
What it tells AI engines: This page contains a step-by-step process for accomplishing a specific goal.
Why it matters: How-to queries are among the highest-volume AI search interactions. HowTo schema marks your page as an unambiguous step-by-step answer. Use it on any page that walks through a defined process, and list each step as a HowToStep object in the step property with a clear name and text.
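A minimal Python sketch of HowTo markup follows, with one HowToStep per step. The function name and the sample steps are invented for illustration.

```python
import json


def how_to_schema(name, steps):
    """Build HowTo markup from (step name, step text) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {
                "@type": "HowToStep",
                "position": position,
                "name": step_name,
                "text": step_text,
            }
            for position, (step_name, step_text) in enumerate(steps, start=1)
        ],
    }


# Sample steps for illustration only.
guide = how_to_schema(
    "How to add JSON-LD to a page",
    [
        ("Write the markup", "Describe the page as a schema.org object."),
        ("Embed it", "Place the JSON-LD script block in the page head."),
        ("Validate it", "Run the page through a structured data validator."),
    ],
)

print(json.dumps(guide, indent=2))
```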
The Critical Warning: Generic Schema Hurts More Than It Helps
A 2026 empirical study of 730 AI citations found something counterintuitive: generic, partially filled schema carried an 18-percentage-point citation penalty compared with having no schema at all.
The reasoning: AI engines interpret incomplete schema as a mismatch between what you claim and what you deliver. A Product schema with only name filled in is a false signal. It reads like markup left behind by an automated tool, not like a real business describing itself properly.
The rule is simple: if you implement a schema type, implement it fully. Every required and recommended field should have a real value. If you don't have the data to fill it in properly, leave that schema type out until you do.
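A completeness check along these lines can run before markup ships. In the Python sketch below, the EXPECTED_FIELDS table is an assumption built from the field lists in this article, not schema.org's or any search engine's official requirements, and the function name is hypothetical.

```python
# Field lists taken from this article's recommendations (an assumption,
# not an official list of required properties).
EXPECTED_FIELDS = {
    "Organization": ["name", "url", "description", "logo", "sameAs", "contactPoint"],
    "Article": ["headline", "author", "datePublished", "dateModified", "publisher"],
}


def missing_fields(schema):
    """Return expected fields that are absent or empty for the schema's type."""
    expected = EXPECTED_FIELDS.get(schema.get("@type"), [])
    return [
        field
        for field in expected
        if schema.get(field) is None or schema.get(field) in ("", [], {})
    ]


# A partially filled object of the kind the warning describes.
partial = {"@type": "Organization", "name": "Example Co", "logo": ""}
gaps = missing_fields(partial)

if gaps:
    print("Leave this schema out until these fields have real values:", gaps)
```

Types that are not in the table return an empty list, so the check only flags what it has been told to expect.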
The Implementation Priority Order
If you're starting from scratch, tackle schema in this order:
1. Organization: on every page. This is the foundation.
2. FAQPage: on your top 5 pages by traffic. Highest citation lift.
3. Article: on every blog post. Always include datePublished, dateModified, and author.
4. Person: create author pages. Link them from Article schema.
5. HowTo: on instructional content. High-value for tutorial-style queries.
Schema Is the Floor, Not the Ceiling
Schema markup tells AI engines what your content is. But AI engines also care about what your content says and whether anyone else confirms it.
A page with flawless schema but weak original content will still lose to a well-written, authoritative piece with basic schema. Schema raises your floor by eliminating the technical barriers to being understood. The ceiling is set by content quality, brand authority, and the trust signals you build over time.
That's why the best GEO strategy combines technical excellence (schema, robots.txt, page speed, llms.txt) with content strategy (original research, answer-first structure, genuine expertise).