GEO: The SEO Specialist's Guide to Ranking in AI Search Engines
Tom Fry

The Search Landscape Has Split in Two
If you've spent the last decade mastering SEO, here's the uncomfortable truth: the rules are changing. Not replacing what you know - but forking.
Traditional search engines still matter. But an increasing share of commercial research now happens through AI answer engines - ChatGPT, Perplexity, Google Gemini, and Claude. When a potential buyer asks "What's the best project management tool for remote teams?", they're increasingly asking an LLM, not typing into Google.
The difference? Google returns a list of links. An LLM returns a direct answer with named recommendations. If your brand isn't in that answer, you don't exist in that conversation.
This is where Generative Engine Optimisation (GEO) comes in. And for SEO specialists, the good news is: most of the foundations you've already built transfer directly. The bad news? There are new signals you're probably ignoring that determine whether AI models can find, parse, and cite your content.
What Is GEO, Exactly?
GEO is the practice of optimising web content so that large language models (LLMs) and AI-powered search engines can reliably:
- Discover your content during retrieval
- Parse it accurately into structured knowledge
- Cite it when generating answers to relevant queries
- Attribute it correctly to your brand or entity
Think of traditional SEO as optimising for a librarian who indexes books by title, author, and subject. GEO is optimising for a reader who needs to understand, quote, and recommend your content in conversation.
How AI Visibility Is Actually Measured
Before diving into tactics, it helps to understand what "visibility" means in the AI context. At Agentcy, we measure AI Visibility using a composite score built from four components:
- Mention Rate (40% weight) - What percentage of relevant prompts result in your brand being mentioned? This is the most important signal. If AI models don't name you, nothing else matters.
- Position Score (25% weight) - When you are mentioned, where do you appear in the response? First recommendation carries a score of 100, second drops to 60, third to 40. This follows a CTR-decay curve similar to traditional SERP position value.
- Citation Score (25% weight) - Are AI models linking to your actual website? This measures whether your domain appears in source citations, which is the equivalent of earning a backlink in traditional SEO.
- Share of Voice (10% weight) - Of all brand mentions across all responses, what percentage are yours? This contextualises your performance relative to the total competitive landscape.
This formula makes one thing clear: getting mentioned is the game. But getting cited - with an actual link to your content - is what drives traffic.
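To make the weighting concrete, here is a minimal sketch of the composite calculation in Python. The weights and the first three position values come from the breakdown above. The function names, the assumption that each component is already expressed on a 0-100 scale, and the choice to score ranks beyond third as 0 are illustrative assumptions, not a description of Agentcy's implementation.

```python
# Weights as stated in the component list above.
WEIGHTS = {
    "mention_rate": 0.40,
    "position": 0.25,
    "citation": 0.25,
    "share_of_voice": 0.10,
}

# Position values given in the text. Ranks beyond third are not specified,
# so scoring them as 0 is an assumption made for this sketch.
POSITION_VALUES = {1: 100.0, 2: 60.0, 3: 40.0}


def position_score(rank):
    """Map a 1-based rank in an AI answer to a 0-100 score (None = not mentioned)."""
    if rank is None:
        return 0.0
    return POSITION_VALUES.get(rank, 0.0)


def visibility_score(mention_rate, position, citation, share_of_voice):
    """Combine four component scores (each assumed to be 0-100) into one weighted score."""
    components = {
        "mention_rate": mention_rate,
        "position": position,
        "citation": citation,
        "share_of_voice": share_of_voice,
    }
    return sum(WEIGHTS[name] * value for name, value in components.items())


# Example: mentioned in half of prompts, usually second, rarely linked.
example = visibility_score(mention_rate=50, position=position_score(2), citation=20, share_of_voice=10)
```

With these inputs the mention component contributes 20 points, position 15, citation 5, and share of voice 1, which shows how quickly a low citation score caps the total.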
Technical Foundations: Can AI Models Access Your Content?
Before worrying about content quality, you need to ensure AI systems can actually reach, render, and extract your pages. These are the foundational factors - get any of them wrong and nothing else matters.
URL Structure
AI retrieval systems use URL tokens as signals for topical inference, clustering, and disambiguation. A URL like /docs/payments/fraud-prevention/ pre-explains itself before the model reads a single word of content. A URL like /page/12345/ provides zero signal.
Action: Ensure URLs are descriptive, hierarchical, and topic-revealing. Avoid opaque numeric IDs in public-facing URLs.
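As a rough illustration of that check, the Python sketch below flags path segments that carry no topical signal. The function name and the definition of "opaque" (purely numeric segments or long hexadecimal strings) are assumptions chosen for the example. A real audit would likely need a broader rule set.

```python
import re
from urllib.parse import urlparse

# Assumed definition of "opaque": all digits, or a hex string of 16+ characters.
OPAQUE_SEGMENT = re.compile(r"\d+|[0-9a-f]{16,}")


def opaque_segments(url):
    """Return the path segments of a URL that look like opaque IDs."""
    path = urlparse(url).path
    segments = [segment for segment in path.split("/") if segment]
    return [segment for segment in segments if OPAQUE_SEGMENT.fullmatch(segment)]


descriptive = opaque_segments("https://example.com/docs/payments/fraud-prevention/")
numeric = opaque_segments("https://example.com/page/12345/")
```

Here `descriptive` is empty, while `numeric` contains the `12345` segment from the second URL.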
Semantic HTML Structure
LLMs reconstruct meaning from structure. This is where many modern websites fail spectacularly - they look great visually but are structurally meaningless to a machine reader.
Requirements:
- Exactly one <h1> per page describing the primary topic
- Logical heading hierarchy (h1 → h2 → h3)
- Primary content wrapped in <main>
- Semantic elements (<article>, <section>, <header>, <footer>, <nav>)
- Lists with <ul>/<ol>/<li>, tables with <table>/<thead>/<tbody>
Div-soup layouts make feature extraction unreliable. If your comparison table is a grid of styled divs, AI models may not recognise it as a comparison at all.
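Several of these requirements can be checked mechanically. The sketch below uses Python's standard-library HTML parser to test three of them: exactly one <h1>, a <main> element, and no skipped heading levels. The class and function names are hypothetical, and the checks are a simplified approximation of what a full audit would cover.

```python
from html.parser import HTMLParser


class StructureAudit(HTMLParser):
    """Collect heading levels and note whether a <main> element is present."""

    def __init__(self):
        super().__init__()
        self.h1_count = 0
        self.has_main = False
        self.heading_levels = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.has_main = True
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            self.heading_levels.append(level)
            if level == 1:
                self.h1_count += 1


def audit_structure(html):
    """Return a list of human-readable structural issues found in the HTML."""
    parser = StructureAudit()
    parser.feed(html)
    parser.close()
    issues = []
    if parser.h1_count != 1:
        issues.append(f"expected exactly one <h1>, found {parser.h1_count}")
    if not parser.has_main:
        issues.append("no <main> element")
    # A heading may go deeper by at most one level at a time (h1 -> h2 -> h3).
    for previous, current in zip(parser.heading_levels, parser.heading_levels[1:]):
        if current > previous + 1:
            issues.append(f"heading level jumps from h{previous} to h{current}")
    return issues


clean = audit_structure("<main><h1>A</h1><h2>B</h2><h3>C</h3></main>")
messy = audit_structure("<div><h1>A</h1><h1>B</h1><h3>C</h3></div>")
```

The first page passes with no issues. The second reports a duplicate <h1>, a missing <main>, and a jump from h1 to h3.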
Boilerplate & Noise Control
A high boilerplate-to-content ratio can cause extraction to grab the wrong text. If your cookie banner, navigation menu, and footer collectively contain more text than your main content, AI models may struggle to identify what actually matters.
Key rule: Main content must dominate <main>. Key definitions must not be hidden behind JavaScript-only tabs or accordions.
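One way to put a number on this is to compare the text inside <main> with the text on the whole page. The following Python sketch does that with the standard-library parser. The function name is hypothetical, and measuring by non-whitespace character count is an assumption. No threshold is implied by the source, so the sketch only reports the ratio.

```python
from html.parser import HTMLParser


class MainTextRatio(HTMLParser):
    """Count visible text characters inside <main> versus the whole page."""

    def __init__(self):
        super().__init__()
        self.main_depth = 0
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.main_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.main_depth += 1
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "main" and self.main_depth:
            self.main_depth -= 1
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            return
        count = len("".join(data.split()))  # ignore whitespace
        self.total_chars += count
        if self.main_depth:
            self.main_chars += count


def main_content_ratio(html):
    """Return the share of page text (0.0-1.0) that sits inside <main>."""
    parser = MainTextRatio()
    parser.feed(html)
    parser.close()
    if parser.total_chars == 0:
        return 0.0
    return parser.main_chars / parser.total_chars


ratio = main_content_ratio("<nav>menu</nav><main>abcdefgh</main><footer>foot</footer>")
```

In this toy page, half of the text sits inside <main>, so `ratio` is 0.5. Note that the sketch only sees the HTML it is given, so content injected by JavaScript after load is not counted.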
Machine-Readable Facts
Many AI systems do not reliably parse PDFs or images. If your pricing table is an embedded image, your product specs are in a PDF, or your comparison data lives in an infographic, AI models may not be able to read it.
HTML-first approach: Core facts must exist in HTML text. Use tables and lists for specs, comparisons, and structured data.
Technical Hygiene
Standard technical SEO applies directly:
- sitemap.xml with canonical URLs
- No duplicate content without canonical resolution
- No infinite crawl spaces from faceted navigation
- Reliable performance and fast response times
Robots & Snippet Control
This is the most overlooked blocker. A page can be public but effectively invisible to AI:
- noindex prevents discovery
- nosnippet prevents quoting
- max-snippet:0 suppresses reuse
- Aggressive bot-blocking (403/429 patterns) prevents crawling
Audit your robots directives. We regularly find companies that have accidentally blocked AI crawlers from their highest-value pages.
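A small helper can make that audit repeatable. The Python sketch below takes the content of a robots meta tag (or a plain X-Robots-Tag header value) and returns the directives that limit discovery or reuse. The function name is hypothetical, and the sketch assumes a simple comma-separated value. It does not handle header values prefixed with a specific user agent, and it does not detect 403/429 bot-blocking, which needs live requests.

```python
# Directives named above, plus "none", which search engines treat as
# shorthand for "noindex, nofollow".
BLOCKING = ("noindex", "nosnippet", "none", "max-snippet:0")


def blocking_directives(robots_value):
    """Return the directives in a robots value that restrict indexing or quoting."""
    found = []
    for raw in robots_value.lower().split(","):
        directive = raw.strip().replace(" ", "")  # tolerate "max-snippet: 0"
        if directive in BLOCKING:
            found.append(directive)
    return found


open_page = blocking_directives("index, follow")
blocked_page = blocking_directives("noindex, nosnippet, max-snippet: 0")
```

`open_page` comes back empty, while `blocked_page` lists all three restrictive directives in the order they appear.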
Content & Entity Signals: Can AI Models Understand You?
Once your content is accessible, the next challenge is making it understandable. AI models need to confidently identify what you are, what you do, and why you're relevant - before they can recommend you.
Entity Clarity
This is arguably the most important GEO factor and the one SEO specialists most often overlook. AI systems must ground your entity before they can reason about it.
Near the top of your main content, you must clearly answer three questions:
- What is this company/product/service?
- Who is it for?
- What problem does it solve?
Write in plain, unambiguous language suitable for direct quotation. If an AI model can't extract a clean one-sentence definition of what you do, it can't confidently recommend you.
Content Chunking & Answer Shape
Retrieval-augmented generation (RAG) systems don't read your entire page. They extract chunks. Your content needs to be structured so that individual sections are self-contained and answerable.
Include clearly labelled sections:
- TL;DR / Summary
- What it is / How it works
- Key features / capabilities
- Use cases
- FAQs
Use question-oriented headings that mirror real user queries: "How does X work?" rather than "Our Approach". Question-shaped headings act as retrieval anchors and align directly to AI search queries.
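To see a page roughly the way a chunk-based retriever might, the Python sketch below splits HTML into heading-plus-body chunks at each <h2> or <h3> and marks whether the heading is phrased as a question. This is an illustrative assumption about chunking, not a description of how any particular RAG system works, and the names are hypothetical.

```python
from html.parser import HTMLParser


class HeadingChunker(HTMLParser):
    """Start a new chunk at every <h2>/<h3> and collect the text that follows."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self._in_heading = True
            self.chunks.append({"heading": "", "body": ""})

    def handle_endtag(self, tag):
        if tag in ("h2", "h3"):
            self._in_heading = False

    def handle_data(self, data):
        if not self.chunks:
            return  # text before the first heading is ignored in this sketch
        key = "heading" if self._in_heading else "body"
        self.chunks[-1][key] += data


def chunk_page(html):
    """Return chunks with whitespace-normalised text and a question-shape flag."""
    parser = HeadingChunker()
    parser.feed(html)
    parser.close()
    result = []
    for chunk in parser.chunks:
        heading = " ".join(chunk["heading"].split())
        result.append({
            "heading": heading,
            "body": " ".join(chunk["body"].split()),
            "question_shaped": heading.endswith("?"),
        })
    return result


chunks = chunk_page(
    "<h2>How does X work?</h2><p>X does Y.</p>"
    "<h2>Our Approach</h2><p>We care.</p>"
)
```

Reading each chunk's body on its own is a quick test of whether the section is self-contained. If it only makes sense with the surrounding page, it is unlikely to make sense to a retriever either.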
Authorship & Freshness
AI systems prefer current information and identifiable authors. This is a trust signal - content with clear attribution is safer to quote.
- Identify a real author or accountable organisation
- Include author pages with role, bio, and credentials
- Display "Last updated" dates where facts can change
- Include dateModified in structured data
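As an illustration of the last two points, the Python sketch below builds Article markup with an author and both date fields. The property names are standard schema.org vocabulary. The function name, headline, author name, and dates are placeholders invented for the example.

```python
import json
from datetime import date


def article_jsonld(headline, author_name, published, modified):
    """Build schema.org Article markup with authorship and freshness fields."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


# Placeholder values for illustration only.
article = article_jsonld(
    "What is cloud cost management?",
    "Jane Doe",
    published=date(2024, 1, 5),
    modified=date(2024, 6, 1),
)
markup = json.dumps(article, indent=2)
```

The resulting JSON goes inside a script element of type application/ld+json. Keeping dateModified in step with the visible "Last updated" date avoids sending conflicting freshness signals.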
Trust & Provenance
For pages that present factual claims - stats, benchmarks, compliance information:
- Provide a methodology or sources/references section
- Show editorial ownership (author, organisation, contact)
- Include accountability signals
AI models are increasingly cautious about unverified claims. Content with provenance signals gets cited more confidently.
Structured Data & Identity: Can AI Models Identify You?
This is where GEO most significantly diverges from traditional SEO. Getting identified correctly - and distinguished from similarly-named entities - is critical for consistent AI citations.
Metadata & Canonical Signals
Familiar territory for SEO specialists, but worth reinforcing: metadata often influences first-pass summaries and classification in AI systems.
- Unique, descriptive <title> aligned to page topic
- Accurate <meta name="description"> in plain language
- Correct <link rel="canonical">
- Open Graph and Twitter Card metadata for cross-platform consistency
Structured Data (JSON-LD)
In traditional SEO, structured data helps with rich snippets. In GEO, it's how AI models resolve your identity.
Priority schema types:
- Organization/Corporation - with name, url, logo, and sameAs (linking to official profiles)
- Product/SoftwareApplication - with name, description, offers, applicationCategory
- Article/BlogPosting - with author, datePublished, dateModified
- FAQPage - makes FAQ content directly usable in AI answers
- HowTo - structured step-by-step content
Our audits consistently find that pages with comprehensive JSON-LD are cited more frequently by AI models than equivalent content without it.
Entity Disambiguation
If your company shares a name with anything else (a common word, another company, a product), disambiguation is critical:
- Stable @id in JSON-LD for your Organization/Product entities
- sameAs linking to official profiles and authoritative IDs (Wikipedia, Wikidata, LinkedIn, Crunchbase)
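Both signals fit in a single block of markup. The Python sketch below generates it with a stable @id and a sameAs list. The property names are standard schema.org vocabulary, and the type is spelled Organization because that is how schema.org names it. The "/#organization" fragment is a common convention rather than a requirement, and the company name and URLs are placeholders.

```python
import json


def organization_jsonld(name, site_url, logo_url, same_as):
    """Build schema.org Organization markup with a stable @id and sameAs links."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        # Assumed convention: one fixed identifier derived from the homepage URL.
        "@id": site_url.rstrip("/") + "/#organization",
        "name": name,
        "url": site_url,
        "logo": logo_url,
        "sameAs": list(same_as),
    }


def as_script_tag(data):
    """Wrap JSON-LD data in the script element used to embed it in a page."""
    return '<script type="application/ld+json">' + json.dumps(data, indent=2) + "</script>"


# Placeholder entity for illustration only.
org = organization_jsonld(
    "Example Co",
    "https://example.com/",
    "https://example.com/logo.png",
    ["https://www.linkedin.com/company/example-co", "https://www.wikidata.org/wiki/Q0"],
)
tag = as_script_tag(org)
```

The important property of the @id is that it never changes. Other markup on the site (Product, Article) can then reference the same identifier instead of redefining the organisation each time.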
Internationalisation
AI systems can mix regions and languages, citing the wrong version of your content. Strong language signals prevent this:
- Correct <html lang="..."> attribute
- hreflang alternates for multi-locale sites
- Locale-correct canonicals
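For multi-locale sites, the alternates are easiest to keep consistent when they are generated from one mapping. The Python sketch below builds the link tags from a dictionary of locale codes to URLs and adds an x-default entry. The function name and the example URLs are hypothetical, and the sketch assumes the URLs need no HTML escaping.

```python
def hreflang_links(locale_urls, default_locale):
    """Build <link rel="alternate"> tags for each locale plus an x-default entry."""
    lines = [
        f'<link rel="alternate" hreflang="{locale}" href="{url}">'
        for locale, url in sorted(locale_urls.items())
    ]
    default_url = locale_urls[default_locale]
    lines.append(f'<link rel="alternate" hreflang="x-default" href="{default_url}">')
    return lines


links = hreflang_links(
    {"en-gb": "https://example.com/uk/", "de-de": "https://example.com/de/"},
    default_locale="en-gb",
)
```

Every locale version of the page should emit the same full set of tags, including a reference to itself, so that the alternates agree with each other.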
Site Architecture & Citability: Can AI Models Recommend You?
The final layer is about making your content easy to reference, link to, and surface through topical authority signals.
Internal Linking & Site Graph
Internal links create a topic graph that signals topical authority. This works similarly to traditional SEO, but with one key difference: AI models care about topical coherence, not just link equity.
- Use descriptive anchor text (never "click here")
- Build topical hub pages that link to supporting content
- Implement breadcrumbs with schema markup
Anchors & Citability
AI systems cite sections, not just pages. If your headings don't have stable id anchors, models can't deep-link to the specific content they're referencing.
Add stable IDs to key headings. This is a quick win that most sites don't implement.
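A minimal sketch of generating those IDs in Python follows. The function name and the slug rules (ASCII-only, lowercase, hyphen-separated) are assumptions made for the example.

```python
import re
import unicodedata


def heading_id(text):
    """Turn heading text into a lowercase, hyphen-separated id value."""
    # Strip accents and drop any characters that have no ASCII equivalent.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


question_id = heading_id("How does X work?")
category_id = heading_id("Entity Clarity & Definition")
```

These produce `how-does-x-work` and `entity-clarity-definition`. For the ID to stay stable, generate it once and store it with the content rather than recomputing it whenever the heading wording changes.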
AI-Optimised Signals
- RSS/Atom feeds - Easy to ingest, keeps content updates discoverable
- llms.txt - An emerging (not yet standardised) file at your domain root that lists authoritative content sections for AI agents
Anti-Patterns That Hurt AI Visibility
These will actively work against you:
- Schema spam - FAQ markup without real FAQs, review markup without real reviews
- Hidden text / cloaking - Major mismatches between source HTML and visible content
- Doorway pages - Near-duplicate pages targeting keyword variants
- Facts only in PDFs/images - No HTML equivalent for key information
- Misleading metadata - Title and description that don't reflect actual content
How AI Models Differ (And Why It Matters)
One of the less obvious aspects of GEO is that different AI models weight signals differently. Our analysis across ChatGPT, Perplexity, Gemini, and Claude reveals meaningful variation:
- Perplexity aggressively cites sources and tends to favour content with clear structure and explicit citations. If you optimise for citability, Perplexity is often the first to reward it.
- ChatGPT draws from a broader knowledge base and weighs brand authority heavily. Established brands with consistent mentions across authoritative sources tend to perform well.
- Gemini has strong integration with Google's index and tends to surface content that performs well in traditional search. Your existing SEO work transfers most directly here.
- Claude is more cautious about citations and tends to favour content with clear provenance and factual grounding.
This means a model-specific strategy matters. If you're performing well on Gemini but poorly on Perplexity, the fix is likely structural (better chunking, clearer sections, more explicit citations) rather than content-level.
The GEO Audit Scoring Framework
When we audit pages for GEO compliance, we score them on a weighted scale from 0 to 100 across these categories:
| Category | Weight | What It Covers |
|---|---|---|
| Accessibility & Rendering | 20% | Can AI models access and render the content? |
| Entity Clarity & Definition | 20% | Is the entity clearly defined and grounded? |
| Extractable Structure & Answer Shape | 15% | Is content structured for extraction? |
| Structured Data & Entity IDs | 15% | JSON-LD, schema types, disambiguation |
| Crawl/Index Hygiene | 15% | Sitemaps, canonicals, no duplicates |
| Internal Linking & Site Graph | 10% | Topical hubs, descriptive anchors |
| Trust/Provenance & Freshness | 5% | Authorship, dates, methodology |
Critically, the framework includes blockers that override the score entirely:
- Key page is noindex
- Primary content requires JavaScript to render
- Canonical points to wrong page
- Content is inaccessible due to bot-blocking
A page can score 90/100 on content quality but still be invisible if it has a single critical blocker.
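The weighting and the blocker override can be sketched in a few lines of Python. The category weights come from the table above. The dictionary keys, the function name, and the decision to report blockers as a separate flag next to the weighted score are assumptions for illustration, since the exact way blockers override the score is not specified here.

```python
# Weights from the audit table above (they sum to 1.0).
CATEGORY_WEIGHTS = {
    "accessibility_rendering": 0.20,
    "entity_clarity": 0.20,
    "extractable_structure": 0.15,
    "structured_data": 0.15,
    "crawl_index_hygiene": 0.15,
    "internal_linking": 0.10,
    "trust_freshness": 0.05,
}


def geo_audit_score(category_scores, blockers=()):
    """Combine 0-100 category scores into a weighted total and flag any blockers."""
    weighted = sum(
        weight * category_scores[name] for name, weight in CATEGORY_WEIGHTS.items()
    )
    return {
        "score": round(weighted, 1),
        # Assumption: a blocked page is reported as blocked regardless of score.
        "blocked": len(blockers) > 0,
        "blockers": list(blockers),
    }


strong_but_blocked = geo_audit_score(
    {name: 90 for name in CATEGORY_WEIGHTS},
    blockers=["key page is noindex"],
)
```

In this example the page scores 90.0 on the weighted scale but is still reported as blocked, which mirrors the point that a single critical blocker outweighs everything else.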
Quick Wins for SEO Specialists
If you're an SEO specialist looking to start with GEO, here are the highest-impact actions ranked by effort-to-impact ratio:
- Audit robots directives - Check every high-value page for noindex, nosnippet, and bot-blocking. This is the single most common blocker we find.
- Add entity definitions - Put a clear, quotable one-sentence definition of what your company/product does near the top of key pages.
- Implement JSON-LD schema - Organization on homepage, Product on product pages, Article on blog posts. This alone can meaningfully improve AI parsing.
- Add heading IDs - Give stable id attributes to key headings for deep-linking and citability.
- Structure content as Q&A - Rewrite section headings as questions that match how people actually ask AI assistants.
- Wrap content in <main> - If your primary content isn't in a <main> element, add one. This is a quick fix that significantly improves extraction.
- Convert image-only data to HTML - Any pricing tables, comparison charts, or feature lists that exist only as images need HTML equivalents.
The Buyer Journey Dimension
Our analysis consistently shows that AI visibility varies dramatically across the buyer journey. Most companies have reasonable visibility at the decision stage (when people ask "Is [Brand] good for X?") but are nearly invisible at the awareness stage (when people ask "What are the best tools for X?").
This matters because awareness-stage queries are where recommendations are formed. By the time someone is asking about your brand by name, the AI already has an opinion about you. The question is whether it had your content available when it was forming that opinion.
The fix: Create comprehensive, authoritative content that answers category-level questions, not just brand-level ones. "What is cloud cost management?" matters more than "Why choose [Brand] for cloud cost management?" in the awareness stage.
What GEO Means for Your SEO Strategy
GEO doesn't replace SEO. It extends it. The core competencies - technical auditing, content strategy, structured data, link building - all transfer. But the emphasis shifts:
| SEO Focus | GEO Extension |
|---|---|
| Rank for keywords | Be mentioned in AI answers |
| Earn backlinks | Earn AI citations |
| Rich snippets | Entity resolution & disambiguation |
| Page speed | Content accessibility & parseability |
| Content for humans | Content for humans and machines |
| Keyword research | Prompt research (what people ask AI) |
| SERP features | AI answer positioning |
The companies that will win in AI search are the ones that make their content easy to find, easy to parse, easy to quote, and easy to attribute. That's GEO in a sentence.
Getting Started
The best place to start is with data. Run an AI Visibility audit on your brand to understand your current baseline - how often AI models mention you, where you rank, which competitors are ahead of you, and which of your pages are actually being cited.
From there, the sections above give you a structured framework for improvement. Start with the technical foundations (robots, accessibility), move to entity clarity and structured data, then work through content structure and citability.
The SEO specialists who add GEO to their toolkit now will have a significant advantage. The field is early, the competition is thin, and the fundamentals are things you already understand. The question isn't whether to start - it's how fast you can move.