How to Make Your Website Discoverable by AI: A Practical Guide

Your next client will be an AI agent working for a human

AI tools like Claude, ChatGPT, Gemini, and Perplexity are increasingly how people discover books, products, services, and professionals. If your site isn’t set up for AI discoverability, you’re invisible to a growing share of the ways people find things online.

This guide covers exactly what I did to make my own site AI-ready — including the llms.txt file, structured data markup, crawler settings, and a few things most guides skip entirely.

What AI Discoverability Actually Means

Traditional SEO is about ranking in search engines. AI discoverability is different — it’s about making sure that when someone asks an AI assistant a question your site should answer, the AI has the information it needs to include you in the response with accurate details and working links.

The gap between the two matters. A site can rank well on Google and still be invisible to AI tools if the content isn’t structured in a way AIs can parse cleanly.

Step 1: Create an llms.txt File

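The llms.txt file is the first thing I’d add right now. It is a proposed convention rather than an adopted standard, so not every AI tool is confirmed to read it, but it is cheap to publish. Think of it like robots.txt, except that instead of telling crawlers where not to go, it tells AI systems exactly who you are, what you make, and where to send people.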

It lives at the root of your site: yourdomain.com/llms.txt

The format is plain text written in Markdown. Keep it focused and factual — not every blog post from the last decade, just the essential information about you, your work, and direct links to take action.

A good llms.txt includes:

  • A one-paragraph summary of who you are and what you do
  • Your key products, books, or services with descriptions and buy/signup links
  • Your newsletter or contact page
  • Your professional background if relevant
  • Links to your author or social profiles

Here’s a simplified example of the structure:

# Your Name

> One paragraph summary of who you are and what you do.

## Books

**Book Title** — Description. Buy: [URL]

## Newsletter

Sign up at: [URL]

## Professional Background

Brief summary with links to relevant pages.

Keep it concise. AIs read this directly, so clear prose with working URLs is what matters — not SEO keyword stuffing.
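If you would rather generate the file than hand-edit it, the structure above can be sketched in a few lines of Python. This is a minimal sketch, not an official tool: the function name render_llms_txt is my own, and every name, title, and URL in it is a placeholder.

```python
# Minimal sketch: render an llms.txt body from a small dictionary.
# All names, titles, and URLs below are placeholders, not real data.

def render_llms_txt(name, summary, sections):
    """Return llms.txt text: an H1 title, a blockquote summary, then H2 sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, entries in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(entries)
        lines.append("")
    # Collapse trailing blank lines down to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


llms_txt = render_llms_txt(
    name="Your Name",
    summary="One paragraph summary of who you are and what you do.",
    sections={
        "Books": ["**Book Title**: Description. Buy: https://example.com/book"],
        "Newsletter": ["Sign up at: https://example.com/newsletter"],
    },
)
print(llms_txt)
```

Write the returned string to a file named llms.txt and upload it to the root of your site.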

Step 2: If You Use All in One SEO — Disable Its Default llms.txt

All in One SEO Pro automatically generates an llms.txt file, which sounds helpful until you see what it actually produces: a wall of every blog post going back years, with no context about who you are, what you sell, or how to contact you. It buries your most important information under hundreds of links to posts about things that have nothing to do with your core identity.

Turn it off. Go to All in One SEO → Sitemaps → LLMs.txt and toggle the feature off. Then upload your own custom file via your hosting file manager or FTP to the root of your site (public_html/llms.txt).

If AIOSEO’s version is still active, it will override any file you upload — so disabling it first is essential.
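To confirm which version is actually being served, you can fetch the live file and look at its shape. The sketch below is a rough heuristic under my own assumptions: a hand-written file should open with a "# " title and a ">" summary and contain a modest number of links, while an auto-generated one tends to be a long list of post URLs. The function names are hypothetical.

```python
import re
from urllib.request import urlopen


def fetch_text(url, timeout=10):
    """Download a URL and return its body as text (makes a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def describe_llms_txt(text):
    """Summarize the shape of an llms.txt body so you can tell whose file it is."""
    lines = text.splitlines()
    first = next((ln.strip() for ln in lines if ln.strip()), "")
    return {
        # Title from a leading "# Heading" line, or None if the file lacks one.
        "title": first[2:].strip() if first.startswith("# ") else None,
        # True if any line is a Markdown blockquote (the summary paragraph).
        "has_summary": any(ln.lstrip().startswith(">") for ln in lines),
        # Rough count of links; hundreds usually means a generated post list.
        "link_count": len(re.findall(r"https?://", text)),
    }


example = "# Your Name\n\n> Short summary.\n\nBuy: https://example.com/book\n"
print(describe_llms_txt(example))

# Against a live site (placeholder domain):
# print(describe_llms_txt(fetch_text("https://yourdomain.com/llms.txt")))
```

If the title is missing or the link count is in the hundreds, the plugin's version is probably still winning.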

Step 3: Add Book (or Product) JSON-LD Schema to Your Key Pages

Structured data is how you communicate machine-readable facts to both search engines and AI systems. For authors, the most important schema type is Book or BookSeries. For businesses, it might be Product, Service, or Organization.

Add a JSON-LD script block to each relevant page. In WordPress, add a Custom HTML block in the page editor and paste the code in.

Here’s a basic Book schema example:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Book",
  "name": "Your Book Title",
  "author": {
    "@type": "Person",
    "name": "Your Name",
    "url": "https://yoursite.com/about/"
  },
  "description": "A plain-language description of the book.",
  "genre": ["Science Fiction"],
  "offers": {
    "@type": "Offer",
    "url": "https://amazon.com/your-book-link",
    "availability": "https://schema.org/InStock"
  }
}
</script>

For a book series, use BookSeries as the type and nest individual Book entries inside a hasPart array. For a publisher or organization page, use the Organization type instead.
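For a series with several titles, it can be easier to generate the markup than to type it. Here is a minimal sketch that builds a BookSeries block with a hasPart array and wraps it in the script tag. The helper names and all titles and URLs are placeholders I made up for illustration; position is a standard schema.org property, but whether any given AI tool uses it is not something I can confirm.

```python
import json


def book_series_jsonld(series_name, author_name, author_url, books):
    """Build a schema.org BookSeries dict; books is a list of (title, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BookSeries",
        "name": series_name,
        "author": {"@type": "Person", "name": author_name, "url": author_url},
        "hasPart": [
            {"@type": "Book", "name": title, "url": url, "position": i}
            for i, (title, url) in enumerate(books, start=1)
        ],
    }


def as_script_tag(data):
    """Wrap the data in a JSON-LD script tag ready for a Custom HTML block."""
    body = json.dumps(data, indent=2)
    # Escape "</" so a stray "</script>" in a title cannot end the tag early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


series = book_series_jsonld(
    "Your Series Name",
    "Your Name",
    "https://yoursite.com/about/",
    [
        ("Book One", "https://yoursite.com/book-one/"),
        ("Book Two", "https://yoursite.com/book-two/"),
    ],
)
print(as_script_tag(series))
```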

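After adding schema to any page, check it at search.google.com/test/rich-results. Paste your page URL and look for detected items with no errors. That tool only reports schema types Google uses for rich results, so a valid Book block may not show up there; if it doesn’t, run the page through the Schema Markup Validator at validator.schema.org, which checks any schema.org type. If the markup passes, Google and AI tools that draw from Google’s index can read it.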
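A quick local check can also catch plain JSON mistakes, such as a trailing comma, before you reach for an online validator. The sketch below pulls every JSON-LD block out of a page's HTML and reports which ones fail to parse. It only checks JSON syntax, not whether the properties are valid schema.org, and the class and function names are my own.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD script tag in a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []   # successfully parsed JSON-LD objects
        self.errors = []  # parse error messages for broken blocks

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))


def extract_jsonld(html):
    """Return (items, errors) for all JSON-LD blocks found in the HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items, parser.errors


page = """
<html><body>
<script type="application/ld+json">{"@type": "Book", "name": "Your Book Title"}</script>
<script type="application/ld+json">{"@type": "Book",}</script>
</body></html>
"""
items, errors = extract_jsonld(page)
print(len(items), "valid block(s),", len(errors), "broken block(s)")
```

Save your page's source to a file, read it into a string, and pass it to extract_jsonld to run the same check on a real page.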

Step 4: Check Your robots.txt for AI Crawler Blocks

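Some security plugins and hosting configurations accidentally block AI crawlers. Check your robots.txt at yourdomain.com/robots.txt and make sure none of the following bots is disallowed. A bot with no matching rule is allowed by default, but explicit Allow rules like these make the intent unambiguous: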

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

If you use Wordfence or another security plugin, check its firewall settings too — some configurations block bots that don’t look like standard browsers, which catches AI crawlers in the net.
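You can test a robots.txt file against these bot names with Python's standard-library parser instead of reading the rules by eye. This is a sketch under one assumption worth stating: Python's matching logic may differ in edge cases from what each crawler actually implements, so treat the result as a sanity check. The function name and the sample file are mine.

```python
from urllib.robotparser import RobotFileParser

AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "Google-Extended", "PerplexityBot"]


def blocked_agents(robots_txt, url="https://yourdomain.com/", agents=AI_USER_AGENTS):
    """Return the user agents that this robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Sample file (made up): GPTBot is blocked, everyone else is allowed.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_agents(sample))  # ['GPTBot']
```

Paste the contents of your own robots.txt in place of the sample. An empty list means none of the four bots is blocked at the robots.txt level; firewall rules are a separate check.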

Step 5: Submit Your Sitemap to Bing and Google

Most people already know this step, but it’s worth including because search indexes feed directly into AI products.

  • Google Search Console (search.google.com/search-console) → feeds Gemini
  • Bing Webmaster Tools (bing.com/webmasters) → feeds Microsoft Copilot

Submit your sitemap URL (usually yourdomain.com/sitemap.xml) in both. Request indexing for your most important pages — your book pages, about page, and newsletter signup — so they get prioritized in the next crawl.
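Before submitting, it helps to confirm that your most important pages are actually listed in the sitemap. The sketch below parses sitemap XML and reports any key URLs that are missing. The URLs are placeholders and the function names are my own. Note that many WordPress setups serve a sitemap index that points to child sitemaps; in that case run the check against the child sitemap that should contain your pages.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap (or sitemap index)."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text}


def missing_pages(xml_text, important_urls):
    """Return the important URLs that do not appear in the sitemap."""
    listed = sitemap_urls(xml_text)
    return [url for url in important_urls if url not in listed]


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yourdomain.com/about/</loc></url>
  <url><loc>https://yourdomain.com/books/</loc></url>
</urlset>"""

important = [
    "https://yourdomain.com/about/",
    "https://yourdomain.com/books/",
    "https://yourdomain.com/newsletter/",
]
print(missing_pages(sample, important))  # ['https://yourdomain.com/newsletter/']
```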

Step 6: Clear Your Cache After Every Change

This one trips people up. After uploading your llms.txt, adding JSON-LD schema to a page, or making any other change intended for AI crawlers, you need to clear your cache or the old version of the page will keep getting served.

If you use Cloudflare: Dashboard → your domain → Caching → Purge Everything (or purge specific URLs).

If you use WP Rocket, W3 Total Cache, or another WordPress caching plugin: go to the plugin settings and flush the cache after saving any page with new markup.

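If you use Wordfence: recent versions no longer include a page cache, so there is usually nothing to purge. If you are on an old install that still shows caching options, clear the cache from the Wordfence dashboard.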

The reason this matters specifically for AI readiness: AI crawlers and fetch tools often get served cached responses. If you upload a new llms.txt but Cloudflare is still caching the old one, crawlers will keep seeing the old version for as long as the cache TTL holds — which can be 24 hours or more.

Always purge after changes and verify in an incognito window or on a device not on your home network to confirm what the outside world is actually seeing.
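Response headers are another way to tell whether you are looking at a cached copy. The sketch below summarizes three of them: CF-Cache-Status (which Cloudflare sets to values such as HIT or MISS), Age, and the max-age directive in Cache-Control. The function name is my own, and other CDNs and caching plugins use different headers, so treat this as a starting point.

```python
def cache_summary(headers):
    """Summarize cache-related response headers from a dict of header names to values."""
    normalized = {key.lower(): value for key, value in headers.items()}

    # Pull max-age (in seconds) out of a header like "public, max-age=14400".
    max_age = None
    for directive in normalized.get("cache-control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            max_age = int(value)

    age = normalized.get("age", "")
    return {
        "cf_cache_status": normalized.get("cf-cache-status"),
        "age_seconds": int(age) if age.isdigit() else None,
        "max_age_seconds": max_age,
    }


# Example headers (made up) from a response served out of Cloudflare's cache.
example = {"CF-Cache-Status": "HIT", "Age": "120", "Cache-Control": "public, max-age=14400"}
print(cache_summary(example))

# Against a live URL (placeholder domain, makes a network request):
# from urllib.request import urlopen
# with urlopen("https://yourdomain.com/llms.txt") as response:
#     print(cache_summary(dict(response.headers.items())))
```

A status of HIT with a nonzero age right after you uploaded a change suggests the purge has not taken effect yet.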

What You Can’t Control (and What to Focus on Instead)

It’s worth being honest about the limits here. The base models behind Claude, ChatGPT, and Gemini learn from training data with cutoff dates, not live web crawls. Unless the assistant runs a live web search for the question, changes you make today won’t appear in its responses until the next training cycle, which can be months away.

What is real-time or near-real-time:

  • Perplexity — crawls live, reflects changes quickly
  • Microsoft Copilot — draws heavily from Bing’s live index
  • Google’s AI Overviews — uses Google’s live index

For the longer-cycle LLMs, the work you do now is infrastructure investment. Models trained after crawlers have picked up your changes will have accurate information about you. That’s the correct frame for this work: it is not a quick traffic win, but a way to make sure that when the next training cycle happens, you’re in it correctly.

In the meantime, the most direct lever for AI book or product recommendations right now is reviews. Review count and rating on Amazon and Goodreads heavily influence what AI tools say when someone asks for recommendations in your genre today, before any of the infrastructure changes above have had time to propagate.

Quick Reference Checklist

  • Create /llms.txt with author bio, book/product info, buy links, newsletter URL
  • If using AIOSEO, disable its default llms.txt generation before uploading yours
  • Add JSON-LD Book/Product schema to each key page via Custom HTML block
  • Verify schema at search.google.com/test/rich-results
  • Check robots.txt allows GPTBot, ClaudeBot, Google-Extended, PerplexityBot
  • Submit sitemap to Google Search Console and Bing Webmaster Tools
  • Purge Cloudflare, WP Rocket, or other cache after every change
  • Verify changes in incognito mode or on cellular data, not on your home network

Have questions about any of these steps or want to share what worked for your site? Drop a comment below.
