What is llms.txt? How It Affects AI Visibility
llms.txt gives LLMs a curated Markdown map of a website's most important content. Jeremy Howard (co-founder of Answer.AI and fast.ai) proposed it in September 2024 to solve a specific problem: LLM context windows are too small to process most websites in their entirety, and converting complex HTML into usable text is imprecise. The file lives at the root domain (example.com/llms.txt), where it directs LLMs to the content that matters most.
Important: llms.txt is a proposed standard, not an adopted protocol. No major AI provider (OpenAI, Anthropic, Google, Perplexity) has confirmed parsing llms.txt files during content retrieval or response generation. The information in this article reflects the specification as of March 2026.
A companion variant called llms-full.txt provides complete page content in Markdown so LLMs can process the information without visiting individual URLs.
llms.txt matters because AI systems must find and interpret your content before they can cite it. AI visibility — how your brand appears in AI-generated responses — starts with content discoverability.
[VISUAL: Hero image: llms.txt file concept with AI systems reading website content | Alt: "llms.txt file format - how LLMs discover website content" | File: what-is-llms-txt-hero.webp]
Why llms.txt Exists: The AI Crawler Limitation
AI crawlers can access your content but cannot determine what matters most — llms.txt fills that gap. llms.txt is primarily useful at inference time, when a user asks a question and the model needs to retrieve relevant content. Four dedicated crawlers currently index web content for LLMs:
[VISUAL: AI crawler table: GPTBot, ClaudeBot, PerplexityBot, GoogleOther | Alt: "AI crawlers - GPTBot, ClaudeBot, PerplexityBot, GoogleOther" | File: ai-crawlers-overview.webp]
| Crawler | Operator | Purpose |
|---|---|---|
| GPTBot | OpenAI | Crawls web content for ChatGPT training and retrieval |
| ClaudeBot | Anthropic | Crawls web content for Claude training and retrieval |
| PerplexityBot | Perplexity | Crawls web content for real-time RAG (Retrieval-Augmented Generation) responses |
| GoogleOther | Google | Crawls web content for AI training and non-search purposes |
These crawlers collect content for LLM training and real-time retrieval. As AI usage scales (800M+ weekly ChatGPT users), structured content discovery becomes a competitive concern. Understanding how AI platforms choose sources reveals that AI crawler access directly affects whether your content appears in AI-generated responses.
Website owners control AI crawler access through robots.txt directives — but robots.txt is binary. It either allows or blocks access. It cannot indicate which content is most relevant or how a site is structured for AI consumption.
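For illustration, a typical robots.txt block for AI crawlers can only grant or deny access per path (the paths below are placeholders); it says nothing about what the content is or why it matters:

```txt
# robots.txt: binary access control only
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Disallow: /internal/
Allow: /
```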
This gap is what llms.txt aims to fill. Where robots.txt says "do not crawl this," llms.txt says "here is what we offer and where to find it." You can learn more about managing AI crawler access for your website.
What llms.txt Actually Does
llms.txt provides a curated Markdown summary of a website's key content, structured specifically for LLMs to read at inference time. The file differs from robots.txt and sitemap.xml in both purpose and format.
[VISUAL: Comparison table: llms.txt vs robots.txt vs sitemap.xml | Alt: "llms.txt vs robots.txt vs sitemap.xml comparison" | File: llms-txt-vs-robots-txt-comparison.webp]
| Feature | robots.txt | sitemap.xml | llms.txt |
|---|---|---|---|
| Purpose | Control crawler access | List all indexable URLs | Curate key content for LLMs |
| Format | Plain text directives | XML | Markdown |
| Audience | Search engine crawlers | Search engine crawlers | Large language models |
| Content | Allow/Disallow rules | URL list with metadata | Page summaries + structure |
| Scope | Access control | Full site index | Curated highlights |
robots.txt controls which pages AI crawlers can access on a website. sitemap.xml lists all indexable pages for search engine crawlers. llms.txt curates the most important pages with context and descriptions so LLMs understand what a site covers.
The file includes a site name, a brief description, content sections with links, and optional page descriptions. llms.txt points LLMs to a site's key content and summarizes it — it does not control access the way robots.txt does. Together, these files form the technical AI visibility stack: robots.txt (access control), sitemap.xml (content index), structured data (entity clarity), and llms.txt (curated content guide). No single file is sufficient on its own.
The specification also defines a special "Optional" H2 section. URLs listed under this heading signal that the content can be skipped when a shorter context is needed — useful for supplementary resources that are not required to understand the core content.
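A minimal example of such a section, using a hypothetical changelog page as the skippable resource:

```markdown
## Optional

- [Changelog](https://example.com/changelog): Full release history; safe to skip when context is limited.
```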
Implementing schema markup for AI visibility adds another layer — machine-readable entity and relationship signals that AI systems use during retrieval.
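As a minimal sketch, an Organization snippet embedded in a page's HTML might look like this (the names and URLs are placeholders, not actual markup from any specific site):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Company",
  "url": "https://example.com",
  "description": "What the company does, for whom, in one sentence.",
  "sameAs": ["https://www.linkedin.com/company/example"]
}
</script>
```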
Does llms.txt Improve AI Visibility?
No major AI platform has confirmed using llms.txt as a ranking or citation factor, so there is no evidence it directly improves AI visibility today. The value is indirect: content audit discipline, structural clarity, and low-risk future-proofing.
Growing adoption signals interest — over 844,000 websites have implemented llms.txt (BuiltWith, October 2025), including Anthropic, Cloudflare, Stripe, Shopify, Vercel, and Hugging Face. Developer tools like Cursor use llms.txt for documentation context. The standard is gaining traction among technical teams, even without confirmed platform support.
What the Data Shows
Server log analysis tells a more nuanced story. Semrush tested llms.txt on Search Engine Land from August to October 2025: the file received zero visits from GPTBot, ClaudeBot, PerplexityBot, or Google-Extended. Traditional crawlers like Googlebot visited the file, but only a few times.
Google's John Mueller compared llms.txt to the keywords meta tag: "This is what a site-owner claims their site is about... At that point, why not just check the site directly?"
One counter-signal: LangChain's internal benchmarks found that AI agents using llms.txt outperformed both vector search and context stuffing for documentation retrieval. Mintlify reports up to 10x token reduction when serving Markdown instead of HTML. And Profound's data shows AI agents visit llms-full.txt over twice as often as llms.txt.
Our assessment draws on: the official llms.txt specification (llmstxt.org), BuiltWith adoption data, Semrush's server log analysis on Search Engine Land, LangChain's documentation retrieval benchmarks (via Latent Space), Profound's agent traffic analysis, and Google's John Mueller public statements.
3 indirect benefits make llms.txt worth considering:
- Content audit value - Creating an llms.txt file forces you to identify and prioritize your most important content. This audit process benefits AI visibility regardless of whether platforms read the file.
- Structural clarity signal - The file demonstrates that your site is organized and AI-ready. Structured data, robots.txt configuration, and llms.txt together form the technical AI visibility stack.
- Low effort, low risk - Implementation takes 30-60 minutes for most sites. The time investment is minimal compared to structured data implementation or content quality improvements.
Visiblie tracks AI visibility metrics across ChatGPT, Gemini, Perplexity, Claude, and 4 additional AI platforms. In our monitoring data, we have not observed a correlation between llms.txt implementation and improved citation rates or mention frequency. Sites that improve AI visibility typically do so through structured data, content quality, and entity authority — not through llms.txt alone.
Google included llms.txt in their Agent2Agent (A2A) protocol in April 2025 — though this amounts to embedding one proposed standard inside another, and Google has not committed to crawling llms.txt files.
The broader principle: AI platforms select sources based on authority, relevance, recency, and structural clarity. llms.txt addresses structural clarity but does not replace the other 3 factors. Avoiding common AI visibility mistakes delivers more immediate impact than llms.txt alone.
Should You Add llms.txt to Your Website?
llms.txt delivers the most value for three site types: large content sites, SaaS developer documentation, and publishers. It is a lower priority for small sites under 50 pages.
- Large content sites (hundreds or thousands of pages) — AI crawlers benefit from a curated guide to navigate extensive content libraries
- SaaS documentation sites — Structured content discovery directly improves developer-facing AI answers
- Publisher sites — llms.txt signals AI-readiness while maintaining editorial control over which content is surfaced
Small sites (under 50 pages) benefit less from llms.txt because AI crawlers can already process all their content. Sites without basic AI visibility foundations — structured data, clear entity definitions, quality content — gain more from addressing those first.
llms.txt complements structured data, entity authority, and content quality — it does not replace them. Implement it only after those higher-impact items are in place. Read the full guide on how to improve AI visibility for a prioritized approach.
How to Create an llms.txt File
Creating an llms.txt file means writing a Markdown file with an H1 title, an optional blockquote description, and H2 sections linking to your key content. The process takes five steps:
1. Create the file - Make a new file named llms.txt in your site's root directory
2. Add a site title - Use an H1 heading in Markdown format: # Your Site Name
3. Add a description - Write a brief site description in blockquote format: > Your site does X for Y audience
4. Add content sections - List your most important pages with brief descriptions, organized by category
5. Deploy and verify - Upload to your root domain and confirm the file loads at your-domain.com/llms.txt
llms.txt Example:
This file serves as the high-level map. It provides the essential context and links so an LLM can decide which parts of your site are relevant to a user's query (the URLs below are illustrative):
-----------
# Visiblie

> Visiblie is an AI visibility monitoring and optimization platform that tracks brand mentions across 8+ LLMs.

## Documentation

- [Getting Started](https://visiblie.com/docs/getting-started): Quick start guide for new users.
- [API Reference](https://visiblie.com/docs/api): Complete API documentation for integration.

## Blog

- [What is AI Visibility](https://visiblie.com/blog/what-is-ai-visibility): Complete guide to understanding AI visibility.
- [What is GEO](https://visiblie.com/blog/what-is-geo): Generative Engine Optimization (GEO) explained.

## Free Tools

- [AI Visibility Report](https://visiblie.com/tools/ai-visibility-report): Free brand visibility analysis tool.
- [AI SEO Audit](https://visiblie.com/tools/ai-seo-audit): Free AI search readiness check.
-----------
Optional: create a companion llms-full.txt with complete page content in Markdown for sites that want to provide full content access to LLMs.
Validate your implementation by visiting your-domain.com/llms.txt in a browser.
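If you prefer a scripted check, a short Python snippet (the domain is a placeholder) can confirm the file is reachable, served as text, and opens with the H1 title the spec expects:

```python
import requests

# Illustrative check only; replace the domain with your own.
url = "https://your-domain.com/llms.txt"
resp = requests.get(url, timeout=10)

print("Status:", resp.status_code)                          # expect 200
print("Content-Type:", resp.headers.get("Content-Type"))    # ideally text/plain or text/markdown
print("Starts with H1:", resp.text.lstrip().startswith("# "))  # spec: the file opens with an H1 title
```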
llms.txt Generator Tools
Several llms.txt generator tools automate file creation:
- Mintlify and GitBook auto-generate and maintain llms.txt for all hosted documentation sites
- Wordlift LLMs.txt Generator creates the file from any URL — enter your domain, select pages, and export
- Firecrawl scrapes your site and generates the file programmatically
- SiteSpeak offers a basic llms.txt generator for quick starts
For WordPress sites, the Hostinger plugin adds a one-click llms.txt toggle. Alternatively, create the file manually and upload it to your public_html root directory.
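If you would rather script the draft yourself, the sketch below builds an llms.txt skeleton from a standard sitemap.xml. The domain, site name, and section heading are placeholders, and every description still needs a human pass; the script only drafts the structure.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

import requests

# Placeholder domain; point this at your own sitemap.
SITEMAP_URL = "https://your-domain.com/sitemap.xml"
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def page_name(url: str) -> str:
    # Rough page name from the last path segment; the root URL becomes "Home".
    segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    return segment.replace("-", " ").title() or "Home"

root = ET.fromstring(requests.get(SITEMAP_URL, timeout=10).content)
urls = [loc.text for loc in root.findall(".//sm:loc", NS)]

lines = [
    "# Your Site Name",
    "",
    "> One-sentence description of what the site does and for whom.",
    "",
    "## Key Pages",
    "",
]
for url in urls[:20]:  # curate a subset: llms.txt should highlight key pages, not mirror the sitemap
    lines.append(f"- [{page_name(url)}]({url}): TODO describe this page")

with open("llms.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")
```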
llms-full.txt and .md Page Variants
llms-full.txt compiles your entire site's documentation content into a single Markdown file. Where llms.txt serves as a table of contents, llms-full.txt provides the full text — allowing LLMs to process all content without visiting individual URLs.
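There is no official tooling requirement here. One minimal sketch, assuming your documentation already exists as Markdown files in a local docs/ folder, simply concatenates them:

```python
from pathlib import Path

# Sketch: compile existing Markdown docs into a single llms-full.txt.
# Assumes docs live under ./docs as .md files; adjust for your own build pipeline.
DOCS_DIR = Path("docs")
SITE_NAME = "Your Site Name"  # placeholder

parts = [f"# {SITE_NAME}\n"]
for md_file in sorted(DOCS_DIR.rglob("*.md")):
    parts.append(f"\n## {md_file.stem.replace('-', ' ').title()}\n")
    parts.append(md_file.read_text(encoding="utf-8"))

Path("llms-full.txt").write_text("\n".join(parts), encoding="utf-8")
```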
Profound's data shows AI agents visit llms-full.txt over twice as frequently as llms.txt, suggesting that full-content files may deliver more practical value.
The llms.txt specification also proposes per-page Markdown variants: append .md to any page URL to serve a clean Markdown version. For example, example.com/docs/api would also be available at example.com/docs/api.md. This gives AI systems a clean alternative to parsing HTML on individual pages.
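The proposal does not prescribe how to serve these variants. One way, sketched below under the assumption that pre-generated Markdown copies of each docs page sit in a local folder behind a small Flask app, is a dedicated route:

```python
from pathlib import Path

from flask import Flask, Response, abort

app = Flask(__name__)
MD_DIR = Path("markdown_pages")  # assumption: pre-generated .md copies of each page live here

@app.route("/docs/<path:page>.md")
def markdown_variant(page: str):
    # Serve a clean Markdown copy of /docs/<page> for LLMs, per the .md variant proposal.
    if ".." in page:  # basic path-traversal guard
        abort(404)
    md_file = MD_DIR / f"{page}.md"
    if not md_file.is_file():
        abort(404)
    return Response(md_file.read_text(encoding="utf-8"), mimetype="text/markdown")
```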
How to Monitor llms.txt Performance
After deploying llms.txt, monitor server logs for AI crawler requests to assess whether bots access the file. Filter for user agents: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and CCBot.
Ahrefs Bot Analytics (free, beta) tracks visits across 12 AI bot categories using server-side data via Cloudflare. For manual monitoring, check your server access logs for requests to /llms.txt and /llms-full.txt.
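For a quick manual count, a short script like the one below (the log path is a placeholder; adjust it for your server) tallies AI-bot requests to both files:

```python
from collections import Counter

# Placeholder path; point this at your nginx/Apache access log.
LOG_PATH = "/var/log/nginx/access.log"
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]

hits = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "/llms.txt" in line or "/llms-full.txt" in line:
            for bot in AI_BOTS:
                if bot in line:
                    hits[bot] += 1

for bot, count in hits.most_common():
    print(f"{bot}: {count} requests")
print("Total AI-bot requests to llms.txt files:", sum(hits.values()))
```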
If your logs show zero AI bot visits after 30 days (as Semrush observed on Search Engine Land), the file is not yet being consumed — but the low maintenance cost means keeping it deployed is still reasonable.