LLMs.txt: The New robots.txt for AI — What the Data Actually Shows

In September 2024, Jeremy Howard, co-founder of Answer.AI and creator of fast.ai, published a proposal for a new web standard called /llms.txt. The idea was simple: give AI systems a structured, machine-readable summary of your website, the same way robots.txt tells search engine crawlers which parts of a site they may crawl. Within 18 months, over 844,000 websites adopted the standard. But the real question for ecommerce stores is whether it actually moves the needle.

This guide covers what llms.txt is, what the research says about its impact, exactly how to implement it on Shopify and other platforms, and where it fits into a broader GEO strategy.

The Origin: Jeremy Howard's Proposal

Howard's September 2024 proposal identified a specific technical problem. Today's websites serve two audiences — humans and language models — but lack any standardized way to provide LLM-friendly content. When an AI crawler hits a typical ecommerce store, it encounters JavaScript-rendered pages, cookie banners, navigation menus, and thousands of product pages. It has to spider the entire site, parse every page, and figure out the structure on its own.

Howard's argument: "Site authors know best, and can provide a list of content" for LLM consumption.

The solution has two parts. First, a /llms.txt file at the site root: a Markdown document providing curated links to documentation and resources. It is programmatically parseable while remaining human-readable. Second, Markdown page variants (e.g., page.html.md) containing clean, LLM-friendly versions of individual pages.

The specification is deliberately simple. An llms.txt file contains:

  • An H1 heading with the project or store name (required)
  • A blockquote with a brief summary
  • Optional explanatory paragraphs
  • H2 sections containing curated link lists
  • Each link follows the pattern: [name](url): optional details
  • An "Optional" section for secondary content that can be skipped when context windows are limited
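Because the format is plain Markdown with a fixed link pattern, tool builders can extract links with a single regular expression. A minimal sketch in TypeScript — the function name, type, and regex are illustrative, not part of the spec:

```typescript
// Parse llms.txt link-list lines of the form: - [name](url): optional details
// The LlmsLink type and the regex below are an illustrative sketch,
// not an official parser from the specification.
interface LlmsLink {
  name: string;
  url: string;
  details?: string;
}

function parseLinkLine(line: string): LlmsLink | null {
  const match = line.match(/^-\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*(.+))?$/);
  if (!match) return null;
  return { name: match[1], url: match[2], details: match[3] };
}
```

Lines that do not match the pattern (headings, blockquotes, prose) simply return null, which is why the fixed link syntax matters: it lets a consumer skim the file for structure without a full Markdown parser.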

llms-full.txt: The Comprehensive Variant

The standard encompasses two complementary files. The base llms.txt is a curated navigation file that provides structure and highlights priority content. The companion llms-full.txt contains your entire documentation compiled into a single Markdown file, giving an AI crawler a high-signal ingestion point instead of forcing it to stitch together many separate pages.

For developer tools and SaaS platforms, llms-full.txt has proven more popular with AI agents. Server log analysis shows AI agents visit llms-full.txt over twice as frequently as the base llms.txt file, suggesting that full-content files deliver more practical value when AI tools have large context windows to work with. Anthropic's documentation, for example, publishes a slim llms.txt index that links to a comprehensive llms-full.txt Markdown export containing their full docs. This lets tools choose the appropriate level of depth — a quick index for real-time AI assistants, or a full export for IDE integrations and RAG systems.

For ecommerce stores, a full catalog dump is rarely practical. Your llms.txt should be the curated summary. If you have extensive buying guides or educational content, an llms-full.txt containing those guides in Markdown can be valuable.
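If you do publish an llms-full.txt, the build step can be as simple as concatenating your guide pages into one Markdown document. A sketch in TypeScript — the input shape and the `---` separator are illustrative choices, not something the standard prescribes:

```typescript
// Concatenate Markdown guide pages into a single llms-full.txt body.
// The Guide shape and the "---" separator are illustrative conventions,
// not requirements of the llms.txt standard.
interface Guide {
  title: string;
  markdown: string;
}

function buildLlmsFull(guides: Guide[]): string {
  return (
    guides
      .map((g) => `# ${g.title}\n\n${g.markdown.trim()}`)
      .join("\n\n---\n\n") + "\n"
  );
}
```

Run this as part of your publish pipeline so the full export never drifts out of sync with the guides themselves.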

Adoption: 844,000 Sites and Counting

Adoption was slow initially. A scan of the top one million websites in May 2025 found only 105 valid llms.txt files — a 0.011% adoption rate. Then Mintlify rolled out llms.txt support across all documentation sites it hosts, and practically overnight thousands of docs sites — including Anthropic and Cursor — began serving the file. By October 2025, BuiltWith tracked over 844,000 websites with llms.txt files.

SE Ranking's analysis of nearly 300,000 domains found 10.13% had an llms.txt file in place. Adoption was remarkably consistent across traffic levels:

  • Low traffic sites (0–100 visits): 9.88% adoption
  • Mid traffic sites (1,001–5,000 visits): 10.54% adoption
  • High traffic sites (100,001+ visits): 8.27% adoption

This spread suggests llms.txt is not concentrated among high-performing domains. It is being adopted broadly, often by smaller sites hoping for an edge. Among the top 1,000 websites globally, however, not a single major platform had implemented the standard as of late 2025.

Who Is Using llms.txt: Real Examples

The strongest adoption has come from developer-focused companies with deep documentation:

Anthropic publishes a slim llms.txt index linking to a comprehensive llms-full.txt. Their docs serve dual use cases: quick real-time answers ("How do I call the Claude API?") and deep ingestion by IDEs and agents that want comprehensive context.

Cloudflare runs one of the most extensive llms.txt files, covering 20-plus products with substantial depth. Each section includes Getting Started, Configuration, API Reference, and Tutorials. The structure solves a specific orientation problem: "which Cloudflare product applies to my use case?"

Stripe organizes around major product areas — Payments, Checkout, Webhooks, Testing — with each link including descriptive text, mirroring their API architecture to help models navigate a sprawling ecosystem.

Vercel groups documentation by product (Next.js, AI SDK, Blob storage) and surfaces quickstarts and high-intent guides, optimized for discovery rather than exhaustive listing.

Cursor and Windsurf organize around developer workflows and product lifecycle stages, emphasizing setup, configuration, and troubleshooting tied to in-product usage.

LangGraph has been called the gold standard for IDE-embedded documentation patterns, providing a slim index, a comprehensive full export, and separate files per programming language, with explicit guidance for tool builders on parsing strategies.

A notable anecdote: in December 2025, someone spotted Google had quietly added llms.txt to their own Search Central docs. When called out publicly, Google's John Mueller responded "Hmmn :-/" and the file disappeared within hours.

The Hard Truth: Does llms.txt Improve AI Citations?

This is where the data gets uncomfortable.

SE Ranking's 300,000-domain study used XGBoost machine learning analysis, Spearman correlation testing, and SHAP analysis to determine whether having llms.txt correlated with being cited by LLMs. The result: no statistically significant correlation. When they removed the llms.txt variable from their predictive model, accuracy actually improved. The file "was actually adding noise or confusing information" to the model.

OtterlyAI's 90-day GEO study tracked server logs on a site with an implemented llms.txt file. Out of 62,100-plus AI bot visits, only 84 reached the llms.txt file — 0.1% of total AI bot traffic. The average content page received roughly 265 AI bot visits, meaning llms.txt performed three times worse than a typical page. Their conclusion: "the presence of a correctly implemented llms.txt file did not correlate with any noticeable uptick in overall AI bot activity."

Inconsistent crawler behavior complicates the picture further. One hosting provider managing 20,000 sites reported that GPTBot was not fetching llms.txt files at all. Another developer showed screenshots of GPTBot pinging his llms.txt every 15 minutes. A 2025 audit found Bing made only seven requests to llms.txt across one thousand domains, while OpenAI's search bot made ten calls total. Profound's GEO tracking shows Microsoft and OpenAI bots actively fetching both llms.txt and llms-full.txt — but crawling a file is not the same as using it.

As of mid-2025, Google's John Mueller stated plainly: "No AI system currently uses llms.txt." The three major AI platform providers — OpenAI, Google, and Anthropic — have not confirmed native support for llms.txt in their primary AI products.

What actually drives AI citations? The same SE Ranking study found that sites with over 32,000 referring domains are 3.5 times more likely to be cited by ChatGPT than those with up to 200 referring domains. The brands winning in AI search win because of genuine topical authority, consistent mentions across high-quality external sources, structured content that answers questions directly, and strong entity signals.

Why You Should Still Implement It

Despite the lack of direct citation impact, there are practical reasons to create an llms.txt file:

The standard is early, not dead. Google included an llms.txt file in their Agent-to-Agent (A2A) protocol specification, signaling relevance for agent-to-agent communication. As AI agents become more autonomous — shopping on behalf of users, comparing products, making purchasing recommendations — having a structured machine-readable summary of your store becomes increasingly valuable.

It costs almost nothing. Writing a well-structured llms.txt takes an hour at most. The downside risk is zero. The upside, even if marginal today, compounds as AI systems evolve.

Developer tools are already using it. Cursor, Windsurf, and other AI-powered IDEs actively consume llms.txt and llms-full.txt for context. If your store has an API, developer documentation, or integration guides, these tools will find and use your file.

It forces you to articulate your store's identity. The exercise of writing a concise, structured summary of what you sell, what makes you different, and where your best content lives is valuable in itself.

AI Crawlers: Who Is Visiting Your Store

Understanding which AI crawlers exist — and how aggressively they operate — is essential context for any llms.txt strategy.

Cloudflare's 2025 report, analyzing traffic across their network, found an 18% increase in combined AI and search crawler traffic from May 2024 to May 2025. The growth among AI-specific crawlers was dramatic:

  • GPTBot (OpenAI training): 305% increase in raw requests, share grew from 2.2% to 7.7%
  • ChatGPT-User (OpenAI browsing): 2,825% surge in requests
  • PerplexityBot: 157,490% growth from a minimal baseline
  • ClaudeBot (Anthropic): declined 46% in requests, share fell from 11.7% to 5.4%
  • Bytespider (ByteDance): plummeted 85% in volume

For ecommerce specifically, the numbers are striking. An analysis of roughly 200 retail and ecommerce websites found that for every single visit OpenAI's systems deliver to a retail site, those systems perform 198 crawls. Google, by comparison, generates one visit for every six crawls. AI-driven bot traffic across these retail sites increased 5.4 times throughout 2025, with food and grocery seeing a 29-times increase and home and DIY seeing an 11-times increase.

The major AI crawler user agents you need to know:

| Crawler | Operator | Purpose |
|---|---|---|
| GPTBot | OpenAI | Training data collection |
| ChatGPT-User | OpenAI | Real-time browsing for ChatGPT |
| OAI-SearchBot | OpenAI | SearchGPT results |
| ClaudeBot | Anthropic | Training data collection |
| PerplexityBot | Perplexity | Real-time search answers |
| Google-Extended | Google | AI training (separate from Googlebot) |
| Applebot-Extended | Apple | Apple Intelligence training |
| Meta-ExternalAgent | Meta | AI training |
| Amazonbot | Amazon | Alexa and AI features |
| Bytespider | ByteDance | TikTok AI training |

Blocking vs. Welcoming: The Ecommerce Dilemma

Among the top 10,000 domains, 14% had robots.txt directives targeting AI bots as of mid-2025. GPTBot was the most blocked, disallowed by 312 domains (250 fully, 62 partially). It was also the most explicitly allowed, permitted by 61 domains — reflecting genuine strategic disagreement about whether to feed or starve AI crawlers.

The broader trend is toward more blocking. AI-blocking by reputable sites increased from 23% in September 2023 to nearly 60% by May 2025. As of Q1 2026, GPTBot blocking has plateaued, but ClaudeBot saw the largest blocking increase at +0.39 percentage points, overtaking CCBot to become the second-most referenced AI crawler in robots.txt files.

For ecommerce stores, the calculus is different from publishers. Publishers lose direct traffic when AI systems answer questions using their content. Ecommerce stores gain when AI systems recommend their products. If your robots.txt blocks GPTBot, ClaudeBot, and PerplexityBot, your llms.txt file is irrelevant — those crawlers will never see it. Make sure your robots.txt allows the AI crawlers you want to reach you.
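As a sketch, a robots.txt for a store that welcomes the major AI crawlers while keeping them out of transactional paths might look like this — the disallowed paths are illustrative, so substitute your own:

```
# Explicitly welcome AI crawlers, but keep them out of transactional pages
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /
Disallow: /cart
Disallow: /checkout

# Default rules for all other crawlers
User-agent: *
Disallow: /admin
```

Grouping multiple User-agent lines over one rule set keeps the file short, and the explicit Allow makes your intent unambiguous even if you later add broader disallows.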

Complete LLMs.txt Example for an Ecommerce Store

Here is what a well-structured llms.txt file looks like for a specialty ecommerce store:

# Clayworks Coffee Co.

> Clayworks Coffee Co. is an online store specializing in handmade ceramic coffee brewing equipment. We sell pour-over drippers, coffee mugs, and brewing accessories. Based in Portland, Oregon. Shipping worldwide.

## Main Product Categories

- [Pour-Over Drippers](https://clayworkscoffee.com/collections/pour-over-drippers): Handmade ceramic pour-over drippers in multiple sizes and colors
- [Coffee Mugs](https://clayworkscoffee.com/collections/mugs): Ceramic coffee mugs with ergonomic handles, 8oz to 16oz
- [Brewing Accessories](https://clayworkscoffee.com/collections/accessories): Filters, scales, kettles, and other coffee brewing tools

## Best-Selling Products

- [Classic Pour-Over Dripper](https://clayworkscoffee.com/products/classic-pour-over): Our flagship ceramic dripper, 4.8 stars from 312 reviews, $38
- [Double-Wall Ceramic Mug](https://clayworkscoffee.com/products/double-wall-mug): Insulated ceramic mug, keeps coffee hot 2x longer, $28
- [Starter Brewing Kit](https://clayworkscoffee.com/products/starter-kit): Complete pour-over kit with dripper, mug, filters, and guide, $89

## Guides and Resources

- [How to Brew the Perfect Pour-Over](https://clayworkscoffee.com/blogs/guides/perfect-pour-over): Step-by-step pour-over brewing guide
- [Ceramic vs Glass Drippers](https://clayworkscoffee.com/blogs/guides/ceramic-vs-glass): Detailed comparison of dripper materials
- [Coffee Grind Size Guide](https://clayworkscoffee.com/blogs/guides/grind-size): Which grind size to use for every brewing method
- [Choosing Your First Pour-Over Setup](https://clayworkscoffee.com/blogs/guides/first-pour-over): Beginner buying guide

## Policies

- [Shipping Policy](https://clayworkscoffee.com/policies/shipping): Free US shipping on orders over $50, international flat rate $12
- [Return Policy](https://clayworkscoffee.com/policies/returns): 30-day satisfaction guarantee, free US returns
- [FAQ](https://clayworkscoffee.com/pages/faq): Common questions about products, shipping, and care instructions

## About

- [About Us](https://clayworkscoffee.com/pages/about): Our story, workshop, and ceramic crafting process
- [Contact](https://clayworkscoffee.com/pages/contact): Email support@clayworkscoffee.com, phone (503) 555-0142

How to Create and Deploy LLMs.txt on Shopify

Shopify does not natively support serving files from the site root or /.well-known/ paths. As of April 2026, there is no indication Shopify plans to add native llms.txt support, despite a public feature request on the Shopify community forums. Here are the practical options:

Approach 1: Shopify apps (easiest)

Multiple Shopify apps now handle llms.txt generation. Options on the Shopify App Store include LLMs.txt Generator, LLMS.txt Agent, Arc, and autoLLMs, most launched between June and August 2025. These apps generate the file from your product catalog and serve it at the correct path, typically using a URL redirect from the root. Some keep the file in sync automatically as you add or remove products.

The tradeoff: apps use redirects to serve the file, since Shopify does not allow placing files at the domain root. A redirect from yourdomain.com/llms.txt to the app-hosted file works for most crawlers, but the content ultimately lives at an app-hosted URL rather than being served directly from the root path the specification defines.

Approach 2: Cloudflare Worker (most control)

If your domain uses Cloudflare (which many Shopify stores do for CDN and security), create a Cloudflare Worker that intercepts requests to yourdomain.com/llms.txt and serves the content directly:

  1. Write your llms.txt content and host it as a static file (GitHub Gist, S3 bucket, or Cloudflare KV store).
  2. Create a Cloudflare Worker that intercepts the path and returns the file with Content-Type: text/plain.
  3. The file appears at the correct root path with no redirect.

This approach gives you full control over the content and serves the file at the standard location.
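A minimal Worker sketch in module syntax — the inline content is a placeholder, and a real deployment would read it from Cloudflare KV or a bound asset so updates do not require a redeploy:

```typescript
// Cloudflare Worker (module syntax) serving llms.txt at the root path.
// LLMS_TXT is a placeholder; a production Worker would load the content
// from Cloudflare KV so it can be updated without redeploying.
const LLMS_TXT = `# Your Store Name

> One-line description of what you sell.

## Main Product Categories
- [Category Name](https://yourstore.com/collections/example): Short description
`;

const worker = {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);
    if (url.pathname === "/llms.txt") {
      return new Response(LLMS_TXT, {
        headers: { "Content-Type": "text/plain; charset=utf-8" },
      });
    }
    // Everything else passes through to the origin (your store)
    return fetch(request);
  },
};

export default worker;
```

Because the Worker answers the request itself, the crawler sees a 200 at the canonical path — no redirect chain involved.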

Approach 3: Shopify page as a fallback

Create a page at /pages/llms-txt with your content. This is not the standard path and most AI crawlers will not find it automatically, but you can link to it from your sitemap or homepage meta tags. Treat this as a stopgap only.

How to Create LLMs.txt on Other Platforms

WordPress / WooCommerce:

Create a file called llms.txt and upload it to your web root via FTP or your hosting file manager. Alternatively, add a rewrite rule in your .htaccess:

RewriteRule ^\.well-known/llms\.txt$ /llms.txt [L]

Several WordPress plugins now generate llms.txt from your WooCommerce catalog automatically.

Next.js / Headless commerce:

Serve the file as a static route or API route:

// app/llms.txt/route.ts
export async function GET() {
  const content = `# Your Store Name
> Description of your store...

## Products
- [Product 1](https://yourstore.com/product-1): Description
`;
  return new Response(content, {
    headers: { "Content-Type": "text/plain" },
  });
}

Custom platforms:

Place the file at /llms.txt in your web root. Ensure your server returns it with Content-Type: text/plain and a 200 status code.
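After deploying on any platform, it is worth confirming the response looks right, since a redirect or an HTML error page is easy to miss. A small TypeScript helper for that check — a hand-rolled sketch, not a standard tool:

```typescript
// Check that an llms.txt response has the expected status and content type.
// A 3xx status or an HTML content type usually indicates a misconfigured
// deployment (redirect chain, or the platform serving a themed error page).
function isValidLlmsTxtResponse(
  status: number,
  contentType: string | null,
): boolean {
  return status === 200 && (contentType ?? "").startsWith("text/plain");
}
```

Pair it with a fetch of your live file: `isValidLlmsTxtResponse(res.status, res.headers.get("Content-Type"))`.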

What Information to Include vs Exclude

Always include:

  • Store name and a one-line description of what you sell
  • Main product categories with URLs
  • Top 5–10 best-selling or flagship products with prices and brief descriptions
  • Links to your best educational content
  • Key policies (shipping, returns) as links
  • Contact information

Include when relevant:

  • Physical store locations
  • Notable press mentions or certifications
  • Seasonal or sale information
  • Newly launched products

Never include:

  • Customer personal data
  • Internal admin URLs
  • API endpoints or developer documentation
  • Prices or inventory data you do not want publicly indexed
  • Pages behind authentication

The Relationship Between LLMs.txt and robots.txt

LLMs.txt and robots.txt serve complementary purposes:

| | robots.txt | LLMs.txt |
|---|---|---|
| Purpose | Tells crawlers what to avoid | Tells AI what to prioritize |
| Audience | Search engine bots | AI language models |
| Tone | Restrictive (disallow) | Promotional (here is our best content) |
| Format | Custom syntax | Markdown |
| Location | /robots.txt | /llms.txt |

They do not conflict. Your robots.txt might block crawlers from /admin/ and /cart/ pages while your llms.txt highlights your best product and content pages. Use both.

One critical point: if your robots.txt blocks AI crawlers, your llms.txt is invisible to them. Among the top 10,000 domains, 14% already block AI bots via robots.txt. Before implementing llms.txt, audit your robots.txt to make sure you are not blocking the crawlers you want to reach.

Keeping LLMs.txt Current

Your llms.txt file should be updated whenever you:

  • Add or remove a major product category
  • Launch a flagship product
  • Publish significant educational content
  • Change shipping or return policies
  • Update contact information

A stale llms.txt is better than no llms.txt, but a current one signals to AI systems that the information is reliable. If possible, automate the generation from your product catalog and CMS. Several Shopify apps handle this automatically, regenerating the file as your catalog changes.
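If your platform exposes the catalog programmatically, the regeneration step is a short script. A sketch in TypeScript — the Product shape and sample fields are hypothetical, not any platform's actual API:

```typescript
// Generate an llms.txt body from a product catalog snapshot.
// The Product interface is a hypothetical shape, not a Shopify or
// WooCommerce API; map your platform's fields onto it.
interface Product {
  title: string;
  url: string;
  description: string;
}

function generateLlmsTxt(
  storeName: string,
  summary: string,
  products: Product[],
): string {
  const lines = [
    `# ${storeName}`,
    "",
    `> ${summary}`,
    "",
    "## Best-Selling Products",
    ...products.map((p) => `- [${p.title}](${p.url}): ${p.description}`),
  ];
  return lines.join("\n") + "\n";
}
```

Hook a script like this into the same event that rebuilds your sitemap, and the file stays current without anyone remembering to edit it.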

The Bottom Line

The research is clear: llms.txt does not currently drive AI citations in any measurable way. SE Ranking's 300,000-domain study, OtterlyAI's 90-day experiment, and multiple independent audits all point to the same conclusion. No major AI platform has confirmed it reads these files for citation or ranking purposes.

But the standard is barely 18 months old, adoption crossed 844,000 sites, Google referenced it in their A2A protocol, and AI agents are becoming more autonomous every quarter. The cost of implementation is an hour of work. The cost of ignoring it — if the standard gains traction with AI shopping agents — could be significant.

Implement llms.txt. Just do not expect it to be a silver bullet. The stores that win AI citations are the ones with genuine authority: deep content, strong backlink profiles, structured data, and consistent mentions across the web. LLMs.txt is one small piece of that puzzle, and right now it is a bet on the future more than a lever for today.