The truth about llms.txt

Three independent studies have now looked at whether llms.txt improves AI search visibility. All three found the same thing: it doesn't.

Semrush ran a controlled crawl test and found zero visits from GPTBot, ClaudeBot, or PerplexityBot to their llms.txt file over three months. SE Ranking analysed 300,000 domains and found no statistical correlation between having an llms.txt and being cited by AI systems; their prediction model actually improved when they removed llms.txt presence as a variable. OtterlyAI tracked 90 days of crawler logs across hundreds of sites and found the file was requested in just 0.1% of all AI crawler visits.

Then Google made it official. At Google Search Central Live Toronto in April 2026, with Danny Sullivan on stage, one of the confirmed takeaways was explicit: "There is no benefit to creating a llms.txt file for SEO." This followed Gary Illyes saying at the Search Central Deep Dive APAC event in July 2025 that Google doesn't support llms.txt and isn't planning to.

Two primary Google statements. Three independent empirical studies. Zero evidence of citation impact.

And yet: a lot of very smart people are implementing it anyway. Including Shopify, who shipped it to millions of stores in May 2026 without a single announcement.

Here's the thing. Shopify isn't doing it for SEO. Understanding that distinction is the whole point of this article.

What llms.txt actually is

llms.txt is a plain-text file that lives at the root of a website: yourdomain.com/llms.txt. It gives AI systems a clean, structured summary of what a site is about, what pages matter, and what content is worth reading.

Jeremy Howard, co-founder of fast.ai, proposed the standard on 3 September 2024. The core observation: websites are built for humans and Google. Navigational menus, cookie banners, JavaScript-rendered content, footer links: when an AI system tries to understand your business, it has to wade through all of it. llms.txt cuts through that noise and hands the AI a curated briefing document instead.

Think of it as the AI equivalent of robots.txt. Where robots.txt tells crawlers where not to go, llms.txt tells AI systems where to start.

An optional companion file, /llms-full.txt, goes further: it contains the full content of your site in clean Markdown. A complete briefing document for any AI that encounters your brand.

That's the spec. It's not a schema standard. It's not a ranking protocol. It's a structured text file.
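To make "structured text file" concrete, here is a minimal sketch of how a tool might read one. It assumes the layout from the proposal (an H1 title, a blockquote summary, then H2 sections containing Markdown link lists); the function name parse_llms_txt and the returned dictionary shape are hypothetical, not part of any standard library or official parser.

```python
import re

# Matches list entries like "- [Title](https://example.com/page): optional note"
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt document into title, summary, and per-section links."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section we are inside, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                # note is None when the entry has no ": description" suffix
                doc["sections"][current].append(match.groupdict())
    return doc
```

The point of the sketch is how little there is to parse: a title, a one-line summary, and a short list of URLs worth fetching first.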

What the research actually says

Let me stack the three studies side by side, because the cumulative weight matters more than any individual finding.

Study	Method	Sample	Finding
Semrush crawl test	Server log analysis	Own site, 3 months	Zero visits from GPTBot, ClaudeBot, PerplexityBot to llms.txt
SE Ranking study	Correlation analysis	300,000 domains	No correlation between llms.txt and AI citation frequency; model improved when llms.txt was removed as a variable
OtterlyAI AI Citations Report	Crawler log audit	90 days, multiple sites	File requested in 0.1% of all AI crawler visits

The SE Ranking data is worth pausing on. They didn't just find no positive correlation. Their ML prediction model for AI citation frequency got noisier when llms.txt presence was included. The file isn't a neutral signal; it's an active distraction from what actually predicts citations.

And the adoption breakdown tells a similar story. SE Ranking found llms.txt on 10.13% of the 300,000 domains they analysed. But the adoption rate was almost identical across low-traffic sites (9.88%), mid-traffic sites (10.54%), and high-traffic sites (8.27%). High-performing domains don't use it at a higher rate. The signal correlates with hype adoption, not performance outcomes.

Cyrus Shepard's meta-analysis of 23 AI ranking factors scored llms.txt at 2.0 out of 10, the lowest of all factors tested. That's not a rounding error. It's a clear signal about where it sits in the priority stack.

The honest summary: AI crawlers can read llms.txt when they choose to. They mostly choose not to.
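You can run a rough version of the crawl-log checks on your own server. The sketch below is not the method Semrush or OtterlyAI used; it assumes access logs in the common "combined" format and identifies crawlers by a simple user-agent substring match, which can be spoofed. The function name llms_txt_share is hypothetical.

```python
import re
from collections import Counter

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Combined Log Format: ... "METHOD PATH PROTOCOL" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def llms_txt_share(log_lines):
    """Return {bot: (llms_txt_hits, total_hits)} for the crawlers in AI_BOTS."""
    total, llms = Counter(), Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match:
            continue  # not a GET/HEAD line in the assumed format
        user_agent = match["ua"].lower()
        bot = next((b for b in AI_BOTS if b.lower() in user_agent), None)
        if bot is None:
            continue  # not one of the AI crawlers we track
        total[bot] += 1
        # Ignore query strings when comparing the requested path.
        if match["path"].split("?", 1)[0] == "/llms.txt":
            llms[bot] += 1
    return {bot: (llms[bot], total[bot]) for bot in AI_BOTS}
```

Feed it the lines of an access log (for example, `llms_txt_share(open("access.log", encoding="utf-8"))`, where the filename is a placeholder) and compare the two counts per crawler.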

Why Shopify shipping it doesn't mean what you think

In early May 2026, Shopify silently deployed three new endpoints to every storefront on its platform. No announcement. No email to merchants. No mention in the changelog.

Every Shopify store now serves:

/llms.txt, which 301-redirects to /agents.md
/agents.md, a machine-readable file pointing AI agents to the store's search API, product catalogue, sitemap, and checkout flow
/.well-known/ucp, the Universal Commerce Protocol, which allows AI agents to add items to carts and initiate checkouts without a human navigating the UI at all

That last one is the tell. UCP isn't an SEO tool. It's a transaction protocol. When an AI agent is browsing on a buyer's behalf (comparing products, checking availability, finalising a purchase), the /.well-known/ucp endpoint is what makes that possible on a Shopify store.
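To see what a given storefront actually serves at those three paths, a small probe is enough. This is a sketch under assumptions: it sends HEAD requests (some servers answer HEAD differently from GET), it does not follow redirects so the 301 stays visible, and the names probe, describe, and check_store are hypothetical. It makes no claim about what each endpoint returns beyond the status code and Location header it observes.

```python
import http.client

# Paths described above for Shopify storefronts.
PATHS = ("/llms.txt", "/agents.md", "/.well-known/ucp")


def probe(host: str, path: str, timeout: float = 10.0):
    """HEAD a path over HTTPS without following redirects; return (status, location)."""
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path, headers={"User-Agent": "endpoint-check/0.1"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(path: str, status: int, location) -> str:
    """Turn one probe result into a readable line."""
    if status in (301, 302, 307, 308):
        return f"{path}: {status} redirect to {location}"
    if status == 200:
        return f"{path}: 200 served directly"
    return f"{path}: {status} (not available or blocked)"


def check_store(host: str, fetch=probe):
    """Probe each path; `fetch` is injectable so the logic can be tested offline."""
    return [describe(path, *fetch(host, path)) for path in PATHS]
```

Calling `check_store("yourstore.myshopify.com")` with your own store hostname prints nothing by itself; it returns one summary line per path that you can log or print.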

Shopify didn't ship llms.txt because it helps them rank. They shipped it as infrastructure for a world where AI agents are the shopper.

This is the distinction the industry keeps missing. The question "will llms.txt help my SEO?" is the wrong question. The right question is: "am I ready for AI agents to transact on my site?" Those are completely different problems with completely different solutions.

Shopify's rollout is the clearest real-world signal that agentic commerce is being built now, at platform level, whether or not individual businesses are paying attention. That's worth implementing for. Not rankings.

The one case where it actually helps

There is a narrow, specific use case where llms.txt delivers genuine, measurable value. It's just not the one most people are implementing it for.

AI coding assistants.

Tools like Cursor, GitHub Copilot, and Claude retrieve documentation in real time when developers are writing code. When a developer asks "how do I authenticate with your API?", the AI assistant fetches your docs, parses them, and incorporates the answer into its suggestion.

A well-structured llms.txt helps that process. It reduces token waste by pointing the AI to the right pages immediately, rather than having it crawl through your entire documentation tree. The improvement is measurable: faster responses, more accurate answers, fewer hallucinated API calls.

Anthropic publishes a comprehensive llms.txt and llms-full.txt for exactly this reason. So do Cloudflare, Stripe, Vercel, and Supabase. These are developer-facing products where AI-assisted code completion is a primary use case. The file earns its keep.

For a standard business website (a law firm, a retail brand, a B2B SaaS that doesn't serve developer tooling), this use case doesn't apply. The value is real, but it's narrow.

The risk no one mentions: llms-full.txt and duplicate content

This is the part that doesn't get talked about, and it should.

Several SEO plugins now auto-generate /llms-full.txt files, a complete Markdown mirror of every page on your site. The idea is to give AI systems a full briefing document. In practice, if this file is indexable (and it often is by default), you've just created a duplicate content issue at scale.

Google's crawler will find that file. It will attempt to index it. You now have a Markdown version of every article, every product page, and every service page competing with the HTML originals.

If you're implementing llms-full.txt, check three things:

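Is it blocked for search crawlers in your robots.txt? If not, add a rule under the relevant user-agent group: Disallow: /llms-full.txt. Note that the same rule under User-agent: * also blocks any AI crawler that honours robots.txt.
If you leave it crawlable, does it carry a noindex X-Robots-Tag in the HTTP response header? A crawler that is blocked in robots.txt never fetches the file, so it never sees that header; treat the two as alternatives rather than stacking them.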
Has your site been crawled since the file went live? Check Search Console for unexpected new indexation patterns.
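The first two checks can be scripted. The sketch below takes the text of your robots.txt and the response headers of /llms-full.txt (however you fetch them) and reports which state the file is in. It uses Python's standard urllib.robotparser, whose rule matching is simpler than Google's, so treat the result as an approximation; the function names and the three status strings are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """True if robots.txt lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def has_noindex(headers: dict) -> bool:
    """True if an X-Robots-Tag response header contains a noindex directive."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    return False


def exposure(robots_txt: str, headers: dict, url: str,
             user_agent: str = "Googlebot") -> str:
    """Classify how exposed `url` is to indexing by `user_agent`."""
    if not crawl_allowed(robots_txt, user_agent, url):
        # The crawler never fetches the file, so headers are irrelevant here.
        return "blocked from crawling"
    if has_noindex(headers):
        return "crawlable but noindex"
    return "crawlable and indexable"
```

Only the last result, "crawlable and indexable", is the duplicate-content exposure described above.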

This isn't a theoretical risk. It's an easy mistake to make with auto-generated implementations, and most of the advice circulating about llms.txt doesn't mention it.

The 82% gap, and what it actually tells us

As part of StudioHawk's AI Visibility Audit, we tested 83 of Australia's most-visited business websites across retail, finance, travel, and hospitality. The llms.txt adoption rate: 18%.

82% of those businesses, including Medibank, RACV, Telstra, and most major AU retailers, have no llms.txt file.

Here's the honest read on that number: most of them are fine. They're not losing AI citations because of a missing text file. They're losing citations (if they are) because of crawler access issues, thin structured data, weak entity signals, or content that AI systems can't easily extract and attribute.

The 82% gap is an adoption stat, not a performance gap. Don't read it as "82% of Australian businesses have an easy win available." Read it as "this space is moving fast and most businesses haven't made a deliberate choice either way yet."

We track llms.txt in our AI Visibility Audit benchmark but don't include it in individual scores. Not because it's irrelevant, but because it's not yet proven enough to penalise businesses for missing it, and it distracts from the signals that genuinely matter.

What actually moves the needle

Both Google's AI optimisation guide and Microsoft's AI search guide are explicit about this. The signals that earn AI citations and agent-readiness are the same signals that earn good traditional SEO outcomes. The foundation doesn't change.

Across StudioHawk's AI Visibility Audit, the highest-impact gaps we find, consistently, across industries, are:

AI crawler access. GPTBot, ClaudeBot, and PerplexityBot blocked in robots.txt is a full stop before any other optimisation matters. An llms.txt file does nothing if the bot that might read it is disallowed on the way in. We find these crawlers blocked, intentionally or by accident, on a significant share of audited sites.
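A quick way to test this is to run your robots.txt through a parser once per crawler. This is a minimal sketch using Python's standard urllib.robotparser; the function name blocked_ai_crawlers is hypothetical, and the default URL is a placeholder for whichever page you care about.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> list:
    """Return the AI crawlers that robots.txt disallows from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]
```

An empty list means none of the three is disallowed for that URL; anything else is worth a deliberate decision rather than an accident.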

Server-side content ratio. If your product descriptions, service pages, or key content only loads via JavaScript, AI systems reading raw HTML see an empty page. Technical SEO for AI starts here.
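One way to put a number on this is to compare the visible text in the raw HTML response with the visible text after JavaScript has run. That definition is an assumption made for this sketch, not a published formula, and obtaining the rendered HTML (for example from a headless browser) is outside its scope. The names visible_text and server_side_ratio are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping elements whose content is never displayed."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the human-visible text of an HTML document as one string."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(extractor.chunks)


def server_side_ratio(raw_html: str, rendered_html: str) -> float:
    """Share of the rendered page's visible text already present in the raw HTML."""
    rendered = len(visible_text(rendered_html))
    return len(visible_text(raw_html)) / rendered if rendered else 0.0
```

A ratio near 1.0 suggests the content is in the initial response; a ratio near 0.0 is the "empty page" case described above.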

Structured data coverage and accuracy. Organisation schema (the schema.org type is spelled Organization) with correct sameAs links, Article markup with a real author entity, LocalBusiness schema for location-based services. These are the signals that tell AI systems who you are and what you do. Missing or incorrect schema is a consistent finding across our benchmark.

Passage-level citability. AI cites passages, not pages. H2/H3 density and paragraph length determine whether your content can be extracted as a discrete, attributable answer. Walls of text and vague headings are invisible to the extraction layer.
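Heading density and paragraph length are easy to measure on a single page. The sketch below counts H2/H3 headings and words per paragraph in an HTML string. The 120-word threshold for a "long" paragraph is an arbitrary assumption for illustration, not a figure from any AI system, and passage_report is a hypothetical name.

```python
from html.parser import HTMLParser


class _PassageStats(HTMLParser):
    """Counts h2/h3 headings and the word count of each <p> element."""

    def __init__(self):
        super().__init__()
        self.headings = 0
        self.paragraph_words = []
        self._in_p = False
        self._words = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self.headings += 1
        elif tag == "p":
            self.close_paragraph()  # tolerate an unclosed previous <p>
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.close_paragraph()

    def handle_data(self, data):
        if self._in_p:
            self._words += len(data.split())

    def close_paragraph(self):
        if self._in_p:
            self.paragraph_words.append(self._words)
        self._in_p = False
        self._words = 0


def passage_report(html: str, max_words: int = 120) -> dict:
    """Summarise heading density and paragraph length for one page."""
    stats = _PassageStats()
    stats.feed(html)
    stats.close()
    stats.close_paragraph()  # flush a trailing unclosed paragraph, if any
    total_words = sum(stats.paragraph_words)
    return {
        "headings": stats.headings,
        "paragraphs": len(stats.paragraph_words),
        "words_per_heading": total_words / stats.headings if stats.headings else None,
        "long_paragraphs": sum(1 for n in stats.paragraph_words if n > max_words),
    }
```

A page with many long paragraphs and few headings is the "wall of text" case; the report only flags candidates for a human editor to review.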

Entity clarity. sameAs links from your Organisation schema pointing to Wikipedia, Wikidata, Crunchbase, and LinkedIn are the signals that tell AI systems you're a real, established entity. Microsoft's guide frames this as "earned trust": AI visibility goes to brands the model already knows exist.
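For reference, this is roughly what that markup looks like when generated. The sketch builds a JSON-LD script tag using the schema.org Organization type and its sameAs property; the function name organization_jsonld is hypothetical, and every URL you pass in should be a profile that genuinely belongs to your organisation.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list) -> str:
    """Return a <script> tag containing Organization JSON-LD with sameAs links."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",  # schema.org uses the American spelling
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"
```

For example, `organization_jsonld("Example Co", "https://example.com/", ["https://www.linkedin.com/company/example-co"])` uses placeholder values; the output goes in the page's HTML, ideally rendered server-side.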

Fix those five things before you worry about a text file at your domain root.

Should you implement llms.txt?

Yes. If you've got a spare afternoon, do it.

It's about 20 lines of Markdown. It costs nothing to maintain. It positions you correctly for the agentic web, even if that infrastructure isn't fully live yet. And if you're a Shopify merchant, you already have it: check yourstore.myshopify.com/llms.txt.

A minimal implementation for any business site:

# [Your Business Name]

> [One sentence: what you do and who you serve.]

## Key pages

- [Homepage](https://yourdomain.com/): What we do
- [Services](https://yourdomain.com/services/): Full service listing
- [Contact](https://yourdomain.com/contact/): Get in touch

## What we do

[2-3 sentences. Plain language. What you do, who you do it for.]

Upload it to yourdomain.com/llms.txt. Serve it as text/plain. That's it.
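If your page list already lives in a spreadsheet or CMS export, the file can be generated rather than hand-written. A minimal sketch that reproduces the template above; build_llms_txt and its parameters are hypothetical names, and pages is assumed to be a list of (title, url, note) tuples.

```python
def build_llms_txt(name: str, summary: str, pages: list, about: str) -> str:
    """Render a minimal llms.txt: title, summary, key pages, short description."""
    lines = [f"# {name}", "", f"> {summary}", "", "## Key pages", ""]
    for title, url, note in pages:
        lines.append(f"- [{title}]({url}): {note}")
    lines += ["", "## What we do", "", about, ""]
    return "\n".join(lines)
```

Write the returned string to a file named llms.txt with UTF-8 encoding and upload it as described above.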

What you shouldn't do: treat it as a citation lever, prioritise it over structured data and crawler access, or auto-generate an indexable llms-full.txt without checking for duplicate content exposure.

It's a cheap infrastructure bet on a future that's arriving. Make the bet. Then go fix your robots.txt.

FAQ

Does llms.txt help you rank in AI search?

No. Three independent studies (Semrush's crawl test, SE Ranking's 300,000-domain analysis, and OtterlyAI's 90-day log audit) found no evidence that llms.txt improves AI citation frequency: the crawl studies show AI bots rarely request the file, and the correlation study found no link to citations. Google confirmed this at Search Central Live Toronto in April 2026: "There is no benefit to creating a llms.txt file for SEO."

Why did Shopify add llms.txt to every store?

Shopify added llms.txt as part of an agentic commerce infrastructure rollout in May 2026. The file redirects to /agents.md, which points AI agents to the store's product catalogue, search API, and checkout flow. The /.well-known/ucp endpoint that ships alongside it allows AI agents to add items to carts and complete purchases without human navigation. It's a transaction protocol, not an SEO tactic.

Can llms.txt cause duplicate content problems?

The llms.txt file itself (a short index document) won't cause issues. The companion llms-full.txt, a Markdown mirror of your full site content, can create duplicate content at scale if it's indexable. If you're generating one, either block search crawlers from it in robots.txt or serve it with a noindex X-Robots-Tag in the HTTP response header. Pick one: a crawler blocked by robots.txt never fetches the file, so it never sees the header.

Who should definitely have a llms.txt?

Developer documentation sites and SaaS products with technical docs get the most value. AI coding assistants like Cursor and GitHub Copilot retrieve documentation in real time, and a well-structured llms.txt reduces token waste and improves the accuracy of their suggestions. For standard business websites, it's a low-cost bet on future agent infrastructure rather than a current performance lever.

What should I fix before llms.txt?

In order of impact: AI crawler access in robots.txt, server-side content ratio (JavaScript-dependent content), structured data coverage and accuracy, passage-level citability (H2/H3 density and paragraph length), and entity clarity (sameAs links in Organisation schema). These five signals consistently produce the highest-impact findings in our AI visibility audits.
