llms.txt is a markdown file placed at a website's root (/llms.txt) that provides AI crawlers and large language models with a curated, structured guide to the site's most important canonical content.
llms.txt in plain English
llms.txt is a plain-text convention that lives at yourdomain.com/llms.txt and tells AI systems which pages on a site represent the authoritative version of key information. For an ecommerce store, that means linking to category pages, sizing guides, shipping policies, brand story, and flagship product collections in a single markdown file, so that an LLM answering 'what does [brand] sell?' pulls from the curated list instead of guessing from scattered crawled pages.
The file uses standard markdown: an H1 with the site name, a blockquote summary, then sections of linked URLs with short descriptions. AI crawlers fetch /llms.txt the same way they fetch /robots.txt or /sitemap.xml, parse the markdown structure, and treat the linked pages as priority context. Unlike robots.txt, which restricts access, llms.txt actively recommends content. Unlike sitemap.xml, which lists every URL for indexing, llms.txt selects only the canonical handful that summarize the site.
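To make that layout concrete, here is a minimal Python sketch that renders an llms.txt from a small data structure. It assumes the conventions of the public llms.txt proposal, in which each section is an H2 heading and each link is a markdown list item with the description after a colon. The store name, URLs, and the helper names (Link, Section, render_llms_txt) are hypothetical and only illustrate the shape of the file.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    description: str


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site_name: str, summary: str, sections: list[Section]) -> str:
    """Render an H1, a blockquote summary, then sections of described links."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for section in sections:
        # Assumption: sections are H2 headings, as in the public llms.txt proposal.
        lines.append(f"## {section.heading}")
        lines.append("")
        for link in section.links:
            lines.append(f"- [{link.title}]({link.url}): {link.description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical store content, used only to show the output shape.
example = render_llms_txt(
    "Example Store",
    "Running shoes and apparel for everyday runners.",
    [
        Section("Top collections", [
            Link("Road running shoes", "https://example.com/collections/road",
                 "Flagship road shoe collection with fit notes."),
        ]),
        Section("Policies", [
            Link("Shipping policy", "https://example.com/policies/shipping",
                 "Regions served, delivery times, and costs."),
        ]),
    ],
)
print(example)
```

The printed text can be saved as llms.txt at the site root. Keeping the content in a structure like this, rather than hand-editing the file, makes it easier to regenerate when collections or policies change.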
A well-built llms.txt is short, hierarchical, and points to clean content pages: collection pages with real descriptions, policy pages written in full sentences, an About page that names the brand and its categories. A poorly built one dumps hundreds of product URLs, links to thin pages, or contradicts the actual site structure. The first gives AI engines a clear map; the second produces hallucinated answers or no citation at all.
For most ecommerce sites, llms.txt should stay under 50 curated links across 4-8 sections: brand overview, top collections, buying guides, policies, and contact. Anything longer dilutes the signal the file exists to send.
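The size guideline above is easy to check automatically. The following Python sketch counts markdown link items and H2 sections in an llms.txt and reports when the file falls outside the suggested limits. The function name check_llms_txt and its parameters are hypothetical, and the regular expressions assume one link per list item and H2 section headings, so treat it as a rough lint rather than a full markdown parser.

```python
from __future__ import annotations

import re

# Assumption: each curated link is a list item of the form "- [title](url)".
LINK_LINE = re.compile(r"^[ \t]*[-*][ \t]+\[[^\]]+\]\([^)\s]+\)", re.MULTILINE)
# Assumption: each section starts with an H2 heading ("## Heading").
SECTION_LINE = re.compile(r"^##[ \t]+\S", re.MULTILINE)


def check_llms_txt(text: str, link_limit: int = 50,
                   min_sections: int = 4, max_sections: int = 8) -> list[str]:
    """Return a list of problems; an empty list means the file is within limits."""
    link_count = len(LINK_LINE.findall(text))
    section_count = len(SECTION_LINE.findall(text))
    problems = []
    if link_count >= link_limit:
        problems.append(f"{link_count} links; keep it under {link_limit}")
    if not min_sections <= section_count <= max_sections:
        problems.append(
            f"{section_count} sections; aim for {min_sections}-{max_sections}"
        )
    return problems


if __name__ == "__main__":
    # Assumes an llms.txt file exists in the current directory.
    with open("llms.txt", encoding="utf-8") as handle:
        for problem in check_llms_txt(handle.read()):
            print(problem)
```

A check like this could run in a deploy step so that an llms.txt that has grown into a product dump is caught before it is published.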
Why llms.txt matters for ecommerce
Ecommerce buyers increasingly ask ChatGPT, Perplexity, and Google AI Overviews questions like 'best running shoes for flat feet under $150' or 'does [brand] ship to Canada'. When an AI answers, it pulls from sources it can parse confidently. Stores with a clean llms.txt feed the model the exact product collections, policies, and brand positioning to cite, which means correct product recommendations, accurate shipping claims, and brand mentions in generated answers. Stores without one rely on the model crawling JavaScript-heavy PDPs and stitching together fragments, which produces wrong prices, outdated inventory claims, or worse, no mention at all while competitors get cited.