robots.txt for Shopify Stores

How Shopify Handles robots.txt Differently From Other Platforms

Shopify generates its robots.txt file automatically at yourdomain.com/robots.txt. Unlike WordPress or custom-built storefronts, where you can overwrite the file directly on a server, Shopify controls the base template. Before the 2.0 theme architecture, store owners had no way to edit this file: crawl rules were dictated entirely by Shopify's defaults.

Since the Online Store 2.0 update, Shopify exposes a robots.txt.liquid template inside the theme code editor. Merchants can add custom rules, disallow additional paths, or insert sitemap references, but only by editing this Liquid file. The template renders Shopify's default rule groups through a Liquid loop over robots.default_groups, and custom rules are written inside or after that loop.

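This architecture means Shopify already disallows paths like /admin, /cart, /orders, /checkout, and /account by default. These are sensible crawl guards. The practical challenge is that robots.txt.liquid offers no simple switch for those built-in directives: removing or changing one means filtering it out in Liquid, an edit Shopify treats as unsupported and that can cut off search traffic if done wrong. In practice, most stores should leave the defaults in place and only add new rules.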

Shopify's Default robots.txt Rules and What They Block

Shopify's auto-generated robots.txt disallows several URL groups that should never be indexed: /admin, /cart, /checkout, /orders, /cgi-bin, /account, and internal search result pages like /search. These defaults protect crawl budget from being wasted on non-indexable, login-gated, or near-duplicate pages.

The default file also includes a Sitemap directive pointing to your XML sitemap, which Shopify generates automatically at /sitemap.xml. This matters because search engine crawlers reading robots.txt will discover the sitemap without relying solely on Google Search Console submission.

One common gap in Shopify's defaults is the /collections path combined with filter parameters. Faceted navigation URLs (for example, /collections/shoes?filter.v.option.color=Black or the tag-filtered /collections/shoes/mens) can generate thousands of near-duplicate URLs. Recent versions of Shopify's default file include wildcard rules for sort_by URLs such as /collections/shoes?sort_by=price-ascending, but filter and tag variants are not fully covered, so high-SKU catalogs on Shopify often leak significant crawl budget unless store owners add explicit Disallow rules for parameterized collection URLs. Check your live /robots.txt first to see which patterns are already blocked.

Editing robots.txt.liquid in the Shopify Theme Editor

To edit Shopify's robots.txt, navigate to Online Store → Themes → Edit Code, then open the templates folder and locate robots.txt.liquid. If the file does not exist, create it by adding a new template of type robots.txt. Once created, this file controls the entire robots.txt output, so the first step is always to confirm it still contains the Liquid loop over robots.default_groups that renders Shopify's default rules, as documented in Shopify's developer reference. Omitting that loop drops the default crawl protections.
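As a reference point, the sketch below follows the default template pattern shown in Shopify's developer documentation. The robots, group, and rule objects are Shopify's own; the exact whitespace handling is worth verifying against your rendered /robots.txt rather than taking on trust.

```liquid
{%- comment -%} Render every default group: user agent, rules, then sitemap. {%- endcomment -%}
{% for group in robots.default_groups %}
  {{- group.user_agent }}

  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}

  {%- if group.sitemap != blank -%}
    {{ group.sitemap }}
  {%- endif -%}
{% endfor %}
```

Custom directives go either inside this loop, to extend an existing group, or after it, to add a new group.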

Inside robots.txt.liquid, standard robots.txt syntax applies. You can add User-agent blocks targeting specific bots: for example, disallowing AI training crawlers like GPTBot or CCBot by adding a User-agent: GPTBot block with Disallow: /. You can also add Sitemap references for supplemental sitemaps if third-party apps generate their own.
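A minimal sketch of such additions, written as plain directives placed after the Liquid loop that renders Shopify's defaults. The sitemap URL is a hypothetical placeholder, not a path any particular app is known to publish.

```liquid
{% comment %} Place below the loop that renders Shopify's default groups. {% endcomment %}

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

{% comment %} Hypothetical supplemental sitemap; replace with the URL your app actually publishes. {% endcomment %}
Sitemap: https://yourdomain.com/apps/example-app/sitemap.xml
```

Each bot gets its own group here so that removing one later does not affect the other.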

One practical limit: Shopify does not support server-side redirects from within robots.txt.liquid, and the file is served as plain text regardless of the Liquid code used. Liquid is evaluated on Shopify's servers before the response is sent, so crawlers only ever see the rendered directives; conditional logic matters only through the lines it outputs.

Shopify Apps and Third-Party Tools for robots.txt Management

Several Shopify SEO apps surface robots.txt editing through a UI rather than requiring direct Liquid editing. Apps in the Shopify App Store, including dedicated SEO suites, provide a text editor for robots.txt rules with validation, which reduces the risk of syntax errors that would silently break crawl directives. These apps write back to robots.txt.liquid under the hood.

For stores using headless Shopify architecture with a custom storefront on a separate domain or subdomain, robots.txt management moves outside Shopify entirely. The headless front-end framework (Next.js, Remix, or Hydrogen) controls its own robots.txt, and the Shopify admin URL typically gets its own separate file. Operators running Hydrogen storefronts configure robots.txt inside the Hydrogen project's public directory or via the framework's built-in route handling.

Google Search Console remains the authoritative diagnostic tool for verifying what Googlebot can and cannot access on a Shopify store. After editing robots.txt.liquid, use Search Console's robots.txt report to confirm Google has fetched and parsed the new file, and the URL Inspection tool to check whether specific URLs are blocked, before assuming crawl behavior has changed. (Google retired the standalone robots.txt Tester in 2023.)

Critical robots.txt Mistakes Specific to Shopify Stores

The most damaging mistake on Shopify is accidentally disallowing /collections or /products in robots.txt.liquid. Because all product and collection pages share these path prefixes, a single Disallow: /products or Disallow: /collections directive blocks every product page from being crawled. Stores have lost organic traffic from this error after routine theme edits.

A second Shopify-specific pitfall involves duplicate content from international markets. Shopify Markets appends locale prefixes like /en-us/ or /fr-fr/ to URLs. If robots.txt blocks these paths, or fails to include hreflang-aware sitemap entries alongside proper crawl access, international stores fragment their crawlability and indexation across markets.

Parameterized collection URLs created by Shopify's native filtering system (built on URL parameters or metafield-based filtering) are the most common source of crawl budget waste on large Shopify catalogs. Adding Disallow rules for the most common filter parameter patterns (after checking your actual filter URL structure in the browser) is the highest-leverage robots.txt customization available to Shopify operators running 10,000+ SKU catalogs.
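One way to add such a rule, sketched on top of the documented default loop: the extra directive is emitted only for the catch-all User-agent: * group. The /collections/*filter.* pattern is an assumption based on filter parameters that begin with "filter." (for example filter.v.price.gte); confirm it against your own filter URLs and against what your live /robots.txt already blocks.

```liquid
{% for group in robots.default_groups %}
  {{- group.user_agent }}

  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}

  {%- comment -%} Extra rule for the catch-all group only; the pattern is an assumption. {%- endcomment -%}
  {%- if group.user_agent.value == '*' -%}
    {{ 'Disallow: /collections/*filter.*' }}
  {%- endif -%}

  {%- if group.sitemap != blank -%}
    {{ group.sitemap }}
  {%- endif -%}
{% endfor %}
```

After saving, load yourdomain.com/robots.txt and check that the new directive appears on its own line inside the User-agent: * group.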

Actionable robots.txt Priorities for Shopify Store Operators

Start by verifying your current robots.txt at yourdomain.com/robots.txt and cross-referencing it against the Page indexing report in Google Search Console to identify which crawled-but-not-indexed URLs are consuming budget. For most Shopify stores, collection filter parameters and internal search variants are the primary offenders.

Open robots.txt.liquid in the theme editor, confirm the Shopify default rules render correctly, then add targeted Disallow rules for filter parameter patterns, any staging or preview URL paths, and any third-party app directories that generate non-indexable pages. Test each change with Search Console's URL Inspection tool before assuming it propagates.

Revisit robots.txt after every major theme update, after installing new apps that generate storefront URLs, and after enabling Shopify Markets or new sales channels. Because robots.txt.liquid lives in the theme, switching or replacing a theme can drop your custom rules. Platform updates occasionally modify the default output, and app installations can add new URL patterns that bypass existing disallow rules.

Frequently asked questions

Can you fully customize robots.txt on Shopify?

Shopify allows customization through the robots.txt.liquid template file in Online Store 2.0 themes. You can add new Disallow rules, User-agent blocks, and Sitemap directives. Shopify's built-in directives render from the template's default Liquid loop; removing one means filtering it out in Liquid, which Shopify treats as a risky, unsupported edit. For most stores the practical customization window is additive, not a full replacement of the file.

What does Shopify block by default in robots.txt?

Shopify's default robots.txt blocks /admin, /cart, /checkout, /orders, /account, /cgi-bin, and internal search result pages. It also includes a Sitemap directive pointing to the auto-generated sitemap.xml. These defaults are sensible but do not fully cover parameterized collection URLs created by faceted navigation, which is a gap operators must address manually.

How do you edit robots.txt on Shopify?

Go to Online Store → Themes → Edit Code and open or create the robots.txt.liquid file inside the templates folder. Write standard robots.txt syntax directly in this file, keeping the Liquid code that renders Shopify's default rules intact. Save the file and verify changes at yourdomain.com/robots.txt. Use Google Search Console's robots.txt report to confirm Google has fetched the new file and parsed it without errors.

Does editing robots.txt affect Shopify's SEO performance?

Editing robots.txt affects crawl budget allocation, not direct rankings. Blocking high-volume, low-value URLs (such as sort and filter parameter variants) frees Googlebot to crawl product and collection pages more efficiently. Incorrectly disallowing /products or /collections stops those pages from being crawled, and their search visibility drops as a result. The impact on a large Shopify catalog can be significant in both directions.

Does robots.txt on Shopify apply to AI crawlers like GPTBot?

Yes. Robots.txt applies to any crawler that respects the standard, including GPTBot (OpenAI), CCBot (Common Crawl), and similar AI training bots. Adding User-agent: GPTBot with Disallow: / inside robots.txt.liquid signals those crawlers to avoid the store's content. Compliance is voluntary: robots.txt is not an access control mechanism and cannot technically block determined crawlers.

Written by

Matt is the founder of RunOctopus. He built All Angles Creatures from zero to page-1 rankings in reptile feeder insects in under 60 days using exactly this method, turning a hard, entrenched niche into RunOctopus's proof store for programmatic SEO and AI search citation.
