How Shopify Handles robots.txt Differently From Other Platforms
Shopify generates its robots.txt file automatically at yourdomain.com/robots.txt. Unlike WordPress or custom-built storefronts, where you can overwrite the file directly on a server, Shopify controls the base template. Before the 2.0 theme architecture, store owners had no way to edit this file: crawl rules were dictated entirely by Shopify's defaults.
Since the Online Store 2.0 update, Shopify exposes a robots.txt.liquid template inside the theme code editor. Merchants can add custom rules, disallow additional paths, or insert sitemap references, but only by editing this Liquid file. The underlying default rules generated by Shopify still load first, and custom rules are appended beneath them.
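This architecture means Shopify already disallows paths like /admin, /cart, /orders, /checkout, and /account by default. These are sensible crawl guards. The practical challenge is that changing those built-in directives is far less direct than adding new ones: there is no settings toggle, and Shopify's documentation describes removing or altering a default rule only through Liquid conditions inside robots.txt.liquid. Most merchants are better off leaving the defaults in place and adding rules beside them.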
Shopify's Default robots.txt Rules and What They Block
Shopify's auto-generated robots.txt disallows several URL groups that search engines have no reason to crawl: /admin, /cart, /checkout, /orders, /cgi-bin, /account, and internal search result pages like /search. These defaults protect crawl budget from being wasted on non-indexable, login-gated, or near-duplicate pages.
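As a rough illustration of how these prefix rules behave, the sketch below feeds a hand-written excerpt into Python's standard urllib.robotparser and lists which sample paths a generic crawler would be refused. The excerpt contains only the prefixes named above and is not a copy of Shopify's live file; the sample paths and the helper name blocked_paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Illustrative excerpt only: the prefixes named in the text, not Shopify's full file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /orders
Disallow: /cgi-bin
Disallow: /account
Disallow: /search
"""


def blocked_paths(robots_txt, paths, agent="*"):
    """Return the paths that the given user agent may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


if __name__ == "__main__":
    # Hypothetical storefront paths used only for demonstration.
    candidates = [
        "/admin",
        "/cart",
        "/search?q=boots",
        "/products/leather-boot",
        "/collections/shoes",
    ]
    print(blocked_paths(SAMPLE_ROBOTS, candidates))
```

Running the same check against the text of your own live robots.txt is a quick way to confirm that product and collection pages remain crawlable.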
The default file also includes a Sitemap directive pointing to your XML sitemap, which Shopify generates automatically at /sitemap.xml. This matters because search engine crawlers reading robots.txt will discover the sitemap without relying solely on Google Search Console submission.
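To see which sitemaps a crawler would discover from the file alone, a small sketch along these lines can help. It relies on RobotFileParser.site_maps(), available in Python 3.8 and later; the example.com URL and the helper name sitemap_urls are placeholders.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_txt):
    """Return every Sitemap URL declared in the robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when the file declares no sitemap.
    return parser.site_maps() or []


if __name__ == "__main__":
    # Placeholder content; substitute the text of your own robots.txt.
    sample = (
        "User-agent: *\n"
        "Disallow: /admin\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(sitemap_urls(sample))
```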
One common gap in Shopify's defaults is the /collections path combined with sort and filter parameters. Faceted navigation URLs (for example, /collections/shoes?sort_by=price-ascending or /collections/shoes/mens) can generate thousands of near-duplicate URLs. Shopify's defaults do not reliably cover every such pattern, and the coverage has changed over time, so check the live file before assuming a pattern is blocked. High-SKU catalogs on Shopify often leak significant crawl budget unless store owners add explicit Disallow rules for the parameterized collection URLs the defaults miss.
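Before writing any rules, it helps to know which parameters actually appear on collection URLs. The following sketch tallies query parameter names across a list of crawled URLs, such as an export from a crawler or server logs. The URLs and the filter.v.option.color parameter name are illustrative assumptions; your theme or filtering app may use different names.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def facet_parameter_counts(urls):
    """Count query parameter names seen on /collections/ URLs."""
    counts = Counter()
    for url in urls:
        parts = urlsplit(url)
        if "/collections/" not in parts.path:
            continue  # only collection pages are of interest here
        for name, _value in parse_qsl(parts.query, keep_blank_values=True):
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; replace with URLs from your own crawl data.
    sample_urls = [
        "/collections/shoes",
        "/collections/shoes?sort_by=price-ascending",
        "/collections/shoes?filter.v.option.color=Black&sort_by=best-selling",
        "/products/leather-boot?variant=1",
    ]
    for name, count in facet_parameter_counts(sample_urls).most_common():
        print(name, count)
```

The most frequent parameter names are the natural candidates for Disallow patterns.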
Editing robots.txt.liquid in the Shopify Theme Editor
To edit Shopify's robots.txt, navigate to Online Store > Themes > Edit Code, then open the templates folder and locate robots.txt.liquid. If the file does not exist, create it by adding a new template of type robots.txt. Once created, this template controls the entire output, so the first step is always to confirm that it still prints the default Shopify-generated rules. In Shopify's documented template, that is the Liquid loop over robots.default_groups rather than a render tag; deleting or breaking that loop removes the default crawl protections.
Inside robots.txt.liquid, standard robots.txt syntax applies. You can add User-agent blocks targeting specific bots. For example, you can disallow AI training crawlers like GPTBot or CCBot by adding a User-agent: GPTBot block with Disallow: /. You can also add sitemap references for supplemental sitemaps if using third-party apps that generate their own.
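A minimal sketch of what such blocks look like, and how to sanity-check them, is shown below. It builds one group per bot name, then uses urllib.robotparser to confirm that the named bots are refused while other crawlers are unaffected. The helper names and the /products/leather-boot path are hypothetical; the generated text is what you would paste beneath the default rules.

```python
from urllib.robotparser import RobotFileParser


def block_agents(agents):
    """Build one robots.txt group per user agent, each disallowing the whole site."""
    groups = ["User-agent: {}\nDisallow: /".format(agent) for agent in agents]
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt, agent, path):
    """Report whether the agent may fetch the path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Stand-in for the default rules, followed by the custom bot blocks.
    rules = "User-agent: *\nDisallow: /admin\n\n" + block_agents(["GPTBot", "CCBot"])
    print(rules)
    print(is_allowed(rules, "GPTBot", "/products/leather-boot"))
    print(is_allowed(rules, "Googlebot", "/products/leather-boot"))
```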
One practical limit: Shopify does not support server-side redirects from within robots.txt.liquid. Liquid is rendered on Shopify's servers before the file is served, so crawlers only ever see the resulting plain text. Conditional logic can change which lines are emitted, but it cannot add behavior beyond standard robots.txt syntax.
Shopify Apps and Third-Party Tools for robots.txt Management
Several Shopify SEO apps surface robots.txt editing through a UI rather than requiring direct Liquid editing. Apps in the Shopify App Store, including dedicated SEO suites, provide a text editor for robots.txt rules with validation, which reduces the risk of syntax errors that would silently break crawl directives. These apps typically write back to robots.txt.liquid under the hood.
For stores using headless Shopify architecture with a custom storefront on a separate domain or subdomain, robots.txt management moves outside Shopify entirely. The headless front-end framework (Next.js, Remix, Hydrogen) controls its own robots.txt, and the Shopify admin URL typically gets its own separate file. Operators running Hydrogen storefronts configure robots.txt inside the Hydrogen project's public directory or via the framework's built-in route handling.
Google Search Console remains the authoritative diagnostic tool for verifying what Googlebot can and cannot access on a Shopify store. After editing robots.txt.liquid, use Search Console's robots.txt report to confirm Google has fetched the updated file, and the URL Inspection tool to confirm rules apply correctly to specific URL patterns, before assuming crawl behavior has changed. The older standalone robots.txt Tester has been retired.
Critical robots.txt Mistakes Specific to Shopify Stores
The most damaging mistake on Shopify is accidentally disallowing /collections or /products in robots.txt.liquid. Because all product and collection pages share these path prefixes, a single Disallow: /products or Disallow: /collections directive blocks every product page from being crawled. Stores have lost organic traffic from this error after routine theme edits.
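A simple guard against this mistake is to scan the rendered file for Disallow values that cover an entire critical prefix. The sketch below is one way to do that; the list of critical prefixes is an assumption to adapt to your store, and the check deliberately ignores narrower patterns such as /collections/*sort_by*.

```python
# Assumed list of prefixes a storefront should never block wholesale.
CRITICAL_PREFIXES = ("/products", "/collections", "/pages", "/blogs")


def risky_disallows(robots_txt):
    """Return Disallow values that block an entire critical path prefix."""
    risky = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "disallow":
            continue
        path = value.strip()
        # "/products", "/products/" and "/products/*" all block every product page.
        if path.rstrip("/*") in CRITICAL_PREFIXES:
            risky.append(path)
    return risky


if __name__ == "__main__":
    # Hypothetical rendered output after a careless theme edit.
    sample = (
        "User-agent: *\n"
        "Disallow: /admin\n"
        "Disallow: /products\n"
        "Disallow: /collections/*sort_by*\n"
    )
    print(risky_disallows(sample))
```

Running a check like this after every theme edit takes seconds and catches the error before crawlers do.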
A second Shopify-specific pitfall involves duplicate content from international markets. Shopify Markets appends locale prefixes like /en-us/ or /fr-fr/ to URLs. If robots.txt blocks these paths, or if the pages are crawlable but the sitemaps lack hreflang-aware entries for them, international stores fragment their crawlability and indexation across markets.
Parameterized collection URLs created by Shopify's native filtering system (built on URL parameters or metafield-based filtering) are the most common source of crawl budget waste on large Shopify catalogs. Adding Disallow rules for the most common filter parameter patterns, after first checking your actual filter URL structure in the browser, is the highest-leverage robots.txt customization available to Shopify operators running 10,000+ SKU catalogs.
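Filter rules usually need the * wildcard, which Python's standard urllib.robotparser does not expand, so a candidate pattern is easier to test with a small matcher of its own. The sketch below translates a robots.txt path pattern into a regular expression, supporting * and a trailing $ in the style major search engines document. It is a simplification: it checks Disallow patterns only and ignores Allow rules and longest-match precedence. The pattern and URLs are examples, not rules to copy verbatim.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, trailing '$') to a regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_blocked(path, disallow_patterns):
    """Report whether any Disallow pattern matches the path from its start."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)


if __name__ == "__main__":
    # Example pattern; verify against your store's real filter URLs first.
    patterns = ["/collections/*sort_by*"]
    for path in (
        "/collections/shoes?sort_by=price-ascending",
        "/collections/shoes",
        "/products/leather-boot",
    ):
        print(path, is_blocked(path, patterns))
```

Testing each candidate pattern against both filter URLs and clean collection URLs helps confirm that the rule blocks the former without touching the latter.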
Actionable robots.txt Priorities for Shopify Store Operators
Start by verifying your current robots.txt at yourdomain.com/robots.txt and cross-referencing it against the Page indexing report in Google Search Console to identify which URLs in the "Crawled - currently not indexed" group are consuming budget. For most Shopify stores, collection filter parameters and internal search variants are the primary offenders.
Open robots.txt.liquid in the theme editor, confirm the Shopify default rules render correctly, then add targeted Disallow rules for filter parameter patterns, any staging or preview URL paths, and any third-party app directories that generate non-indexable pages. Test each change with Search Console's URL Inspection tool before assuming it propagates.
Revisit robots.txt after every major theme update, after installing new apps that generate storefront URLs, and after enabling Shopify Markets or new sales channels. Platform updates occasionally reset or modify robots.txt.liquid, and app installations can add new URL patterns that bypass existing disallow rules.
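One low-effort way to make these reviews routine is to keep a saved copy of the rendered file and compare it with the live version after each change. The sketch below uses Python's difflib to list added and removed lines between two snapshots; the snapshot contents and the /apps/reviews path are invented for the example, and how you fetch and store snapshots is left open.

```python
import difflib


def robots_diff(before, after):
    """Return removed ('-') and added ('+') lines between two robots.txt snapshots."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    # Skip the file headers and hunk markers; keep only the changed lines.
    return [line for line in diff if not line.startswith(("---", "+++", "@@"))]


if __name__ == "__main__":
    # Hypothetical snapshots taken before and after an app installation.
    old = "User-agent: *\nDisallow: /admin\nDisallow: /cart\n"
    new = "User-agent: *\nDisallow: /admin\nDisallow: /apps/reviews\n"
    for change in robots_diff(old, new):
        print(change)
```

An empty result means the rendered file is unchanged; anything else is worth reading line by line before the next crawl.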