How Crawl Errors Behave Differently on Shopify
Shopify's hosted infrastructure creates crawl error patterns that differ from self-hosted platforms. Because Shopify manages the server layer, store owners cannot edit the web server configuration, adjust crawl rate at the server level, or implement custom redirect rules via .htaccess. Every URL decision flows through Shopify's Liquid templating system, the admin's URL redirect tool, or third-party apps โ and each layer introduces its own failure modes.
The most distinctive Shopify crawl error is the duplicate-URL 404. Shopify automatically generates multiple URL paths for the same product: /products/[handle], /collections/[collection]/products/[handle], and occasionally /collections/all/products/[handle]. When a collection is deleted or a product is removed from a collection, the collection-scoped URL returns a 404 while the canonical /products/ URL remains live. Search engines crawling old sitemap entries or backlinks hit these dead ends persistently.
Shopify also enforces a canonical tag that points collection-scoped product URLs back to /products/[handle]. This means the SEO signal consolidates correctly when a page loads, but it does nothing for the 404 that appears before the page renders โ the crawl error is logged regardless of canonical intent.
The Most Common Shopify Crawl Errors and Their Sources
404 errors from deleted products are the highest-volume crawl error for most Shopify stores. When a product is deleted rather than hidden or set to 'Draft', its URL disappears entirely. Shopify does not auto-create a redirect. Any external link, internal link, or sitemap entry pointing to that URL generates a 404 that Google Search Console reports under the Coverage or Pages report.
Soft 404s appear when Shopify theme search pages or filtered collection pages return a 200 status with 'no results found' content. Google treats these as soft 404s because the page signals empty content despite the HTTP success code. This is common on stores using Shopify's default collection filtering or apps that generate parameterized URLs for faceted navigation.
Redirect chains are another Shopify-specific issue. Shopify's built-in URL redirect tool does not automatically detect chains โ if /old-url redirects to /mid-url, and /mid-url later gets another redirect to /new-url, the chain persists in the database. Crawlers waste crawl budget following multi-hop chains, and some terminate after a set number of hops, logging a crawl error on the intermediate URLs.
Pagination errors surface on stores using the older ?page=2 parameter style after migrating to Shopify's newer section-based infinite scroll themes. The old paginated URLs no longer resolve, leaving crawl errors from any indexed or linked page-2+ URLs.
Shopify-Specific Tools for Diagnosing Crawl Errors
Google Search Console remains the primary diagnostic tool. Connect it via the Shopify admin under Online Store > Preferences > Google Search Console. The Pages report under Indexing shows Not Indexed URLs broken down by reason โ 'Not found (404)', 'Soft 404', and 'Redirect error' are the three categories most relevant to Shopify stores. Export the full list and cross-reference it against the store's current product and collection handles.
Shopify's built-in URL redirects section (admin/redirects) shows existing redirects but provides no crawl data. For stores with large catalogs, the Bulk Redirects app and similar tools in the Shopify App Store allow CSV import of redirect rules โ essential when migrating from another platform or restructuring collections. Screaming Frog SEO Spider and Sitebulb both crawl Shopify stores accurately when given the sitemap URL at [storename].myshopify.com/sitemap.xml.
Shopify automatically generates a sitemap.xml that includes active products, collections, pages, and blog posts. It does not include URLs that have been deleted or set to Draft. If crawl errors appear for URLs that are in the sitemap, the issue is a rendering or redirect problem, not a sitemap problem. If errors appear for URLs absent from the sitemap, the source is external links, old sitemaps submitted to Search Console, or internal links not yet updated.
Limitations Shopify Store Owners Must Work Around
Shopify does not allow robots.txt editing in the traditional sense. From late 2021, Shopify gave merchants access to a robots.txt.liquid file through the theme editor, allowing Disallow rules and custom User-agent blocks to be added. However, this file cannot block individual URL patterns dynamically based on product metafields or inventory status โ it is a static template rendered at request time.
The platform enforces canonical tags automatically and does not allow them to be overridden per-URL without custom Liquid code in the theme. This is mostly protective, but it means that intentional duplicate structures โ such as region-specific storefronts on Shopify Markets using subfolders โ require careful Liquid theme customization to ensure canonical and hreflang tags are set correctly, or crawl errors accumulate across locales.
Shopify's URL structure is fixed. Products must live under /products/, collections under /collections/, and pages under /pages/. There is no way to flatten the URL hierarchy or place a product at the root level. Stores migrating from WooCommerce or custom-built sites often have indexed URLs at different paths, and every one of those must be redirected manually or via bulk redirect import โ Shopify does not perform automatic path matching.
Fixing Shopify Crawl Errors: A Prioritized Approach
Start with the 404 export from Google Search Console. Filter for URLs that were receiving organic traffic in the past six months โ this is visible in the Performance report when filtered by page. For each high-traffic 404, create a redirect in the Shopify admin pointing to the most relevant current product, collection, or page. For URLs with no traffic history, a redirect to the closest category page or the homepage is acceptable.
Address redirect chains by auditing the full redirects database. Export it as a CSV, load it into a spreadsheet, and use a lookup formula to identify any 'to' URL that also appears as a 'from' URL in another row. Flatten those chains so every old URL points directly to its final destination in a single hop. Re-import the cleaned CSV via the bulk redirect tool.
For soft 404s on filtered collection pages, the fix depends on the filtering app in use. Apps like Boost Commerce Filter & Search and SearchPie generate canonical tags that consolidate filtered URLs to the base collection page, which resolves most soft 404 signals. If the store uses Shopify's native predictive search without a filtering app, adding a noindex tag to empty-result pages via Liquid theme code prevents Search Console from reporting them.