
Crawl Budget

Quick definition

Crawl budget is the number of URLs a search engine bot will crawl on a website within a given timeframe, determined by the site's crawl capacity limit and crawl demand.

Crawl Budget in plain English

Crawl budget is the ceiling on how many pages Googlebot or another search engine crawler will fetch from a domain over a set period. For example, a Shopify store with 50,000 product, collection, and filter URLs may only have 8,000 pages crawled per day, meaning the full catalog takes nearly a week to refresh in the index.
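The refresh time in that example is plain division. A minimal sketch using the figures above; the function name is illustrative, and it optimistically assumes the crawler never fetches the same URL twice within the cycle:

```python
import math


def days_to_full_recrawl(total_urls: int, crawled_per_day: int) -> int:
    """Whole days needed to fetch every URL once at a steady daily crawl rate."""
    if crawled_per_day <= 0:
        raise ValueError("crawled_per_day must be positive")
    return math.ceil(total_urls / crawled_per_day)


# 50,000 URLs at 8,000 fetches per day is 6.25 days, so the cycle ends on day 7.
print(days_to_full_recrawl(50_000, 8_000))  # 7
```

In practice the cycle runs longer, because popular URLs are recrawled several times before long-tail URLs are fetched once.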

Crawl budget is set by two inputs: crawl capacity limit and crawl demand. Capacity is governed by server response time and error rates: fast, stable servers earn more simultaneous connections, while slow or 5xx-heavy responses force the crawler to back off. Demand is driven by URL popularity, freshness signals, and the size of the known URL set. The crawler pulls from a scheduling queue, prioritizing URLs with higher PageRank, recent updates, and inbound internal links.
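To make the queue idea concrete, here is a toy model of demand-based prioritization. It is not how Googlebot actually schedules crawls: the `Candidate` fields, the weights in `priority`, and the sample URLs are all assumptions chosen only to show how popularity, freshness, and internal links could combine into an ordering under a fixed budget.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    url: str
    popularity: float        # 0..1, hypothetical stand-in for link-based importance
    changed_recently: bool   # hypothetical freshness signal
    internal_inlinks: int    # count of internal links pointing at the URL


def priority(c: Candidate) -> float:
    # Arbitrary illustrative weights, not values published by any search engine.
    freshness = 5.0 if c.changed_recently else 0.0
    links = min(c.internal_inlinks, 20) * 0.25
    return c.popularity * 10 + freshness + links


def crawl_order(candidates, budget: int):
    """Return the URLs that fit in the budget, highest priority first."""
    return [c.url for c in heapq.nlargest(budget, candidates, key=priority)]


candidates = [
    Candidate("/products/widget", 0.8, True, 12),
    Candidate("/collections/all?color=red&size=m", 0.1, False, 1),
    Candidate("/blog/guide", 0.5, False, 4),
]
print(crawl_order(candidates, budget=2))
```

With a budget of two, the parameterized filter URL is the one left unfetched, which is the behavior a well-linked catalog is aiming for.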

Done well, crawl budget is concentrated on canonical, indexable, revenue-driving URLs: product pages, category pages, and editorial content. Logs show Googlebot hitting these URLs within hours of publish or price change. Done poorly, the budget is burned on faceted navigation parameters, internal search results, session IDs, paginated duplicates, and soft-404 product variants, leaving real products stale in the index for weeks.
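One way to see where the budget goes is to bucket Googlebot requests from the server access log. The sketch below assumes the combined log format and uses hypothetical parameter names (`color`, `size`, `sessionid`, and so on) and a hypothetical `/search` path; adjust them to the store's real URL patterns. It cannot detect soft 404s, which need page content rather than log data.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Combined log format: request line, status, bytes, referrer, then user agent last.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)

FACET_PARAMS = {"color", "size", "sort_by", "filter"}   # hypothetical names
SESSION_PARAMS = {"sessionid", "sid"}                   # hypothetical names


def classify(path: str) -> str:
    """Assign a request path to a crawl-waste bucket or to 'clean'."""
    parts = urlsplit(path)
    params = {k.lower() for k in parse_qs(parts.query, keep_blank_values=True)}
    if parts.path.startswith("/search"):
        return "internal_search"
    if params & SESSION_PARAMS:
        return "session_id"
    if params & FACET_PARAMS:
        return "facet"
    if "page" in params:
        return "pagination"
    return "clean"


def googlebot_report(lines) -> Counter:
    """Count Googlebot requests per bucket.

    Matching on the user-agent string alone can be spoofed; a production
    audit should also verify the requesting IP address.
    """
    report = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if match and "Googlebot" in match.group("agent"):
            report[classify(match.group("path"))] += 1
    return report


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    f'66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /products/widget HTTP/1.1" 200 2326 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Oct/2024:13:55:37 +0000] "GET /collections/shoes?color=red&size=9 HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Oct/2024:13:55:38 +0000] "GET /search?q=boots HTTP/1.1" 200 4096 "-" "{GOOGLEBOT}"',
    '203.0.113.9 - - [10/Oct/2024:13:55:39 +0000] "GET /collections/shoes?color=blue HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
print(dict(googlebot_report(sample)))
```

A high share of `facet`, `internal_search`, or `session_id` hits relative to `clean` is the signal that budget is being burned on low-value URLs.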

Crawl budget becomes a material concern at roughly 10,000+ unique URLs, the threshold Google's own guidance gives for sites whose content changes rapidly. Below that threshold, most sites are crawled completely without intervention. Above it, which is typical for any catalog with faceted filters or large variant counts, crawl waste compounds quickly and requires active management through robots.txt, canonical tags, and internal linking discipline.
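As an illustration of the robots.txt side of that management, the sketch below renders a set of Disallow rules and checks sample URLs against them. The patterns are hypothetical, and the matcher is a simplified approximation of Google-style wildcard matching (`*` for any run of characters, a trailing `$` as an end anchor); it ignores Allow rules and rule precedence.

```python
import re

# Hypothetical rules for a store with internal search and filter parameters.
DISALLOW_PATTERNS = ["/search", "/*?*color=", "/*?*size=", "/*?*sort_by="]


def render_robots_txt(patterns=DISALLOW_PATTERNS) -> str:
    """Render the rules as a robots.txt group that applies to all crawlers."""
    lines = ["User-agent: *"] + [f"Disallow: {p}" for p in patterns]
    return "\n".join(lines) + "\n"


def pattern_to_regex(pattern: str) -> re.Pattern:
    # '*' matches any run of characters; a trailing '$' anchors the end;
    # without '$' the rule is a prefix match.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_blocked(path_and_query: str, patterns=DISALLOW_PATTERNS) -> bool:
    """True if any Disallow pattern matches the path plus query string."""
    return any(pattern_to_regex(p).match(path_and_query) for p in patterns)


print(render_robots_txt())
for url in ["/products/widget", "/collections/shoes?color=red", "/search?q=boots"]:
    print(url, is_blocked(url))
```

Blocking a URL in robots.txt stops it from being fetched, so the crawler never sees a canonical tag on that page. The two tools are usually applied to different URL sets rather than stacked on the same one.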

Why crawl budget matters for ecommerce

For ecommerce operators, crawl budget directly controls how fast price changes, new arrivals, restocks, and out-of-stock signals reach Google. A store with 200,000 URLs from color and size filter combinations can see Googlebot spend as much as 70% of its visits on parameter junk while flagship products go un-recrawled for a month, meaning seasonal launches miss the index window and discontinued SKUs keep ranking. Stores that prune URL bloat, block facets in robots.txt, and consolidate variants under canonical parents get faster indexation of new products, more accurate inventory status in SERPs, and quicker recovery after site migrations or replatforms.
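Consolidating variants under a canonical parent usually comes down to stripping the parameters that do not change which product or category the page represents. A minimal sketch, assuming a hypothetical parameter list and the placeholder domain `shop.example`; which parameters are safe to drop depends on the platform:

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that create variants or tracking duplicates.
NON_CANONICAL_PARAMS = {
    "color", "size", "sort_by", "sessionid",
    "utm_source", "utm_medium", "utm_campaign",
}


def canonical_url(url: str) -> str:
    """Drop variant and tracking parameters, keeping everything else in order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NON_CANONICAL_PARAMS
    ]
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Render the link element a variant page would carry in its head."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


print(canonical_tag("https://shop.example/products/tee?color=red&size=m"))
print(canonical_url("https://shop.example/collections/tees?page=2&sort_by=price"))
```

Every color and size combination of the tee then points at one parent URL, while pagination is left intact because it was not in the strip list.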

Deeper dives on this term

Focused pages that go deeper than the definition: comparisons, platform-specific guides, operational walkthroughs.

Compare

Crawl Budget vs Canonical URL: What's the Difference?

Crawl budget and canonical URL are both crawl-efficiency tools, but they work differently. See the direct comparison and when to u

Read →
Compare

Crawl Budget vs Internal Linking: What's the Difference?

Crawl budget and internal linking are related but distinct SEO levers. Learn exactly how they differ, overlap, and interact for ec

Read →
Compare

Crawl Budget vs robots.txt: What's the Difference?

Crawl budget and robots.txt both control how Googlebot visits your site, but they work differently. Learn the exact distinction a

Read →
Compare

Crawl Budget vs Sitemap.xml: What's the Difference?

Crawl budget and sitemap.xml are not the same thing. Learn how each works, where they overlap, and which one actually controls Goo

Read →
Compare

Crawl Budget vs Topical Authority: What's the Difference?

Crawl budget vs topical authority: clear definitions, mechanical differences, overlap points, and when each one actually determine

Read →
Platform

Crawl Budget for Shopify Stores

How crawl budget works specifically on Shopify stores: platform quirks, duplicate URL patterns, app impacts, and fixes for 6–8-fi

Read →
Platform

Crawl Budget for Wix Stores

How crawl budget works on Wix stores, including platform-specific limits, Wix SEO tools, and workarounds for duplicate URLs and Ja

Read →
Platform

Crawl Budget for WooCommerce Stores

How crawl budget works specifically on WooCommerce stores: platform quirks, URL bloat causes, plugins, and fixes for 6–8 figure o

Read →
How-to

How to implement crawl budget for an Ecommerce Store

A step-by-step operational guide to implementing crawl budget for ecommerce stores. Concrete actions to ensure Googlebot indexes y

Read →
Checklist

Crawl Budget Checklist: 12 Items Every Ecommerce Store Should Audit

A 12-item crawl budget audit checklist for ecommerce stores. Each check includes clear pass/fail criteria to help Googlebot index

Read →

Frequently asked questions

What is crawl budget in SEO?

Crawl budget is the number of URLs a search engine crawler will fetch from a site in a given timeframe. It is determined by two factors: crawl capacity limit (how much the server can handle) and crawl demand (how much the crawler wants to fetch based on URL popularity and freshness). Sites with more than 10,000 URLs need to manage it actively.

How many URLs trigger crawl budget concerns?

Google states that crawl budget becomes a practical concern for sites with more than 10,000 unique URLs, or for sites that generate URLs dynamically through parameters. Smaller sites are typically crawled in full without intervention. Large ecommerce catalogs with faceted navigation routinely exceed millions of crawlable URLs and require strict management.

Crawl budget vs index budget: what's the difference?

Crawl budget is how many URLs Googlebot fetches. Index budget is how many of those fetched URLs are actually stored in the search index. A page can be crawled and then dropped during indexing for being duplicate, thin, or low quality. Crawl precedes indexing, but the two are governed by separate systems and separate signals.

How do you optimize crawl budget for an ecommerce store?

Block faceted navigation parameters and internal search URLs in robots.txt, return clean 404 or 410 status codes for removed products, consolidate variants under canonical parent URLs, fix redirect chains, eliminate soft 404s, and submit accurate XML sitemaps segmented by content type. Keeping server response times low (under 200 ms is a common target) also helps raise the crawl capacity limit Google assigns.
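Segmenting sitemaps by content type can be sketched as grouping URLs by path and writing one file per group. The path prefixes (`/products/`, `/collections/`) and the `shop.example` domain below are assumptions for illustration; the XML namespace is the standard sitemap protocol one.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def segment(urls):
    """Group URLs by content type using hypothetical path prefixes."""
    groups = defaultdict(list)
    for url in urls:
        path = urlsplit(url).path
        if path.startswith("/products/"):
            groups["products"].append(url)
        elif path.startswith("/collections/"):
            groups["collections"].append(url)
        else:
            groups["pages"].append(url)
    return groups


def build_sitemap(urls) -> str:
    """Serialize one urlset document containing a loc entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


urls = [
    "https://shop.example/products/tee",
    "https://shop.example/collections/tees",
    "https://shop.example/blog/care-guide",
]
for name, group in segment(urls).items():
    # In a real job each document would be written to its own file,
    # for example sitemap-products.xml (a hypothetical file name).
    print(name, build_sitemap(group))
```

Separate files make it easier to compare submitted and indexed counts per content type, so a drop in product indexation is not hidden by a healthy blog section.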

Does crawl budget actually matter for most stores?

It matters for any ecommerce site above roughly 10,000 URLs or any site with faceted filters, search pages, or large variant counts. For a 500-SKU boutique, crawl budget is a non-issue. For a multi-category store generating filter combinations, ignoring crawl budget leads to stale prices in SERPs, delayed indexation of new products, and zombie URLs ranking for discontinued items.

Written by

Matt is the founder of RunOctopus. He built All Angles Creatures from zero to page-1 rankings in reptile feeder insects in under 60 days using exactly this method, turning a hard, entrenched niche into RunOctopus's proof store for programmatic SEO and AI search citation.
