
Sitemap.xml

Quick definition

Sitemap.xml is an XML file located at the root of a domain that lists every important URL on a site along with metadata like the last modified date, giving search engines a complete map of indexable pages.

Sitemap.xml in plain English

Sitemap.xml is a machine-readable inventory of a website's URLs, served at /sitemap.xml. For an ecommerce store, it lists product pages, category pages, blog posts, and other indexable URLs, each wrapped in a <url> tag with a <loc> for the address and a <lastmod> date showing when the page last changed.
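The <url>, <loc>, and <lastmod> structure described above can be sketched in a few lines of Python. This is a minimal illustration, not a production generator: the function name build_sitemap and the example.com URLs are assumptions made for the example, and the namespace is the one published by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML text for an iterable of (url, last_modified) pairs.

    build_sitemap is a hypothetical helper; last_modified is a datetime.date.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url, last_modified in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url  # special characters such as & are escaped on output
        ET.SubElement(node, "lastmod").text = last_modified.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example with made-up store URLs.
print(build_sitemap([
    ("https://example.com/products/blue-widget", date(2024, 5, 1)),
    ("https://example.com/collections/widgets", date(2024, 4, 18)),
]))
```

In a real store the (url, last_modified) pairs would come from the product and content database, with last_modified tied to actual content changes rather than the time the file was generated.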

Search engine crawlers fetch the sitemap to discover URLs and prioritize re-crawling. When a bot reads the file, it parses each <url> entry, compares the <lastmod> timestamp against its index, and queues changed or new URLs for crawling. The sitemap location is declared in robots.txt or submitted directly through Google Search Console and Bing Webmaster Tools. Large sites split URLs across multiple sitemap files referenced by a sitemap index, since a single file is capped at 50,000 URLs and 50MB uncompressed, whichever limit is reached first.
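The parse, compare, and queue steps can be illustrated with a small sketch. It is not how any particular search engine works internally: the function urls_to_queue and the known_lastmod dictionary (standing in for the crawler's index) are hypothetical, and dates are compared at day granularity for simplicity.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_to_queue(sitemap_xml, known_lastmod):
    """Return URLs that are new, or whose <lastmod> is later than the stored date.

    known_lastmod is a hypothetical stand-in for a crawler index:
    a dict mapping URL -> datetime.date of the last change already seen.
    """
    queue = []
    root = ET.fromstring(sitemap_xml)
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = entry.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not loc:
            continue
        if loc not in known_lastmod:
            queue.append(loc)  # never seen before
        elif lastmod and date.fromisoformat(lastmod[:10]) > known_lastmod[loc]:
            queue.append(loc)  # changed since it was last seen
    return queue


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/products/a</loc><lastmod>2024-05-02</lastmod></url>
  <url><loc>https://example.com/products/b</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/products/c</loc><lastmod>2024-05-03</lastmod></url>
</urlset>"""

index = {
    "https://example.com/products/a": date(2024, 5, 1),
    "https://example.com/products/b": date(2024, 5, 1),
}
print(urls_to_queue(SAMPLE, index))  # products/a (changed) and products/c (new)
```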

A well-built sitemap contains only canonical, indexable, 200-status URLs with accurate <lastmod> values that update when page content actually changes. A poor sitemap includes redirected URLs, 404s, noindexed pages, parameterized duplicates, or stale lastmod dates that never change. Crawlers downgrade trust in sitemaps with unreliable signals, which slows discovery of legitimately updated pages.
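One way to enforce these rules is to filter candidate pages before they are written to the sitemap. The sketch below assumes the store already knows each page's HTTP status, canonical URL, and noindex flag; the Page record and the function names are hypothetical, not part of any platform's API.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    status: int      # HTTP status returned for the URL
    canonical: str   # URL named by the page's rel="canonical"
    noindex: bool    # True if the page carries a noindex directive


def is_sitemap_eligible(page):
    """Keep only canonical, indexable, 200-status URLs."""
    return page.status == 200 and not page.noindex and page.canonical == page.url


def split_pages(pages):
    """Return (eligible_urls, rejected_urls) so rejects can be reviewed."""
    eligible = [p.url for p in pages if is_sitemap_eligible(p)]
    rejected = [p.url for p in pages if not is_sitemap_eligible(p)]
    return eligible, rejected


pages = [
    Page("https://example.com/products/a", 200, "https://example.com/products/a", False),
    Page("https://example.com/products/a?color=red", 200, "https://example.com/products/a", False),
    Page("https://example.com/old-page", 301, "https://example.com/old-page", False),
    Page("https://example.com/account", 200, "https://example.com/account", True),
]
keep, drop = split_pages(pages)
print(keep)  # only https://example.com/products/a passes all three checks
```

Returning the rejected URLs alongside the eligible ones makes it easier to spot a template or plugin that is leaking redirects or parameterized duplicates into the file.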

The 50,000-URL ceiling per file matters for ecommerce catalogs at scale. A store with 200,000 SKUs needs a sitemap index pointing to at least four child sitemaps for product URLs alone (200,000 / 50,000 = 4), and usually a few more, because files are typically segmented by type: products, categories, blog, static pages. This segmentation also makes it easier to diagnose indexation issues in Search Console, where coverage reports break down by submitted sitemap.
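A rough sketch of that segmentation: group URLs by type, split each group into files of at most 50,000 URLs, and list the resulting files in a sitemap index. The file naming scheme (sitemap-products-1.xml and so on), the function names, and the example.com base URL are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def plan_child_sitemaps(urls_by_type):
    """Map a child sitemap filename to the URLs it should contain."""
    plan = {}
    for page_type, urls in urls_by_type.items():
        for start in range(0, len(urls), MAX_URLS_PER_FILE):
            number = start // MAX_URLS_PER_FILE + 1
            plan[f"sitemap-{page_type}-{number}.xml"] = urls[start:start + MAX_URLS_PER_FILE]
    return plan


def build_sitemap_index(base_url, filenames):
    """Return sitemap index XML that points at each child sitemap file."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for name in filenames:
        node = ET.SubElement(index, "sitemap")
        ET.SubElement(node, "loc").text = f"{base_url}/{name}"
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# 200,000 made-up product URLs plus a small set of category URLs.
catalog = {
    "products": [f"https://example.com/products/{i}" for i in range(200_000)],
    "categories": [f"https://example.com/collections/{i}" for i in range(120)],
}
plan = plan_child_sitemaps(catalog)
print(len(plan))  # 5: four product files and one category file
print(build_sitemap_index("https://example.com", plan.keys()))
```

This sketch only enforces the URL count; a real generator would also need to check each file against the 50MB uncompressed size limit.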

Why sitemap.xml matters for ecommerce

Ecommerce catalogs change constantly: products go out of stock, prices update, new SKUs launch, seasonal collections rotate. Without an accurate sitemap.xml, search engines rely on internal links and external signals to find these changes, which can delay indexation by days or weeks. Stores that maintain clean sitemaps with truthful lastmod dates tend to get new products indexed faster, recover faster from site migrations, and surface coverage problems early through Search Console reports. Stores that ignore the sitemap, or auto-generate one filled with redirect chains and noindexed URLs, train Google to trust the file less, which compounds into slower crawling and missed revenue on time-sensitive inventory.

Deeper dives on this term

Focused pages that go deeper than the definition — comparisons, platform-specific guides, operational walkthroughs.

Compare

Sitemap.xml vs Canonical URL: What's the Difference?

Sitemap.xml vs canonical URL: clear differences in what each does, when to use each, and how both work together for ecommerce SEO.

Compare

Sitemap.xml vs Crawl Budget: What's the Difference?

Sitemap.xml vs crawl budget: a direct comparison of definitions, mechanics, and how ecommerce SEOs use each to control how Google

Compare

Sitemap.xml vs llms.txt: What's the Difference?

Sitemap.xml guides crawlers to URLs. llms.txt guides AI models to content summaries. See how they differ, overlap, and work togeth

Compare

Sitemap.xml vs robots.txt: What's the Difference?

Sitemap.xml guides search engines to your pages; robots.txt tells them where not to go. Learn the exact difference and how both fi

Compare

Sitemap.xml vs Topical Authority: What's the Difference?

Sitemap.xml vs Topical Authority: a direct comparison of definitions, mechanics, and how ecommerce stores should use both to drive

Platform

Sitemap.xml for Shopify Stores

How Shopify generates and limits sitemap.xml files, which apps extend them, and what workarounds ecommerce operators need to know.

Platform

Sitemap.xml for Wix Stores

How sitemap.xml works on Wix stores: auto-generation, SEO panel limits, collection pages, and workarounds for ecommerce operators.

Platform

Sitemap.xml for WooCommerce Stores

How sitemap.xml works on WooCommerce: built-in limits, plugin options, product taxonomy traps, and what to configure before Google

How-to

How to implement sitemap.xml for an Ecommerce Store

A step-by-step guide to implementing sitemap.xml for an ecommerce store — covering generation, structure, submission, and ongoing

Checklist

Sitemap.xml Checklist: 12 Items Every Ecommerce Store Should Audit

A 12-item sitemap.xml audit checklist for ecommerce stores—each check includes a clear pass/fail criterion to fix crawl and indexa


Frequently asked questions

What is sitemap.xml?

Sitemap.xml is an XML file hosted at a site's root that lists indexable URLs along with metadata such as last modified date. Search engines use it to discover pages and detect changes. It follows the sitemaps.org protocol, supports up to 50,000 URLs per file, and is referenced in robots.txt or submitted through search engine webmaster tools.

How many URLs can a sitemap.xml file contain?

A single sitemap.xml file is limited to 50,000 URLs and 50MB uncompressed. Sites exceeding either limit must split URLs across multiple sitemap files and reference them in a sitemap index file. The index file itself can reference up to 50,000 child sitemaps, supporting catalogs with billions of URLs.
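The arithmetic behind these limits is easy to check. The snippet below is a small worked example that considers only the URL-count limits stated above, not the 50MB size limit; the function name sitemaps_needed is made up for illustration.

```python
MAX_URLS_PER_SITEMAP = 50_000
MAX_SITEMAPS_PER_INDEX = 50_000


def sitemaps_needed(url_count):
    """Smallest number of sitemap files that can hold url_count URLs."""
    return (url_count + MAX_URLS_PER_SITEMAP - 1) // MAX_URLS_PER_SITEMAP  # ceiling division


# Upper bound on URLs reachable through a single sitemap index.
max_urls_per_index = MAX_URLS_PER_SITEMAP * MAX_SITEMAPS_PER_INDEX

print(sitemaps_needed(50_000))   # 1
print(sitemaps_needed(50_001))   # 2
print(sitemaps_needed(200_000))  # 4
print(max_urls_per_index)        # 2500000000
```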

Sitemap.xml vs robots.txt: what's the difference?

Robots.txt controls what crawlers are allowed to access, using directives like Disallow and Allow. Sitemap.xml does the opposite: it tells crawlers which URLs exist and should be considered for indexing. Robots.txt restricts; sitemap.xml invites. Robots.txt typically contains a Sitemap: directive pointing to the sitemap.xml location, linking the two files together.
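The link between the two files can be seen with Python's standard-library robots.txt parser, which reads both the access rules and the Sitemap: directive. The robots.txt content and the example.com URLs below are invented for the example, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt for an example store.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# robots.txt restricts: which paths may a crawler fetch?
can_fetch_product = parser.can_fetch("*", "https://example.com/products/blue-widget")
can_fetch_cart = parser.can_fetch("*", "https://example.com/cart")

# robots.txt also points to the sitemap, which invites crawling of listed URLs.
sitemap_urls = parser.site_maps()

print(can_fetch_product)  # True
print(can_fetch_cart)     # False
print(sitemap_urls)       # ['https://example.com/sitemap.xml']
```

A useful consistency check follows from this: any URL listed in the sitemap should also pass can_fetch, since a sitemap that invites crawlers to a path robots.txt blocks sends conflicting signals.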

How do I create a sitemap.xml for an ecommerce store?

Most ecommerce platforms generate sitemap.xml automatically. Shopify creates one at /sitemap.xml with child sitemaps for products, collections, pages, and blogs. WooCommerce inherits the basic sitemap that WordPress core has generated at /wp-sitemap.xml since version 5.5, and Magento includes a built-in sitemap generator; on both, many stores add an SEO plugin or extension for finer control over what is included. For custom builds, generate the file dynamically from the database, include only canonical 200-status URLs, set accurate lastmod values, and submit the URL in Google Search Console.

Does sitemap.xml actually affect rankings?

Sitemap.xml does not directly influence rankings. It affects discovery and crawl efficiency, which determine whether pages get indexed and how quickly updates are reflected. Pages not in the sitemap can still rank if linked internally, and pages in the sitemap are not guaranteed indexation. The value is faster, more reliable indexation of large or frequently changing catalogs.

Written by

Matt is the founder of RunOctopus. He built All Angles Creatures from zero to page-1 rankings in reptile feeder insects in under 60 days using exactly this method — turning a hard, entrenched niche into RunOctopus's proof store for programmatic SEO and AI search citation.
